In Python pandas, you can use the unique() method to identify the distinct values in a single column of a DataFrame. This makes it easy to get a quick overview of the different values within your dataset.

What is the syntax of pandas DataFrame[].unique()?

The basic syntax for pandas unique() is simple, because the method doesn’t take any parameters:

DataFrame['column_name'].unique()

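Keep in mind that unique() belongs to a single column (a pandas Series), so it can’t be called on a DataFrame as a whole. Before calling it, you’ll need to select the column you want to evaluate. For ordinary NumPy-backed columns, unique() returns a numpy array containing each distinct value once, in the order of first appearance. Columns with an extension dtype (for example, categorical data) return a pandas array of that type instead. The values aren’t sorted, and missing values (NaN) count as a value of their own.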
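The following minimal sketch illustrates the ordering behavior described above. The Series name `scores` and its values are made up for illustration. Sorting is a separate, explicit step.

```python
import pandas as pd

# Hypothetical sample column with repeated values in no particular order
scores = pd.Series([3, 1, 3, 2, 1])

# unique() keeps each value once, in the order it first appears
first_seen = scores.unique()
print(first_seen.tolist())   # [3, 1, 2]

# unique() does not sort, so sort explicitly if you need ordered values
sorted_values = sorted(first_seen.tolist())
print(sorted_values)         # [1, 2, 3]
```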

Note

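If you’ve been working with Python for a while, you may be familiar with numpy.unique(), the numpy equivalent to pandas unique(). Unlike the pandas version, it returns the values sorted. The pandas version is hash-based and skips the sorting step, so it is generally preferable for efficiency reasons, especially on long columns.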
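As a rough comparison, the sketch below runs both functions on the same data. The Series `values` is an invented example. The only point is the difference in ordering of the results, not a performance measurement.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with duplicates
values = pd.Series([30, 10, 30, 20, 10])

# pandas: distinct values in order of first appearance
pandas_result = values.unique()

# numpy: distinct values sorted in ascending order
numpy_result = np.unique(values.to_numpy())

print(pandas_result.tolist())  # [30, 10, 20]
print(numpy_result.tolist())   # [10, 20, 30]
```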

How to use pandas DataFrame[].unique()

To use unique() with a pandas DataFrame, you first need to select the column you want to check. In the following example, we’ll use a DataFrame that contains the name, age and city of residence of a group of individuals.

import pandas as pd
# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Edward'],
    'Age': [24, 27, 22, 32, 29],
    'City': ['New York', 'Los Angeles', 'New York', 'Chicago', 'Los Angeles']
}
df = pd.DataFrame(data)
print(df)

The resulting DataFrame looks like this:

      Name  Age         City
0    Alice   24     New York
1      Bob   27  Los Angeles
2  Charlie   22     New York
3    David   32      Chicago
4   Edward   29  Los Angeles

Now, let’s say we want to create a list of all the cities where the people in the DataFrame live. We can apply the pandas unique() function to the column that contains the cities.

# Find different cities
unique_cities = df['City'].unique()
print(unique_cities)

The output is a numpy array that lists each city once, showing that the individuals in the DataFrame are from a total of three cities: New York, Los Angeles and Chicago.

['New York' 'Los Angeles' 'Chicago']
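
If you need a plain Python list rather than an array, or only the number of distinct cities, a sketch along these lines should work. It rebuilds just the City column from the example so that it runs on its own. The names `city_list` and `city_count` are illustrative, and nunique() is a separate pandas method that counts distinct values.

```python
import pandas as pd

# Rebuild the City column from the example above
df = pd.DataFrame({
    'City': ['New York', 'Los Angeles', 'New York', 'Chicago', 'Los Angeles']
})

# Convert the result of unique() into a regular Python list
city_list = [str(city) for city in df['City'].unique()]
print(city_list)   # ['New York', 'Los Angeles', 'Chicago']

# Count the distinct values without building the array yourself
city_count = df['City'].nunique()
print(city_count)  # 3
```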