The Python pandas function DataFrame.iterrows() is used to iterate over rows in a pandas DataFrame. For each row, it provides a Python tuple that contains the row index and a Series object with the row’s data.

What is the syntax for pandas iterrows()?

The basic syntax of pandas DataFrame.iterrows() is simple since the function doesn’t take any parameters:

df.iterrows()

In this code example, df is the DataFrame you want to iterate through.
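To see what each iteration step hands back, the short sketch below pulls a single item from the iterator. The DataFrame demo and its values are hypothetical and exist only for illustration; they are not part of the examples that follow.

```python
import pandas as pd

# Hypothetical two-row DataFrame, used only for illustration
demo = pd.DataFrame({'Name': ['Anna', 'Ben'], 'Age': [23, 35]})

# iterrows() returns an iterator; next() pulls the first (index, row) pair
first_index, first_row = next(demo.iterrows())

print(first_index)        # index label of the first row
print(type(first_row))    # the row data arrives as a pandas Series
print(first_row['Name'])  # values are accessed by column name
```

The tuple is usually unpacked directly in the for-loop header, as in `for index, row in df.iterrows():`.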

How to use the pandas iterrows() function

The DataFrame.iterrows() function is typically used when you need to process data row by row. It’s often combined with Python for-loops.

Adding up values in a column

Let’s look at an example DataFrame that contains the columns Name, Age and Score:

import pandas as pd
# Creating an example DataFrame
data = {'Name': ['Anna', 'Ben', 'Clara'],
    'Age': [23, 35, 29],
    'Score': [88, 92, 85]}
df = pd.DataFrame(data)
print(df)

The code above results in the following DataFrame:

    Name  Age  Score
0   Anna   23     88
1    Ben   35     92
2  Clara   29     85

Now, let’s calculate the sum of the scores. We can use pandas DataFrame.iterrows() to do this:

# Calculating the total score
total_score = 0
for index, row in df.iterrows():
    total_score += row['Score']
print(f"The total score is: {total_score}")

In this example, we used the pandas iterrows() function to loop through each row, adding up the values in the Score column one by one. This produces the following result:

The total score is: 265

Note

When using pandas iterrows(), avoid modifying the data you’re iterating over. Depending on the data types, iterrows() returns a copy of each row rather than a view, so writing to the row may have no effect on the DataFrame.
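To make the note above concrete, here is a minimal sketch of what can happen when a row is changed inside the loop. It assumes a small DataFrame with mixed column types (text and integers); df_demo and scores_after_loop are hypothetical names chosen for this illustration. With mixed types, each row is built as a separate Series, so the assignment inside the loop should not reach the DataFrame.

```python
import pandas as pd

# Hypothetical DataFrame with mixed column types (text and integers)
df_demo = pd.DataFrame({'Name': ['Anna', 'Ben', 'Clara'],
                        'Score': [88, 92, 85]})

# Writing to the row only changes a temporary copy of that row
for index, row in df_demo.iterrows():
    row['Score'] = 0

# The scores stored in df_demo are still the original values
scores_after_loop = df_demo['Score'].tolist()
print(scores_after_loop)

# Assigning to the whole column updates the DataFrame itself
df_demo['Score'] = df_demo['Score'] + 5
print(df_demo['Score'].tolist())
```

If the goal is to change values, assigning to a column as in the last step is usually the safer route, since it does not rely on how iterrows() builds each row.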

Processing rows using conditions

The iterrows() function can also be used to apply conditions to individual rows in your DataFrame. For example, let’s say you want to retrieve the names of everyone over 30 years old in the DataFrame from the last example:

# Retrieving names of people over 30 years old
names = []
for index, row in df.iterrows():
    if row['Age'] > 30:
        names.append(row['Name'])
print(f"People over 30 years old: {names}")

In this example, we used DataFrame.iterrows() to go through each row of data. Inside the for-loop, the code checks the value in the Age column and adds the name to the Python list names only if the person is over 30 years old. The name is added with the list method append(). Here’s the result:

People over 30 years old: ['Ben']

Note

While it’s easy to use DataFrame.iterrows(), keep in mind that it may not run efficiently on large DataFrames. In many cases, other options like apply() or vectorized calculations can be used to achieve better performance.
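As a rough sketch of those alternatives, the two loops from this article could be written without iterrows() as shown below. The code rebuilds the example DataFrame so that it runs on its own; the variable is_over_30 is a hypothetical name introduced here, and this is one possible way to express the same logic rather than the only one.

```python
import pandas as pd

# Same example data as in the sections above
df = pd.DataFrame({'Name': ['Anna', 'Ben', 'Clara'],
                   'Age': [23, 35, 29],
                   'Score': [88, 92, 85]})

# Vectorized sum of the Score column instead of adding row by row
total_score = df['Score'].sum()
print(f"The total score is: {total_score}")

# Boolean mask instead of an if-statement inside a loop
names = df.loc[df['Age'] > 30, 'Name'].tolist()
print(f"People over 30 years old: {names}")

# The same condition expressed with apply() on a single column
is_over_30 = df['Age'].apply(lambda age: age > 30)
print(is_over_30.tolist())
```

The vectorized versions operate on whole columns at once, which is why they tend to scale better than a Python-level loop over rows.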
