What is Pandas fillna() and how to use it
The pandas DataFrame.fillna() function replaces missing values (NaN) in a DataFrame. This simplifies data cleaning and helps keep missing values from interfering with later analysis.
What is the syntax for pandas fillna()?
The fillna() function is typically called with up to five parameters and is structured as follows:
DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None)
Important parameters for fillna()
The behavior of the DataFrame.fillna() function can be adjusted using various parameters:
Parameter | Description | Default Value |
---|---|---|
value | A scalar, dictionary or Series used to replace NaN values | None |
method | Specifies the fill method: forward fill (ffill) or backward fill (bfill) | None |
axis | Axis along which to fill: 0 or index fills down each column, 1 or columns fills across each row | None (treated as 0) |
inplace | If True, the changes are made directly in the original DataFrame | False |
limit | An integer that limits the number of NaN values to be replaced | None |
The method parameter has been deprecated since pandas 2.1 and is scheduled for removal, so calls that use it may emit a FutureWarning or fail on newer versions. You can rely on obj.ffill() or obj.bfill() instead, since these functions have the same effect as the method parameter.
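As a minimal sketch of that replacement, the snippet below calls ffill() and bfill() directly on a small sample DataFrame. The variable names are chosen only for illustration.

```python
import pandas as pd

# Small sample frame with a few missing values
df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# Forward fill down each column, in place of fillna(method='ffill')
df_ffill = df.ffill()

# Backward fill across each row, in place of fillna(method='bfill', axis=1)
df_bfill_rows = df.bfill(axis=1)

print(df_ffill)
print(df_bfill_rows)
```

Both calls return a new DataFrame and leave df unchanged.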
How to use Pandas DataFrame.fillna()
The pandas fillna() function can be used in several different ways:
Replacing NaN values with a fixed value
First, let’s create a DataFrame:
import pandas as pd
# Sample DataFrame with different values
data = {
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4]
}
df = pd.DataFrame(data)
print(df)
The DataFrame looks like this:
A B C
0 1.0 NaN 1.0
1 2.0 2.0 NaN
2 NaN 3.0 3.0
3 4.0 4.0 4.0
In pandas, a None value in a numeric column of a DataFrame or Series is stored as NaN, which is why the integer values above are shown as floats.
To replace the missing values with 0, you can use the pandas fillna() function:
# Replacing missing values with zero
df_filled = df.fillna(0)
print(df_filled)
The result is that every NaN value has been replaced with 0:
A B C
0 1.0 0.0 1.0
1 2.0 2.0 0.0
2 0.0 3.0 3.0
3 4.0 4.0 4.0
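The value parameter also accepts a dictionary, which allows a different replacement per column. The following is a small sketch of that usage; the replacement values 0 and -1 are arbitrary and only serve as an illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# Hypothetical per-column replacement values, chosen only for illustration
fill_values = {'A': 0, 'B': -1}

# Columns that are not listed in the dictionary (here C) keep their NaN values
df_filled_by_column = df.fillna(value=fill_values)
print(df_filled_by_column)
```

Column C is left untouched here because it has no entry in the dictionary.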
Using the forward filling method ffill
If you want to fill NaN values with the value that directly precedes them in the same column, you can pass ffill as the method parameter:
# Replace all NaN values with the value that precedes them
df_ffill = df.fillna(method='ffill')
print(df_ffill)
In this example, the NaN values in columns A and C have been filled with the preceding values in the same column. Since there was no preceding value in column B for row 0, the NaN value is retained:
A B C
0 1.0 NaN 1.0
1 2.0 2.0 1.0
2 2.0 3.0 3.0
3 4.0 4.0 4.0
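If a leading NaN like the one in column B should not remain, one possible approach is to follow the forward fill with a backward fill. This is a sketch rather than a general recommendation, since whether a later value is a sensible substitute depends on the data.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, None, 4],
    'B': [None, 2, 3, 4],
    'C': [1, None, 3, 4],
})

# Forward fill first, then backward fill whatever is still missing
# (here only the leading NaN in column B)
df_both = df.ffill().bfill()
print(df_both)
```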
Using the backward filling method bfill for rows
NaN values can also be filled with the value that follows them in the same row. To do this, use the bfill method and set the axis parameter to 1:
df_bfill = df.fillna(method='bfill', axis=1)
print(df_bfill)
The result shows that the NaN values in rows 0 and 2 have been replaced by the values that follow them in the same row. The NaN value in row 1, however, remains the same because it’s the last value in that row:
A B C
0 1.0 1.0 1.0
1 2.0 2.0 NaN
2 3.0 3.0 3.0
3 4.0 4.0 4.0
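The limit and inplace parameters from the table above are not covered by the previous examples. The sketch below uses a separate, made-up DataFrame in which column A has two missing values, so that the effect of limit is visible.

```python
import pandas as pd

# Illustrative data: column A has two NaN values, column B has one
df = pd.DataFrame({
    'A': [None, None, 3, 4],
    'B': [1, None, 3, 4],
})

# limit=1: at most one NaN per column is replaced
df_limited = df.fillna(0, limit=1)
print(df_limited)

# inplace=True: df itself is modified instead of a new DataFrame being assigned
df.fillna(0, inplace=True)
print(df)
```

With limit=1, the second NaN in column A stays in place, while the in-place call afterwards fills every remaining NaN in df.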