Handling Null Values — Pandas vs Polars
By Jeferson Peter
Polars & Pandas
Imagine you’re working with a dataset that has missing values for some columns.
How do you clean or fill them? Pandas and Polars provide simple tools to handle nulls.
Example data
import pandas as pd
import polars as pl
data = {"name": ["Alice", "Bob", None], "score": [10, None, 30]}
df_pd = pd.DataFrame(data)
df_pl = pl.DataFrame(data)
Dropping nulls
# Pandas
print(df_pd.dropna())
#     name  score
# 0  Alice   10.0
# Polars
print(df_pl.drop_nulls())
# shape: (1, 2)
# ┌───────┬───────┐
# │ name  ┆ score │
# │ ---   ┆ ---   │
# │ str   ┆ i64   │
# ╞═══════╪═══════╡
# │ Alice ┆ 10    │
# └───────┴───────┘
Filling nulls
# Pandas
print(df_pd.fillna({"name": "Unknown", "score": 0}))
#       name  score
# 0    Alice   10.0
# 1      Bob    0.0
# 2  Unknown   30.0
# Polars (forward fill: each null takes the previous non-null value in its column)
print(df_pl.fill_null(strategy="forward"))
# shape: (3, 2)
# ┌───────┬───────┐
# │ name  ┆ score │
# │ ---   ┆ ---   │
# │ str   ┆ i64   │
# ╞═══════╪═══════╡
# │ Alice ┆ 10    │
# │ Bob   ┆ 10    │
# │ Bob   ┆ 30    │
# └───────┴───────┘
Conclusion
- Pandas: dropna() and fillna() are the main methods.
- Polars: drop_nulls() and fill_null() provide similar functionality.
- Both give flexible options to handle missing values efficiently.