Python Snippets: Dropping Infinite Values From Dataframes in Pandas

Infinite values can occur more often than people expect, especially for calculated data.

For example, in a recent post I calculated the Twitter Follower-Friend ratio by dividing the followers_count series by the friends_count series. But what happens when friends_count is zero? Pandas doesn’t raise an error; the result is simply inf.

In that particular case, I wanted to drop the rows. Here’s how to do it [1]:

import pandas as pd
import numpy as np

# example dataframe
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [9, 0, 8, 0]})
df["c"] = df["a"] / df["b"]

df

   a  b         c
0  1  9  0.111111
1  2  0       inf
2  3  8  0.375000
3  4  0       inf

# replace inf with NaN then dropna
df.replace([np.inf, -np.inf], np.nan).dropna(subset=["c"], how="all")

   a  b         c
0  1  9  0.111111
2  3  8  0.375000
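
A variation on the same idea, in case you'd rather not go through NaN at all: filter on the computed column with a boolean mask. This is a minimal sketch and not the approach from the linked answer. It swaps in np.isfinite, which is False for inf, -inf, and NaN alike, so it also drops any NaN already sitting in c. The name finite_rows is just for illustration.

```python
import numpy as np
import pandas as pd

# same example dataframe as above
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [9, 0, 8, 0]})
df["c"] = df["a"] / df["b"]

# keep only the rows where c is a finite number
# (np.isfinite is False for inf, -inf, and NaN)
finite_rows = df[np.isfinite(df["c"])]
```

Only column c is inspected here, so the other columns pass through untouched.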

Mind you, this is only helpful if you want to discard rows with inf values. Otherwise, df.replace() can be used to “fix” your values to something that makes sense for the application without discarding the row.
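
As a rough sketch of that second option, here the inf values are swapped for a placeholder instead of dropping the rows. The fill value of 0.0 (and the names FILL_VALUE and fixed) are assumptions made up for this example; what actually "makes sense" depends on your data.

```python
import numpy as np
import pandas as pd

# same example dataframe as above
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [9, 0, 8, 0]})
df["c"] = df["a"] / df["b"]

# hypothetical choice: treat an undefined ratio as 0.0
FILL_VALUE = 0.0

# replace +inf and -inf with the fill value; every row is kept
fixed = df.replace([np.inf, -np.inf], FILL_VALUE)
```

Note that df.replace() returns a new dataframe, so df itself still holds the inf values unless you assign the result back.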


  1. https://stackoverflow.com/a/17478495/2533247
