Missing Values
import pandas as pd
Missing Data
isnull() == isna() (alias) $\longleftrightarrow$ notnull() == notna() (alias)
DataFrame.isnull() # return : DataFrame
Series.isnull() # return : Series
Index.isnull() # return : numpy.ndarray[bool]
pandas.isnull(obj) # return : bool or array-like of bool
return : boolean mask (True where the value is missing; notnull() / notna() return the inverse)
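As a minimal sketch of the return types listed above, the following builds a small DataFrame (the column names `a` and `b` and their values are made up for illustration) and calls each variant once.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", None, "z"]})

mask_df = df.isnull()                           # DataFrame of bool
mask_series = df["a"].isna()                    # Series of bool
mask_index = pd.Index([1.0, np.nan]).isnull()   # numpy.ndarray of bool
is_missing = pd.isnull(np.nan)                  # single bool for a scalar input

# notna() / notnull() give the element-wise negation of the mask.
not_mask = df.notna()
```

Both `np.nan` and `None` are treated as missing here, so the mask is True in the middle row of each column.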
Q. How many missing values are in df.column_name? The following three expressions are equivalent.
df.column_name.isna().sum() # equivalent
pd.isna(df.column_name).sum() # equivalent
len(df[df.column_name.isna()]) # equivalent
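To check the equivalence on concrete data, here is a small sketch with a made-up DataFrame whose `column_name` column holds two missing values. The data is an assumption for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two of the four entries are missing.
df = pd.DataFrame({"column_name": [1.0, np.nan, 3.0, np.nan]})

n1 = df.column_name.isna().sum()       # sum of the boolean mask (True counts as 1)
n2 = pd.isna(df.column_name).sum()     # same mask via the top-level function
n3 = len(df[df.column_name.isna()])    # number of rows selected by the mask

print(n1, n2, n3)  # 2 2 2
```

The first two forms sum a boolean mask, while the third filters the rows and counts them, so it also builds an intermediate DataFrame.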
fillna()
DataFrame.fillna(value, method, axis, inplace)
Series.fillna(value, method, axis, inplace)
Index.fillna(value) # value : scalar only
- value : scalar, dict, Series, or DataFrame
- scalar : value to use to fill holes
- dict, Series, or DataFrame : specify values to use for each index (for a Series) or column (for a DataFrame)
- list : not supported (passing a list raises a TypeError)
- method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None (deprecated since pandas 2.1; call ffill() / bfill() directly instead)
- pad / ffill : propagate the last valid observation forward to the next valid one
- backfill / bfill : use the next valid observation to fill the gap
- axis : {0 or 'index', 1 or 'columns'}
- inplace : bool, default False
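The parameters above can be sketched on a small made-up DataFrame (columns `a` and `b` are illustrative assumptions). Because the `method` argument is deprecated in recent pandas releases, this sketch uses the `ffill()` and `bfill()` methods for the forward and backward fill instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with three missing values.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, np.nan]})

filled_scalar = df.fillna(0)                      # scalar: same value for every hole
filled_dict = df.fillna({"a": -1.0, "b": -2.0})   # dict: one fill value per column
filled_forward = df.ffill()                       # carry the last valid value forward
filled_backward = df.bfill()                      # pull the next valid value backward

# inplace defaults to False, so df itself still contains its missing values.
remaining = int(df.isna().sum().sum())
```

A forward fill leaves a leading missing value untouched (the first row of `b`), and a backward fill leaves a trailing one (the last row of `b`), since there is no valid neighbor to copy from.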
replace()
DataFrame.replace(to_replace, value, inplace)
Series.replace(to_replace, value, inplace)
- to_replace : str, regex, list, dict, Series, int, float, or None
- how to find the values that will be replaced
- value : scalar, dict, list, str, regex, default None
- value to replace any values matching to_replace with
- inplace : bool, default False
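In the context of missing data, replace() is often used to turn sentinel values into NaN so that isna() and fillna() can see them. The sketch below assumes, purely for illustration, that -999 and the string "unknown" mark missing entries.

```python
import numpy as np
import pandas as pd

# Hypothetical data: -999 and "unknown" are assumed sentinel values.
scores = pd.Series([90, -999, 75])
labels = pd.Series(["A", "unknown", "C"])

# Scalar to_replace with a scalar value.
clean_scores = scores.replace(-999, np.nan)

# Dict form: keys are the values to find, dict values are the replacements.
clean_labels = labels.replace({"unknown": np.nan})

n_missing = int(clean_scores.isna().sum()) + int(clean_labels.isna().sum())
```

Since `inplace` defaults to False, `scores` and `labels` keep their sentinel values and only the returned Series contain NaN.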