Issue
I usually replace empty strings with NaN and then call dropna() to remove rows with empty data:
import pandas as pd
import numpy as np
df.replace('', np.nan).dropna()
However, I want my function to run using the Serverless framework. I need to import numpy just to use np.nan, which eats up my precious 250MB limit for package size.
Importing pd.np.nan works, but there is a warning that the pandas.np module is deprecated and will be removed from a future version of pandas.
Is there any way to use np.nan without importing numpy?
Solution
Use pd.NA instead.
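As a minimal sketch of how this could look for the snippet in the question, assuming pandas 1.0 or newer (where pd.NA exists). The sample DataFrame and its column names are made up for illustration. The second variant uses Python's built-in float('nan'), which pandas also treats as missing; it is not part of the original answer, just another way to avoid the explicit numpy import.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["alice", "", "carol"],
    "city": ["Oslo", "Bergen", ""],
})

# Replace empty strings with pd.NA, then drop every row that has a missing value.
cleaned = df.replace("", pd.NA).dropna()

# Alternative without pd.NA: the built-in float NaN is also recognised as missing.
cleaned_nan = df.replace("", float("nan")).dropna()

print(cleaned)
```

Note that pandas itself declares NumPy as a required dependency, so NumPy is normally installed alongside pandas anyway; this approach only removes the explicit import from your own code.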
From the Docs:
Starting from pandas 1.0, an experimental pd.NA value (singleton) is available to represent scalar missing values. At this moment, it is used in the nullable integer, boolean and dedicated string data types as the missing value indicator. The goal of pd.NA is provide a “missing” indicator that can be used consistently across data types (instead of np.nan, None or pd.NaT depending on the data type).
Answered By - Isak