Issue
I am using the following code to remove some rows with missing data in pandas:
df = df.replace(r'^\s+$', np.nan, regex=True)
df = df.replace(r'^\t+$', np.nan, regex=True)
df = df.dropna()
However, some cells in the data frame still look blank/empty. Why is this happening? Is there any way to get rid of rows with such empty/blank cells? Thanks!
Solution
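The cells that still look blank are most likely empty strings, which ^\s+$ and ^\t+$ do not match because + requires at least one character. You can replace them with: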
df = df.replace('', np.nan)
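To illustrate why the original patterns miss these cells, here is a minimal sketch using Python's built-in re module, which pandas relies on for regex=True replacements. The sample strings are made up for illustration and are not taken from the question's data.

```python
import re

# Hypothetical cell values: empty, space, tab, mixed whitespace, real text
samples = ['', ' ', '\t', ' \t ', 'abc']

# For each value, record (matches ^\s+$, matches ^$)
results = {
    s: (bool(re.search(r'^\s+$', s)), bool(re.search(r'^$', s)))
    for s in samples
}

for s, (whitespace_only, empty) in results.items():
    print(repr(s), 'whitespace-only:', whitespace_only, 'empty:', empty)
```

Only the empty string matches ^$, and it is the one value that ^\s+$ skips, which is why it survives the first two replace calls.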
To simplify your code, you can join the regexes with | and match empty strings with ^$ (note that \s already covers tabs, so ^\t+$ is redundant but harmless):
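import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': ['', 5, 4, 5, 5, 4],
                   'C': ['', ' ', ' ', 4, 2, 3],
                   'D': [1, 3, 5, 7, ' ', 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaabbb')})
# Whitespace-only, tab-only, and empty strings all become NaN
df = df.replace(r'^\s+$|^\t+$|^$', np.nan, regex=True)
print(df)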
   A    B    C    D  E  F
0  a  NaN  NaN  1.0  5  a
1  b  5.0  NaN  3.0  3  a
2  c  4.0  NaN  5.0  6  a
3  d  5.0  4.0  7.0  9  b
4  e  5.0  2.0  NaN  2  b
5  f  4.0  3.0  0.0  4  b
Answered By - jezrael
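As a follow-up sketch that ties the answer back to the question's goal of dropping the affected rows: the snippet below reuses the sample frame from the answer and chains replace with dropna. It uses the shorter pattern ^\s*$, which is my own substitution for the joined regex (it should match the same empty and whitespace-only strings), and the name cleaned is just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': ['', 5, 4, 5, 5, 4],
                   'C': ['', ' ', ' ', 4, 2, 3],
                   'D': [1, 3, 5, 7, ' ', 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaabbb')})

# ^\s*$ matches empty strings as well as whitespace-only strings (tabs included),
# so one pattern covers all three cases; dropna then removes any row with a NaN.
cleaned = df.replace(r'^\s*$', np.nan, regex=True).dropna()
print(cleaned)
```

With this sample data only the rows at index 3 and 5 have no blank cells, so those are the rows that remain.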