Issue
I would like to fill missing values in a pandas dataframe with the average of the cells directly before and after the missing value. For example, given [1, NaN, 3], the NaN would become 2, because (1 + 3)/2 = 2. I could not find a way to do this with pandas or scikit-learn. Is there any way to do this?
Solution
Consider this dataframe (with pandas imported as pd and NumPy as np):
df = pd.DataFrame({'val': [1, np.nan, 4, 5, np.nan, 10]})
    val
0   1.0
1   NaN
2   4.0
3   5.0
4   NaN
5  10.0
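You can use fillna together with shift() to get the desired output. shift() gives each row the value from the row before it and shift(-1) gives the value from the row after it, so their mean is the fill value for each missing cell: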
df.val = df.val.fillna((df.val.shift() + df.val.shift(-1))/2)
You get:
    val
0   1.0
1   2.5
2   4.0
3   5.0
4   7.5
5  10.0
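For reference, the same idea can be wrapped in a small self-contained sketch. The helper name fill_with_neighbor_mean and the second sample series are made up for illustration and are not part of the original answer. It also shows a limitation that follows from how shift() works: a missing value at either end of the column, or next to another missing value, stays NaN, because at least one of its neighbours is missing too.

```python
import numpy as np
import pandas as pd


def fill_with_neighbor_mean(s: pd.Series) -> pd.Series:
    """Fill each NaN with the mean of the values directly before and after it.

    Hypothetical helper for illustration; it just wraps the fillna/shift idea.
    """
    # shift(1) aligns each row with the previous value, shift(-1) with the next.
    neighbor_mean = (s.shift(1) + s.shift(-1)) / 2
    # fillna with a Series fills by index, so only the NaN positions change.
    return s.fillna(neighbor_mean)


# Same data as the example above.
s = pd.Series([1, np.nan, 4, 5, np.nan, 10], dtype=float)
filled = fill_with_neighbor_mean(s)
print(filled.tolist())  # [1.0, 2.5, 4.0, 5.0, 7.5, 10.0]

# Edge cases: a leading NaN and two consecutive NaNs are left unfilled,
# because one of the two neighbours is itself missing.
gaps = pd.Series([np.nan, 2, np.nan, np.nan, 8])
still_missing = fill_with_neighbor_mean(gaps)
print(still_missing.isna().tolist())  # [True, False, True, True, False]
```

If those leftover gaps matter for your data, they need a separate fill step after this one.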
Answered By - Vaishali