Issue
I have a data frame - the table below
row_id | name | age | state | |
---|---|---|---|---|
1 | Sta | [email protected] | 29 | AK |
2 | NaN | [email protected] | NaN | NB |
3 | Elle | NaN | 32 | CA |
4 | Elle | NaN | 44 | NaN |
I want to have the row id and column name wherever my df is NaN/Empty in the below format
row_id | col_id | value |
---|---|---|
2 | Name | NaN |
2 | age | NaN |
3 | name | NaN |
4 | name | NaN |
4 | state | NaN |
import numpy as np
tmp = pd.DataFrame(columns= ['row_id', 'col_id','Value','Gap'], dtype=object) #Creating new DF
for name, values in proc_df.iteritems(): #For loop by items
for i,v in enumerate(values.tolist()): #For loop on values
if np.isnan(v):
tmp.loc[len(tmp)] = [proc_df.at[i,'row_id'], name, v, len([v])]
tmp
but I am getting the below error
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Solution
Here, my basic idea is to read DF by columns and if its founds None value
create new list
and append
into new df
the list such as [2, A, NaN, 1]
.
Note : Dont know your logic for GAP column, so am just counting the None value in that specific cell.
Code:
import pandas as pd
tmp = pd.DataFrame(columns= ['row_id', 'Col_id','Value']) #Creating new DF
for name, values in df.iteritems(): #For loop by items
for i,v in enumerate(values.tolist()): #For loop on values
if pd.isna(v):
tmp.loc[len(tmp)] = [df.at[i,'row_id'], name, v]
tmp
Output:
row_id Col_id Value
0 2 name None
1 3 email None
2 4 email None
3 2 age None
4 4 sate None
Answered By - R. Baraiya
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.