Issue
I have the following dataframe. The first three columns have specific names that should not be changed ('Col1' to 'Col3'); the remaining columns keep their numeric labels, 3 to 7.
import pandas as pd

data = [[0, 0.5, 0.5, 1, 0, 1, 0, 0],
        [1, 0.5, 0.5, 1, 1, 0, 1, 1],
        [2, 0.5, 0.5, 1, 1, 0, 1, 1]]
df = pd.DataFrame(data)
df = df.rename(columns={0: 'Col1', 1: 'Col2', 2: 'Col3'})
I would like to select all the numbered columns (column indexes 3-7) that contain the value 1 in the first row.
df2 = df.loc[0, df.iloc[0, 3:] == 1]
This raises the following error: AssertionError
Afterwards, I would like to take the index of df2, i.e. the columns that hold the value 1 in the first row (here columns 3 and 5), and check which of those columns also hold the value 1 in the second row.
df3 = df.loc[1, df.iloc[1, df2.index] == 1]
This throws the following error: IndexError: .iloc requires numeric indexers, got [3 5]
The expected final output is column 3 only: in the first row only columns 3 and 5 hold the value 1, and of those two only column 3 also holds the value 1 in the second row.
How can I do this?
Solution
Use:
df1 = df.iloc[:, 3:]
fin = df1.columns[(df1.iloc[0] == 1) & (df1.iloc[1] == 1)]
print(fin)
Index([3], dtype='object')
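The same check can be extended to more than two rows. Below is a minimal sketch, not part of the original answer: the helper name columns_all_equal and its parameters (rows, value, start) are made up for illustration. It compares the selected rows to the value and keeps the columns where every row matches.

```python
import pandas as pd

data = [[0, 0.5, 0.5, 1, 0, 1, 0, 0],
        [1, 0.5, 0.5, 1, 1, 0, 1, 1],
        [2, 0.5, 0.5, 1, 1, 0, 1, 1]]
df = pd.DataFrame(data).rename(columns={0: 'Col1', 1: 'Col2', 2: 'Col3'})


def columns_all_equal(frame, rows, value=1, start=3):
    """Hypothetical helper: labels of the columns (from position `start` on)
    whose entries equal `value` in every row position listed in `rows`."""
    sub = frame.iloc[rows, start:]          # selected rows, numbered columns only
    mask = (sub == value).all(axis=0)       # True where all selected rows match
    return sub.columns[mask.to_numpy()]     # boolean array avoids index alignment


fin = columns_all_equal(df, [0, 1])
print(list(fin))  # [3]
```

Passing the mask as a plain NumPy array keeps the selection purely positional, so it does not depend on how pandas aligns a boolean Series with the column index.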
Original solution:
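# labels of the numbered columns that hold 1 in the first row
out = df.columns[3:][df.iloc[0, 3:] == 1]
# second row, restricted to those columns
s = df.loc[1, out]
# keep only the labels that also hold 1 in the second row
fin = s.index[s == 1]
print(fin)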
Index([3], dtype='object')
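As for why the two attempts in the question fail, the likely causes are as follows (the exact exception type and message may differ between pandas versions). In the first attempt, the boolean mask covers only the five numbered columns, while .loc expects a column indexer that matches all eight columns. In the second attempt, df2.index holds column labels, but .iloc only accepts integer positions. A minimal sketch of a step-by-step variant that keeps the names df2 and df3 from the question:

```python
import pandas as pd

data = [[0, 0.5, 0.5, 1, 0, 1, 0, 0],
        [1, 0.5, 0.5, 1, 1, 0, 1, 1],
        [2, 0.5, 0.5, 1, 1, 0, 1, 1]]
df = pd.DataFrame(data).rename(columns={0: 'Col1', 1: 'Col2', 2: 'Col3'})

# Step 1: slice the numbered columns first, then filter that row by value,
# so the mask and the data it filters have the same length and index.
row0 = df.iloc[0, 3:]
df2 = row0[row0 == 1]

# Step 2: df2.index holds column labels, so select with .loc, not .iloc.
row1 = df.loc[1, df2.index]
df3 = row1[row1 == 1]

print(list(df2.index))  # [3, 5]
print(list(df3.index))  # [3]
```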
Answered By - jezrael