Issue
I'm consuming an API, and some column names are too long for my MySQL database.
How can I ignore a field in a DataFrame?
This is what I tried:
import pandas as pd
import numpy as np

lst = ['Java', 'Python', 'C', 'C++', 'JavaScript', 'Swift', 'Go']
df = pd.DataFrame(lst)
limit = 7
for column in df.columns:
    if (pd.to_numeric(df[column].str.len())) > limit:
        df -= df[column]
print(df)
result:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
My preference is to delete any column that is longer than my database supports.
I also tried slicing to change the name, but that didn't work either.
I appreciate any help.
Solution
Suppose the following dataframe
>>> df
      col1          col2     col3        col4
0   5uqukp  g7eLDgm0vrbV    Bnssm  tRJnSQma6E
1   NDsApz        lu02dO   ogbRz5  481riI6qne
2    UEfni    YV2pCXYFbd  pyHYqDH   fghpTgItm
3  a0PvRSv      0FwxzFqk  jUHQliB      W2dBhH
4   BQgTFp       FMseKnR     ifgt     tw1j7Ld
5  1vvF2Hv   cwTyt2GtpC4   P039m2   1qR2slCmu
6  JYnABTr        oLdZVz   KYBspk      RgsCsu
To remove every column in which at least one value is longer than 7 characters, use:
>>> df.loc[:, df.apply(lambda x: x.str.len().max() <= 7)]
      col1     col3
0   5uqukp    Bnssm
1   NDsApz   ogbRz5
2    UEfni  pyHYqDH
3  a0PvRSv  jUHQliB
4   BQgTFp     ifgt
5  1vvF2Hv   P039m2
6  JYnABTr   KYBspk
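The question's first sentence talks about column names being too long, while the snippet above filters on the values. If the names themselves are the problem, the same boolean-mask idea might be applied to `df.columns` instead. This is a minimal sketch under that assumption; the sample DataFrame, the column names, and the `limit` value are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical data: one short column name and one long one
df = pd.DataFrame({
    "id": [1, 2],
    "a_very_long_column_name_returned_by_the_api": ["x", "y"],
})
limit = 7  # example threshold; substitute the limit your database enforces

# Index.str.len() returns the length of each column label
short_names = df.columns.str.len() <= limit

# Keep only the columns whose name fits within the limit
filtered = df.loc[:, short_names]
print(filtered.columns.tolist())
```

Filtering on `df.columns` leaves the cell values untouched, so it also works for numeric columns, where the `.str` accessor used on the values would fail.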
To understand the error, read this post
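In short, `df[column].str.len() > limit` produces a boolean Series with one entry per row, and `if` needs a single True or False, hence the ValueError. A possible fix for the loop in the question is to reduce the lengths to one number with `.max()` before comparing. The sketch below does that; using `drop` in place of `df -= df[column]` is my substitution, not something taken from the original answer.

```python
import pandas as pd

lst = ['Java', 'Python', 'C', 'C++', 'JavaScript', 'Swift', 'Go']
df = pd.DataFrame(lst)
limit = 7

for column in df.columns:
    # .str.len() gives one length per row; .max() collapses them to a scalar,
    # so the comparison yields a single boolean that `if` can evaluate
    if df[column].str.len().max() > limit:
        # Remove the offending column instead of subtracting it
        df = df.drop(columns=column)

print(df)
```

Because 'JavaScript' is longer than 7 characters, the only column is dropped and the resulting DataFrame has no columns left.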
Answered By - Corralien