Issue
This is likely a very obvious question, but I have a column containing lists that I am trying to flatten: i.e [AB, CD] -> AB, CD
.
Sample dataframe:
data = [
["ABC", ["AB", "AB", "EF"]],
["DEF", ["CD", "EF"]],
["GHI", ["JK"]],
[
"JKL",
np.nan,
],
]
df = pd.DataFrame(data, columns=["ID", "list"])
df
I am applying this function to said column:
def func(string):
flattened_string = ", ".join(map(str, string))
return flattened_string
df = df["list_column"].apply(func)
However there are a few rows containing NaN
values that initiate the error "TypeError: 'float' object is not iterable"
. Is there any way I can modify this function to fix this? Or perhaps (likely) a much better way to do this?
Thank you!!!
Solution
If needing to preserve the NaN
values in the column, we can use setting with enlargement by only joining the notna
values in the column and assigning back:
df['list'] = df.loc[df['list'].notnull(), 'list'].map(','.join)
Or if there are non-strings in the lists:
df['list'] = df.loc[
df['list'].notnull(), 'list'
].map(lambda x: ','.join(map(str, x)))
df
:
ID list
0 ABC AB,AB,EF
1 DEF CD,EF
2 GHI JK
3 JKL NaN # Still NaN not empty string
Answered By - Henry Ecker
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.