Issue
I have a data frame like this:
df:
C1 C2 C3
1 4 6
2 NaN 9
3 5 NaN
NaN 7 3
I want to concatenate the 3 columns into a single column with a comma as the separator, but I want the comma (",") to appear only between non-null values.
I tried this, but it doesn't handle the null values correctly:
df['New_Col'] = df[['C1','C2','C3']].agg(','.join, axis=1)
This gives me the output:
New_Col
1,4,6
2,,9
3,5,
,7,3
This is my ideal output:
New_Col
1,4,6
2,9
3,5
7,3
Can anyone help me with this?
Solution
Judging by your (wrong) output, you have a dataframe of strings, and the NaN values are actually empty strings. Otherwise the call would throw TypeError: expected str instance, float found, because NaN is a float.
pandas is not optimized for string data, so a plain Python list comprehension is probably the most efficient choice here.
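# Join only the non-empty strings of each row; select the three source columns explicitly
df['New_Col'] = [','.join([e for e in x if e]) for x in df[['C1','C2','C3']].values]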
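The list comprehension above relies on the missing entries being empty strings. If the columns hold real NaN floats instead, a rough sketch of the same idea could look like the following. The helper names is_missing and join_non_null are made up for illustration, and printing whole-number floats such as 4.0 as "4" is an assumption about the desired output, not something stated in the question.

```python
import math

def is_missing(value):
    """Return True for None, a float NaN, or an empty string."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value == ""

def join_non_null(values, sep=","):
    """Join the non-missing items of one row into a single string."""
    parts = []
    for v in values:
        if is_missing(v):
            continue
        # Assumption: whole-number floats (4.0) should be written as "4"
        if isinstance(v, float) and v.is_integer():
            v = int(v)
        parts.append(str(v))
    return sep.join(parts)

# Rows mirroring the data frame in the question, with real NaN floats
nan = float("nan")
rows = [
    [1, 4, 6],
    [2, nan, 9],
    [3, 5, nan],
    [nan, 7, 3],
]
new_col = [join_non_null(row) for row in rows]
print(new_col)  # ['1,4,6', '2,9', '3,5', '7,3']
```

With the data frame from the question, this could presumably be applied as df['New_Col'] = [join_non_null(x) for x in df[['C1','C2','C3']].values], since each row of .values is just a sequence of cell values.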
Answered By - New England cottontail