Issue
I am trying to calculate the sum of all the comma-separated numbers in each row of a DataFrame column, but I keep getting an error. This is what the DataFrame looks like:
Description  scores
logo
graphics
eyewear      0.360740,-0.000758
glasses      0.360740,-0.000758
picture      -0.000646
tutorial     0.001007,0.000968,0.000929,0.000889
computer     0.852264 0.001007,0.000968,0.000929,0.000889
This is what the code looks like:
test['Sum'] = test['scores'].apply(lambda x: sum(map(float, x.split(','))))
However, I keep getting the following error:
ValueError: could not convert string to float:
I thought it could be because of the missing values at the start of the DataFrame, but I subset the DataFrame to exclude the missing values and I still get the same error.
Expected output:
Description  scores                                            Sum
logo
graphics
eyewear      0.360740,-0.000758                                0.359982
glasses      0.360740,-0.000758                                0.359982
picture      -0.000646                                         -0.000646
tutorial     0.001007,0.000968,0.000929,0.000889               0.003793
computer     0.852264 0.001007,0.000968,0.000929,0.000889      0.856057
How can I resolve it?
Solution
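Plain Python inside apply works well here. The error comes from rows where scores is an empty string: ''.split(',') returns [''], and float('') raises the ValueError shown above. Return NaN for those rows and sum the rest:
import numpy as np
df['scores'].apply(lambda x: sum(float(i) for i in x.split(',')) if len(x) > 0 else np.nan)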
0 NaN
1 NaN
2 0.359982
3 0.359982
4 -0.000646
5 0.003793
6 0.856057
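If the column can also hold real missing values (NaN or None rather than empty strings), or if some values are separated by whitespace, as the computer row in the question appears to be, the lambda above would still fail. A minimal sketch of a more defensive helper follows. The name `sum_scores` and the choice to treat whitespace as a separator are assumptions made for illustration; neither comes from the question.

```python
import math
import re

def sum_scores(cell):
    """Sum the numbers in one cell; return NaN if the cell holds none."""
    # Missing cells may arrive as NaN (a float) or None instead of a string.
    if not isinstance(cell, str):
        return math.nan
    # Assumption: commas and whitespace both separate values.
    # Empty pieces are dropped so '' never reaches float().
    parts = [p for p in re.split(r'[,\s]+', cell) if p]
    if not parts:
        return math.nan
    return sum(float(p) for p in parts)

# Sample cells copied from the question, plus an empty and a missing cell.
cells = [
    '',
    None,
    '0.360740,-0.000758',
    '-0.000646',
    '0.001007,0.000968,0.000929,0.000889',
    '0.852264 0.001007,0.000968,0.000929,0.000889',
]
results = [sum_scores(c) for c in cells]
```

With the DataFrame from the question, this could be applied as test['Sum'] = test['scores'].apply(sum_scores).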
Answered By - Vaishali