Issue
I am working on the Amazon Alexa review ratings dataset.
I loaded the dataset as below:
df_alexa = pd.read_csv('amazon_alexa.tsv', sep='\t')
I wanted to add a new feature called length, as shown below:
df_alexa['length'] = df_alexa['verified_reviews'].apply(len)
But I get the error below:
TypeError: object of type 'float' has no len()
Any assistance, please?
Solution
I think this error is caused by rows where verified_reviews is NaN. pandas stores a missing value as a float, and a float has no len(), so apply(len) fails on those rows. You have two options: drop the rows with no review before calculating the length, or return 0 for them. First I tried it with dropna on a subset, which removes the rows that have no review (I didn't figure out how to show those rows, though):
import pandas as pd
df = pd.read_csv('amazon_alexa.tsv', sep='\t')
df.dropna(subset=['verified_reviews'], inplace=True)
df['length'] = df['verified_reviews'].apply(len)
print(df)
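If you want to look at the rows with no review before dropping them, a boolean mask from isna() should do it. This is only a minimal sketch: the small DataFrame is made up for illustration so the snippet runs without the file, and only the column name verified_reviews comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for amazon_alexa.tsv (invented rows, real column name).
df = pd.DataFrame({"verified_reviews": ["Love my Echo!", np.nan, "Great"]})

# Boolean mask: True where the review is missing (NaN).
missing = df["verified_reviews"].isna()

print(missing.sum())  # number of rows with no review
print(df[missing])    # the rows themselves
```

With the real file you would build the mask the same way on the DataFrame returned by read_csv.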
Below is the code for when you want to keep the rows with no review; the function returns 0 as the length for them:
import pandas as pd
def calculate_length(review):
    # Missing reviews (NaN) get length 0; everything else is measured as a string
    if pd.notna(review):
        return len(str(review))
    else:
        return 0
df = pd.read_csv('amazon_alexa.tsv', sep='\t')
df['length'] = df['verified_reviews'].apply(calculate_length)
print(df)
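As an alternative to a custom function, the same result can probably be had with the vectorized string accessor. A minimal sketch, again on a made-up DataFrame (only the column name verified_reviews is from the question): .str.len() leaves missing values as NaN, so they are filled with 0 afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for amazon_alexa.tsv (invented rows, real column name).
df = pd.DataFrame({"verified_reviews": ["Love my Echo!", np.nan, "Great"]})

# .str.len() gives NaN for missing reviews; fillna(0) turns those into 0,
# and astype(int) makes the column integer again.
df["length"] = df["verified_reviews"].str.len().fillna(0).astype(int)
print(df)
```

Note that, unlike calculate_length above, this does not convert non-string values with str() first, so it assumes the column holds only strings and missing values.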
Let me know if it worked, and if not, please comment.
Answered By - Fridolin