Issue
I have the following dataframe:
import pandas as pd

# dictionary with list objects as values
details = {
'Name' : ['D', 'C', 'F', 'G','A','N'],
'values' : ['21%','45%','10%',12,14,15],
}
df = pd.DataFrame(details)
The column values holds percentages; however, some were originally saved as strings with the % symbol and some as numbers. I would like to get rid of the % and have them all as int type. For that I used str.replace and then astype. However, when I replace the '%', the values that don't have % change to NaN:
df['values']=df['values'].str.replace('%', '')
df
>>> Name values
0 D 21
1 C 45
2 F 10
3 G NaN
4 A NaN
5 N NaN
My required output should be:
>>> Name values
0 D 21
1 C 45
2 F 10
3 G 12
4 A 14
5 N 15
My question is: how can I get rid of the % and keep all the values in the column without getting these NaN values? And why is this happening?
Solution
The column contains numeric values as well as strings, so the .str accessor returns missing values for the numeric entries. One possible solution is to use Series.replace with regex=True, which replaces by substring; because the result is then a mix of numbers and strings, convert the output to integers:
df['values']=df['values'].replace('%', '', regex=True).astype(int)
print (df)
Name values
0 D 21
1 C 45
2 F 10
3 G 12
4 A 14
5 N 15
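To see why the original attempt produced NaN, here is a minimal sketch on a small mixed Series (the names s and stripped are only illustrative). It suggests that the .str accessor operates on string elements only and returns NaN for anything else:

```python
import pandas as pd

# A mixed object Series: two strings and two plain integers
s = pd.Series(['21%', '45%', 12, 14])

# .str methods only act on string elements; non-strings come back as NaN
stripped = s.str.replace('%', '', regex=False)

print(stripped.isna().tolist())  # [False, False, True, True]
```

The integers were never strings, so there was nothing for the string method to operate on, which is why they show up as missing.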
Or keep your approach and fill the resulting missing values with the original ones:
df['values']=df['values'].str.replace('%', '').fillna(df['values']).astype(int)
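Another possible variant, not part of the answer above and offered only as a sketch, is to cast every element to a string first so that the .str accessor has something to work on for all rows. The name cleaned is just illustrative, and this assumes every entry is an integer optionally followed by %:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['D', 'C', 'F', 'G', 'A', 'N'],
    'values': ['21%', '45%', '10%', 12, 14, 15],
})

# Cast everything to str, strip a trailing '%', then convert to int
cleaned = df['values'].astype(str).str.rstrip('%').astype(int)
df['values'] = cleaned

print(df['values'].tolist())  # [21, 45, 10, 12, 14, 15]
```

If the column could also hold real missing values or decimals, astype(int) would raise an error, so this sketch only fits data shaped like the example.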
Answered By - jezrael