Issue
I want to use the clean_currency function below to strip commas and dollar signs from the Savings ($) column and store the result in a new column called 'Savings_Clean'.
def clean_currency(curr):
    return float(curr.replace(",", "").replace("$", ""))

clean_currency("$60,000")  # the output shows that the function works
Output: 60000.0
How do I clean the Savings ($) column? When I call clean_currency(df.Savings), I get the following error:
TypeError: cannot convert the series to <class 'float'>
Solution
To make it work with both ordinary strings and pd.Series objects that contain strings, an if guard can be used:
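def clean_currency(curr):
    # A plain string: strip the symbols and convert directly.
    if isinstance(curr, str):
        return float(curr.replace(",", "").replace("$", ""))
    # Otherwise assume a Series of strings; regex=False keeps "$" literal.
    stripped = curr.str.replace(",", "", regex=False).str.replace("$", "", regex=False)
    return stripped.astype(float)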
Now the function checks the argument curr: if it is a str, it converts it with float; otherwise it assumes the argument is a Series and uses astype. (Note that the str accessor is needed to do the replacements on a Series.)
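Why the str accessor matters can be seen in a small sketch. This is an illustration rather than part of the original answer, and the sample values are made up. Series.replace matches whole cell values, so it leaves these strings untouched, and float() then receives a Series instead of a single value, which is where the TypeError from the question comes from.

```python
import pandas as pd

s = pd.Series(["$60,000", "$120,000"])  # made-up sample values

# Series.replace matches whole cell values. No cell is exactly ",",
# so nothing is replaced and the Series comes back unchanged.
unchanged = s.replace(",", "")

# Series.str.replace works on substrings inside each string.
# regex=False treats "$" as a literal character, not a regex anchor.
stripped = s.str.replace(",", "", regex=False).str.replace("$", "", regex=False)

# float() needs a single value; a multi-element Series raises TypeError.
try:
    float(unchanged)
except TypeError as exc:
    error_message = str(exc)
```

After this runs, stripped still holds strings ("60000", "120000"), which is why the solution finishes with astype(float).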
>>> clean_currency("$60,000")
60000.0
>>> clean_currency(pd.Series(["$60,000", "$120,000", "$1,000,000"]))
0      60000.0
1     120000.0
2    1000000.0
dtype: float64
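To tie this back to the question, here is a minimal sketch of creating the 'Savings_Clean' column. The DataFrame below is hypothetical, since the question does not show the real df, and it assumes the column is literally named "Savings ($)". If the column is actually named Savings, df.Savings works the same way.

```python
import pandas as pd

def clean_currency(curr):
    # Same guard as above: plain strings go through float directly.
    if isinstance(curr, str):
        return float(curr.replace(",", "").replace("$", ""))
    # Series of strings: literal replacements, then convert the dtype.
    stripped = curr.str.replace(",", "", regex=False).str.replace("$", "", regex=False)
    return stripped.astype(float)

# Hypothetical sample data standing in for the DataFrame from the question.
df = pd.DataFrame({"Savings ($)": ["$60,000", "$120,000", "$1,000,000"]})

# Bracket indexing is required here because the column name contains
# a space and symbols, so attribute access would not work.
df["Savings_Clean"] = clean_currency(df["Savings ($)"])
```

The original column is left as it was, and Savings_Clean holds the float64 values.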
Answered By - Mustafa Aydın