Issue
How do I replace only the integer values in the ID column with a sequence of consecutive numbers? I'd like any non-integer or NaN cells skipped.
Current df:
ID AMOUNT
1 0.00
test 5.00
test test 0.00
test 0.00
1 0.00
xx 304.95
x xx 304.95
1 0.00
1 0.00
xxxxx 0.00
1 0.00
xxx 0.00
xx xx 0.00
1 0.00
Desired Outcome:
ID AMOUNT
1 0.00
test 5.00
test test 0.00
test 0.00
2 0.00
xx 304.95
x xx 304.95
3 0.00
4 0.00
xxxxx 0.00
5 0.00
xxx 0.00
xx xx 0.00
6 0.00
I tried making a new column using np.arange(len(df)) and then replacing the ID values with that, but it's not giving me the expected outcome.
Thank you!
Solution
You can use:
df['ID'] = (pd
.to_numeric(df['ID'], errors='coerce') # convert to numeric
.cumsum() # increment numbers
.convert_dtypes().astype(object) # back to integer
.fillna(df['ID']) # restore non-numeric
)
Alternative using slicing and updating:
s = pd.to_numeric(df['ID'], errors='coerce')
df['ID'].update(s[s.notna()].cumsum().astype(int).astype(object))
output:
ID AMOUNT
0 1 0.00
1 test 5.00
2 test test 0.00
3 test 0.00
4 2 0.00
5 xx 304.95
6 x xx 304.95
7 3 0.00
8 4 0.00
9 xxxxx 0.00
10 5 0.00
11 xxx 0.00
12 xx xx 0.00
13 6 0.00
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.