Issue
I have a DataFrame like the one below:
strategy | duration | sales | revenue |
---|---|---|---|
TP1 | 22 | 2000 | 1800 |
TP2 | 10 | 3000 | 1500 |
TP3 | 5 | 1000 | 1200 |
I would like to order it using the strategy column values.
This is what I have tried:
order = ['TP3', 'TP1', 'TP2']
df = df.set_index('strategy').reindex(order).reset_index()
When I run this code, I get NaN values, as shown below:
Strategy duration Sales Revenue
TP3 NaN NaN NaN
TP1 22 2000 1800
TP2 NaN NaN NaN
Important note: I am doing this over multiple files, reading each file with pd.read_csv(file) and then applying the code above. Some files will have only 2 of the strategies, not all 3.
Appreciate the help!
Solution
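Try the following. Converting the strategy column to an ordered categorical makes sort_values follow your custom order, and a strategy that is missing from a file is simply absent from the result instead of appearing as a row of NaN values:
import pandas as pd

order = ['TP3', 'TP1', 'TP2']
# Make 'strategy' an ordered categorical so that sorting follows `order`
df['strategy'] = pd.Categorical(df['strategy'], categories=order, ordered=True)
# Sort the DataFrame based on the 'strategy' column
df = df.sort_values('strategy').reset_index(drop=True)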
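For the multi-file case mentioned in the question, the same idea can be wrapped in a small helper. This is only a minimal sketch under stated assumptions: the function name sort_by_strategy and the sample CSV text are hypothetical, and stripping whitespace from the strategy values is a guess at why reindex returned NaN (a label such as 'TP1 ' does not match 'TP1').

```python
import io

import pandas as pd

ORDER = ['TP3', 'TP1', 'TP2']


def sort_by_strategy(df, order=ORDER):
    """Return a copy of df sorted by 'strategy' following `order` (hypothetical helper)."""
    out = df.copy()
    # Assumption: labels may carry stray whitespace, e.g. 'TP1 ' instead of 'TP1'
    out['strategy'] = out['strategy'].astype(str).str.strip()
    # Ordered categorical: sort_values then follows `order`, not alphabetical order
    out['strategy'] = pd.Categorical(out['strategy'], categories=order, ordered=True)
    return out.sort_values('strategy').reset_index(drop=True)


# Hypothetical file content containing only two of the three strategies
csv_text = "strategy,duration,sales,revenue\nTP1 ,22,2000,1800\nTP3,5,1000,1200\n"
df = sort_by_strategy(pd.read_csv(io.StringIO(csv_text)))
result_order = df['strategy'].astype(str).tolist()
print(df)
```

In a real run you would call sort_by_strategy(pd.read_csv(file)) for each file. Note that any strategy value not listed in order becomes NaN in the strategy column and is sorted to the end, so it is worth checking for unexpected labels first.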
Answered By - Ugochukwu Obinna