Issue
I have a pandas dataframe that looks like this:
idx | A | B |
---|---|---|
01/01/01 00:00:01 | 5 | 2 |
01/01/01 00:00:02 | 4 | 5 |
01/01/01 00:00:03 | 5 | 4 |
02/01/01 00:00:01 | 3 | 8 |
02/01/01 00:00:02 | 7 | 4 |
02/01/01 00:00:03 | 1 | 3 |
I would like to group data based on its periodicity such that the final dataframe is:
new_idx | 01/01/01 | 02/01/01 | old_column |
---|---|---|---|
00:00:01 | 5 | 3 | A |
00:00:02 | 4 | 7 | A |
00:00:03 | 5 | 1 | A |
00:00:01 | 2 | 8 | B |
00:00:02 | 5 | 4 | B |
00:00:03 | 4 | 3 | B |
Is there a way to this that holds when the first dataframe gets big (more columns, more periods and more samples)?
Solution
One way is to melt
the DataFrame, then split the datetime to dates and times; finally pivot
the resulting DataFrame for the final output:
df = df.melt('idx', var_name='old_column')
df[['date','new_idx']] = df['idx'].str.split(expand=True)
out = df.pivot(['new_idx','old_column'], 'date', 'value').reset_index().rename_axis(columns=[None]).sort_values(by='old_column')
Output
new_idx old_column 01/01/01 02/01/01
0 00:00:01 A 5 3
2 00:00:02 A 4 7
4 00:00:03 A 5 1
1 00:00:01 B 2 8
3 00:00:02 B 5 4
5 00:00:03 B 4 3
Answered By - enke
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.