Issue
I have the following dataset:
0 1 2 3
xxx_qsaqx 2 2.2 2.3 2.4
xxx_gtygs 3 3.2 3.3 3.4
xxx_uytgo 4 4.2 4.3 4.4
xxx_ytghr 5 5.2 5.3 5.4
xxy_uyhga 6 6.2 6.3 6.4
xxy-uytei 7 7.2 7.3 7.4
xyy_uiyta 8 8.2 8.3 8.4
And I want to split it into the following 3 dataframes:
xxx-df
0 1 2 3
xxx_qsaqx 2 2.2 2.3 2.4
xxx_gtygs 3 3.2 3.3 3.4
xxx_uytgo 4 4.2 4.3 4.4
xxx_ytghr 5 5.2 5.3 5.4
xxy-df
0 1 2 3
xxy_uyhga 6 6.2 6.3 6.4
xxy-uytei 7 7.2 7.3 7.4
xyy-df
0 1 2 3
xyy_uiyta 8 8.2 8.3 8.4
Note that the strings are the row indices, while 0, 1, 2, and 3 are the column labels of the dataframes.
Solution
Don't try to generate variables dynamically; this is considered bad practice.
Instead, collect the DataFrames in a dictionary, using str.extract to get the prefix and groupby to split the groups:
grouper = df.index.str.extract('^([^-_]+)', expand=False)
dfs = {k: g for k,g in df.groupby(grouper)}
Output:
{'xxx': 0 1 2 3
xxx_qsaqx 2 2.2 2.3 2.4
xxx_gtygs 3 3.2 3.3 3.4
xxx_uytgo 4 4.2 4.3 4.4
xxx_ytghr 5 5.2 5.3 5.4,
'xxy': 0 1 2 3
xxy_uyhga 6 6.2 6.3 6.4
xxy-uytei 7 7.2 7.3 7.4,
'xyy': 0 1 2 3
xyy_uiyta 8 8.2 8.3 8.4}
You can then access each sub-DataFrame by its key:
dfs['xxx']
0 1 2 3
xxx_qsaqx 2 2.2 2.3 2.4
xxx_gtygs 3 3.2 3.3 3.4
xxx_uytgo 4 4.2 4.3 4.4
xxx_ytghr 5 5.2 5.3 5.4
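For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the question's data by hand (the construction of df is an assumption, since the original post does not show how the dataframe was created) and then applies the same str.extract and groupby steps. The pattern ^([^-_]+) captures everything before the first "-" or "_", which is why xxy_uyhga and xxy-uytei end up in the same group.

```python
import pandas as pd

# Rebuild the example data from the question (assumed construction;
# the string labels are the row index, 0..3 are the default columns).
df = pd.DataFrame(
    [[2, 2.2, 2.3, 2.4],
     [3, 3.2, 3.3, 3.4],
     [4, 4.2, 4.3, 4.4],
     [5, 5.2, 5.3, 5.4],
     [6, 6.2, 6.3, 6.4],
     [7, 7.2, 7.3, 7.4],
     [8, 8.2, 8.3, 8.4]],
    index=["xxx_qsaqx", "xxx_gtygs", "xxx_uytgo", "xxx_ytghr",
           "xxy_uyhga", "xxy-uytei", "xyy_uiyta"],
)

# Capture everything before the first "-" or "_" in each index label.
grouper = df.index.str.extract(r"^([^-_]+)", expand=False)

# One sub-DataFrame per prefix, keyed by the prefix string.
dfs = {key: group for key, group in df.groupby(grouper)}

# Show which keys exist and how many rows/columns each group has.
for key, group in dfs.items():
    print(key, group.shape)
```

Keeping the pieces in a dictionary means you can loop over them or look one up by name (dfs['xxy']) without creating a separate variable for each prefix.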
Answered By - mozway