Issue
I am trying to sort a data frame column from [A1, A2, A3, ..., H12] to [A1, B1, C1, ..., H12], that is, by the number first and then by the letter.
This is what I have tried so far:
def key(row):
    match = re.match(r'(\d*)([A-H]\d+)', row)
    if match:
        num, letters = match.groups()
        return letters, int(num) if num else 0
    return row

df['Id'] = sorted(df['Id'], key=key)
But it is not sorting it correctly.
Sample Data Frame Column (Id):
A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12
B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 B11 B12
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12
D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12
E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 E11 E12
F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12
G1 G2 G3 G4 G5 G6 G7 G8 G9 G10 G11 G12
H1 H2 H3 H4 H5 H6 H7 H8 H9 H10 H11 H12
Expected Output (Id):
A1 B1 C1 D1 E1 F1 G1 H1
A2 B2 C2 D2 E2 F2 G2 H2
A3 B3 C3 D3 E3 F3 G3 H3
A4 B4 C4 D4 E4 F4 G4 H4
A5 B5 C5 D5 E5 F5 G5 H5
A6 B6 C6 D6 E6 F6 G6 H6
A7 B7 C7 D7 E7 F7 G7 H7
A8 B8 C8 D8 E8 F8 G8 H8
A9 B9 C9 D9 E9 F9 G9 H9
A10 B10 C10 D10 E10 F10 G10 H10
A11 B11 C11 D11 E11 F11 G11 H11
A12 B12 C12 D12 E12 F12 G12 H12
Solution
If you had a simple list instead of a dataframe, sorting would look like this:
sorted(df["Id"], key=lambda x: (int(x[1:]), x[0]))
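To see why the attempt in the question falls short and what this key does instead, here is a minimal sketch on a short hand-made list. The `ids` list and the `well_key` name are made up for illustration; they are not from the original post.

```python
import re

# Hypothetical stand-in for df["Id"], kept short on purpose
ids = ["A1", "A2", "A10", "B1", "B2", "B10"]

# The pattern from the question: the optional leading digits match nothing,
# so the whole Id lands in the second group and is compared as a string
print(re.match(r"(\d*)([A-H]\d+)", "A10").groups())  # ('', 'A10')

def well_key(well_id):
    # Number first (as an int), then the letter as a tie-breaker
    return int(well_id[1:]), well_id[0]

column_major = sorted(ids, key=well_key)
print(column_major)  # ['A1', 'B1', 'A2', 'B2', 'A10', 'B10']
```

Because the number is converted with `int`, `A10` sorts after `A2` rather than before it, which is what a plain string comparison would give.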
pd.DataFrame.sort_values() also has a key parameter. The only difference is that the function has to be vectorized: it receives the whole column as a vector and must return a vector of the same length. You can do something like this:
df.sort_values(by=["Id"], key=lambda col: list(zip(col.str[1:].apply(int), col.str[0])))
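If returning tuples from `key` feels fragile, one alternative (not the answerer's method, just a sketch) is to split the Id into temporary helper columns and sort on those. The column names `_num` and `_letter` are hypothetical, and the sketch assumes every Id is a single letter followed by digits, as in the sample data.

```python
import pandas as pd

# Rebuild the sample column from the question: A1..A12, B1..B12, ..., H1..H12
df = pd.DataFrame(
    {"Id": [f"{row}{num}" for row in "ABCDEFGH" for num in range(1, 13)]}
)

# Split each Id into its numeric part and its letter (hypothetical helper columns)
helpers = df.assign(
    _num=df["Id"].str[1:].astype(int),
    _letter=df["Id"].str[0],
)

# Sort by number first, then by letter, and drop the helpers again
result = (
    helpers.sort_values(["_num", "_letter"])
    .drop(columns=["_num", "_letter"])
    .reset_index(drop=True)
)

print(result["Id"].head(10).tolist())
# ['A1', 'B1', 'C1', 'D1', 'E1', 'F1', 'G1', 'H1', 'A2', 'B2']
```

The helper columns keep the numeric and letter parts as ordinary columns, so the sort relies only on plain multi-column `sort_values`.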
Answered By - Maria K