Issue
I have scoured this forum and have yet to find a solution to my problem. I am using pandas DataFrames, and I need to sort a column, which is of type string because it's alphanumeric, by two attributes: length first, then alphanumerically. The input will be something like this:
input = [10, 100110, 222754430, 777000, TEST10, 800022110, 210, 1960, 30, TERM20, 22100, 22300, 487854750, TEST20, 2200010, 220, 20, 22200, 1100, 2200020]
output = [10, 20, 30, 210, 220, 1100, 1960, 22100, 22200, 22300, TERM20, TEST10, TEST20, 100110, 777000, 2200010, 2200020, 222754430, 487854750, 800022110]
If someone could help, that would be great! Any extra info can be provided.
I tried sorting by length, which partially worked, but not entirely. I also tried sort_values, but it compares the strings character by character and produces something like: 10, 100110, 210, 220, 777000, etc. This has been stumping me for something that feels so trivial.
Solution
I hope I've understood your question right. You can try:
lst = [
    10,
    100110,
    222754430,
    777000,
    "TEST10",
    800022110,
    210,
    1960,
    30,
    "TERM20",
    22100,
    22300,
    487854750,
    "TEST20",
    2200010,
    220,
    20,
    22200,
    1100,
    2200020,
]

def key_fn(val):
    # Key is (length, is_number, text): shorter values sort first, and at
    # equal length strings (False) sort before numbers (True).
    if isinstance(val, str):
        return len(val), False, val
    else:
        s = str(val)
        return len(s), True, s

output = sorted(lst, key=key_fn)
print(output)
Prints:
[10, 20, 30, 210, 220, 1100, 1960, 22100, 22200, 22300, 'TERM20', 'TEST10', 'TEST20', 100110, 777000, 2200010, 2200020, 222754430, 487854750, 800022110]
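Since the column in the question holds strings rather than a mix of ints and strings, the same idea can be sketched for an all-string list. This is only an illustration: the name length_then_alpha is made up here, and using str.isdigit() to detect numeric values is an assumption that fits the sample data (it would not treat signed or decimal numbers as numeric). Python compares the key tuples element by element, so length decides first, then the numeric flag, then the text itself.

```python
def length_then_alpha(value):
    """Sort key: (length, is_numeric, text).

    Shorter strings come first; at equal length, non-numeric strings
    (False) sort before all-digit strings (True); remaining ties are
    broken by ordinary string comparison.
    """
    return len(value), value.isdigit(), value


# The sample input from the question, with every entry as a string.
values = [
    "10", "100110", "222754430", "777000", "TEST10", "800022110", "210",
    "1960", "30", "TERM20", "22100", "22300", "487854750", "TEST20",
    "2200010", "220", "20", "22200", "1100", "2200020",
]

result = sorted(values, key=length_then_alpha)
print(result)
```

The resulting order matches the expected output in the question, with TERM20, TEST10 and TEST20 placed ahead of the six-digit numbers.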
EDIT: To apply it to the dataframe you can do e.g.:
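import pandas as pd

def key_fn(series):
    def _to_int(val):
        # Numeric strings become ints; anything else is returned unchanged.
        try:
            return int(val)
        except (TypeError, ValueError):
            return val

    def _inner(val):
        val = _to_int(val)
        if isinstance(val, str):
            return len(val), False, val
        else:
            s = str(val)
            return len(s), True, s

    # sort_values calls the key with the whole column and expects a Series
    # of the same shape back.
    return pd.Series([_inner(val) for val in series], index=series.index)

# df is the existing DataFrame; "column1" is the column to sort.
df = df.sort_values(by=["column1"], key=key_fn)
print(df)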
Prints:
      column1
0          10
16         20
8          30
6         210
15        220
18       1100
7        1960
10      22100
17      22200
11      22300
9      TERM20
4      TEST10
13     TEST20
1      100110
3      777000
14    2200010
19    2200020
2   222754430
12  487854750
5   800022110
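As an alternative sketch (not the answer's method), the same ordering can be obtained without a key function by building helper columns and sorting on them. The function name sort_by_length_then_alpha and the helper column names are hypothetical, and the sketch assumes every value can be treated as a string and that "numeric" means all digits, as in the sample data.

```python
import pandas as pd


def sort_by_length_then_alpha(frame, column):
    """Return frame sorted by string length, then non-numeric first, then text."""
    # Work on a positional copy of the column so the result does not
    # depend on the frame's own index labels.
    text = frame[column].astype(str).reset_index(drop=True)
    helper = pd.DataFrame(
        {
            "length": text.str.len(),          # primary key: string length
            "is_numeric": text.str.isdigit(),  # False (non-numeric) sorts first
            "text": text,                      # final tie-breaker
        }
    )
    positions = helper.sort_values(["length", "is_numeric", "text"]).index.to_numpy()
    return frame.iloc[positions]


# Sample data from the question, stored as strings.
df = pd.DataFrame(
    {
        "column1": [
            "10", "100110", "222754430", "777000", "TEST10", "800022110",
            "210", "1960", "30", "TERM20", "22100", "22300", "487854750",
            "TEST20", "2200010", "220", "20", "22200", "1100", "2200020",
        ]
    }
)

sorted_df = sort_by_length_then_alpha(df, "column1")
print(sorted_df)
```

The helper columns mirror the three elements of the tuple key above, so the rows come out in the same order while the original index labels are preserved.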
Answered By - Andrej Kesely