Issue
I have a pandas dataframe:
city crime pollution ... housing weather Similarity
0 Atlanta GA 101 97 ... 119 60 2.75
2 Baltimore MD 103 126 ... 103 80 2.50
4 San Antonio TX 98 126 ... 100 65 2.50
1 Akron OH 100 119 ... 97 70 1.75
3 Charleston SC 106 127 ... 82 75 1.50
I want to first sort based on Similarity, descending. Got that code:
df.sort_values('Similarity', ascending=False)
But if there is a tie, I want to sort based on a user input list. Of the 8 columns in my data, users can add UP TO 5 to the preference list, which is stored in order of adding it:
userPrefs = []
# ex: userPrefs = ['crime','education','weather','pollution']
So if there is a tie (such as between df[2] and df[4]), I need the code to then sort based on the list. Got that code:
df.sort_values(userPrefs[0], ascending=False).sort_values(userPrefs[1], ascending=False) # -- etc...
.sort_values('Similarity', ascending=False)
The issue I'm facing is that depending on the column added to userPrefs, the sorting may need to be ascending or it may need to be descending. So if 'crime' == userPref[0]
, I want it to be ASC (lowest crime is best); however, if 'education' == userPref[0]
, I want to sort DESC (higher education is best).
How do I sort based on a condition? I was thinking :
ascList = ['crime','housing','pollution'] # the lower, the better
descList = ['education'] # the higher, the better
df.sort_values(userPrefs[0], ascending= if x in ascList)
..but that doesn't work and I'm not clear on lambda functions. Thanks!
Solution
You don't need lambda functions, create parameters by=
and ascending=
of the function .sort_values()
on the fly:
# use sets here:
ascList = {"crime", "housing", "pollution"}
descList = {"education"}
# example userPrefs:
userPrefs = ["crime", "weather", "pollution"]
df = df.sort_values(
by=["Similarity", *userPrefs], ascending=[False, *[p in ascList for p in userPrefs]]
)
print(df)
Prints:
city crime pollution housing weather Similarity
0 Atlanta GA 101 97 119 60 2.75
4 San Antonio TX 98 126 100 65 2.50
2 Baltimore MD 103 126 103 80 2.50
1 Akron OH 100 119 97 70 1.75
3 Charleston SC 106 127 82 75 1.50
Answered By - Andrej Kesely
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.