Issue
I have a dataframe that I am trying to manipulate to show how accident severity differs between dark and light conditions.
This is a sample of the df, which has 200k entries.
SEVERITYCODE LIGHTCOND
0 Injury Light
1 Damage Dark
2 Damage Light
3 Damage Light
4 Injury Light
5 Damage Light
6 Damage Light
7 Injury Light
8 Damage Light
9 Injury Light
10 Damage Light
11 Damage Light
12 Damage Dark
13 Damage Dark
14 Injury Dark
15 Damage Dark
16 Injury Light
17 Damage Light
18 Injury Light
19 Damage Dark
20 Injury Dark
I need to reshape this data into a df that looks something like the one below, where the number of dark-injury occurrences is in the upper left-hand box, dark-damage is in the upper right-hand box, and so on.
Injury Damage
Dark: 10023 1132
Light: 1234 98474
How do I make Python count across columns like this? I wasn't sure whether the data in the included picture is required to help me out or not.
I then want to turn it into a stacked bar graph for easy visualization, which I think I can manage with other tutorials.
Thanks
Solution
(
df.groupby(['LIGHTCOND', 'SEVERITYCODE']) # create a groupby object
.size() # aggregate by counting the rows in each group
.unstack() # move the inner-most index level to columns, i.e. 'SEVERITYCODE'
)
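To try this out end to end, here is a minimal sketch that rebuilds the 21 sample rows from the question and applies the same groupby/size/unstack chain. The variable names (`severity`, `light`, `counts`, `crosstab`) are just illustrative, `fill_value=0` and the column reordering are optional additions that are not part of the answer above, and `pd.crosstab` is shown only as an alternative that should give the same counts.

```python
import pandas as pd

# The 21 sample rows from the question, in order (index 0..20).
severity = ["Injury", "Damage", "Damage", "Damage", "Injury", "Damage", "Damage",
            "Injury", "Damage", "Injury", "Damage", "Damage", "Damage", "Damage",
            "Injury", "Damage", "Injury", "Damage", "Injury", "Damage", "Injury"]
light = ["Light", "Dark", "Light", "Light", "Light", "Light", "Light",
         "Light", "Light", "Light", "Light", "Light", "Dark", "Dark",
         "Dark", "Dark", "Light", "Light", "Light", "Dark", "Dark"]
df = pd.DataFrame({"SEVERITYCODE": severity, "LIGHTCOND": light})

counts = (
    df.groupby(["LIGHTCOND", "SEVERITYCODE"])  # one group per (light, severity) pair
    .size()                                    # count the rows in each group
    .unstack(fill_value=0)                     # SEVERITYCODE becomes the columns; 0 for missing pairs
)

# Optional: put the columns in the order asked for in the question.
counts = counts[["Injury", "Damage"]]

# Alternative: pd.crosstab builds the same table of counts in one call.
crosstab = pd.crosstab(df["LIGHTCOND"], df["SEVERITYCODE"])

print(counts)
```

For the stacked bar graph mentioned in the question, `counts.plot(kind="bar", stacked=True)` is one option, assuming matplotlib is installed.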
Answered By - RichieV