Issue
I'm having a beginning to the language issue with unpivoting a table. I'm hoping that it's just a vocabulary thing and I'll be off and running. I have a table with three dimensions, within the dimensions, there are three elements and this table covers three time periods. The tables that I work with are more complex than that, but this one is already ridiculously wide.
male_female | Total | Total | Total | Total | Total | Total | Total | Total | Total | M | M | M | M | M | M | M | M | M | F | F | F | F | F | F | F | F | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
adult_youth | Total | Total | Total | A | A | A | Y | Y | Y | Total | Total | Total | A | A | A | Y | Y | Y | Total | Total | Total | A | A | A | Y | Y | Y |
urban_rural | Total | U | R | Total | U | R | Total | U | R | Total | U | R | Total | U | R | Total | U | R | Total | U | R | Total | U | R | Total | U | R |
2021 | 46 | 22 | 24 | 22 | 11 | 11 | 24 | 11 | 13 | 22 | 8 | 14 | 5 | 1 | 4 | 17 | 7 | 10 | 24 | 14 | 10 | 17 | 10 | 7 | 7 | 4 | 3 |
2020 | 48 | 22 | 26 | 22 | 11 | 11 | 26 | 11 | 15 | 26 | 10 | 16 | 7 | 2 | 5 | 19 | 8 | 11 | 22 | 12 | 10 | 15 | 9 | 6 | 7 | 3 | 4 |
2019 | 50 | 22 | 28 | 22 | 11 | 11 | 28 | 11 | 17 | 30 | 12 | 18 | 9 | 3 | 6 | 21 | 9 | 12 | 20 | 10 | 10 | 13 | 8 | 5 | 7 | 2 | 5 |
I want it to look like:
year | male_female | adult_youth | urban_rural | Value | |
---|---|---|---|---|---|
0 | 2021 | Total | Total | Total | 46 |
... | |||||
40 | 2020 | M | A | U | 2 |
... | |||||
80 | 2019 | F | Y | R | 5 |
I'm just at a loss on how to transpose it into something that can be pivoted or crosstabbed with modifications. I've tried stack and unstack, melt, transpose, wide_to_long, but I just can't get the syntax, or the procedure for peeling off layer. This has been a manual process, but it's somethat that if I could master, would allow me to use Excel and Tableau less, and get more done.
Solution
Reading a Multi-Dimensional Table in CSV format:
df = pd.read_csv('df_pivoted.csv', header=[0,1,2], index_col=[0])
...
male_female Total M F
adult_youth Total A Y Total A Y Total A Y
urban_rural Total U R Total U R Total U R Total U R Total U R Total U R Total U R Total U R Total U R
2021 46 22 24 22 11 11 24 11 13 22 8 14 5 1 4 17 7 10 24 14 10 17 10 7 7 4 3
2020 48 22 26 22 11 11 26 11 15 26 10 16 7 2 5 19 8 11 22 12 10 15 9 6 7 3 4
2019 50 22 28 22 11 11 28 11 17 30 12 18 9 3 6 21 9 12 20 10 10 13 8 5 7 2 5
Doing:
df.index.name = 'year'
out = df.unstack().reset_index(name='value')
print(out)
Output:
male_female adult_youth urban_rural year value
0 Total Total Total 2021 46
1 Total Total Total 2020 48
2 Total Total Total 2019 50
3 Total Total U 2021 22
4 Total Total U 2020 22
.. ... ... ... ... ...
76 F Y U 2020 3
77 F Y U 2019 2
78 F Y R 2021 3
79 F Y R 2020 4
80 F Y R 2019 5
Answered By - BeRT2me
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.