Issue
I have the following DataFrame
named pop:
California 2000 33871648
2010 37253956
New York 2000 18976457
2010 19378102
Texas 2000 20851820
2010 25145561
I want to print out values of 2010 in California and Texas. Whenever I try pop[['California','Texas'], 2010]
I meet the error '(['California', 'Texas'], 2010)' is an invalid key
How can I print the information then?
Solution
TLDR
df.loc[(level_1_indices_list, level_2_indices_list), :]
which is, in this case:
df.loc[(['California','Texas'], ['2010']), :]
Below is a more elaborated version.
# import packages & set seed
import numpy as np
import pandas as pd
np.random.seed(42)
Create example dataframe
Using the Pandas docs:
arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
s = pd.Series(np.random.randn(8), index=arrays)
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
This will look like:
Slicing using multiindex
With df
you can do:
df.loc[(['qux','foo'], 'one'), :]
For notational consistency you can use []
on the second element of the slice:
df.loc[(['qux','foo'], ['one']), :]
which will yield the same result.
Which is:
(Selecting 'one'
is equivalent to selecting 2010
in your df
. ['qux','foo']
should be equivalent to selecting ['California','Texas']. Based on this, I think you can apply the steps here to your data.)
This might also be helpful.
Answered By - zabop
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.