Issue
I'm having issues with one of my functions after switching from a regular Index to a MultiIndex, and I'm not sure how to address this. Let me take the DataFrame from the pandas documentation for pandas.DataFrame.at to illustrate the problem:
>>> df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
...                   index=[4, 5, 6], columns=['A', 'B', 'C'])
>>> df
    A   B   C
4   0   2   3
5   0   4   1
6  10  20  30
>>> df.at[4, 'B']
2
If you now convert this into a MultiIndex, the same call will fail and raise a KeyError:
>>> df = df.set_index("A", append=True)
>>> df
       B   C
  A
4 0    2   3
5 0    4   1
6 10  20  30
>>> df.at[4, 'B']
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    df.at[4, "B"]
    ~~~~~^^^^^^^^
  File "/.../pandas/core/indexing.py", line 2419, in __getitem__
    return super().__getitem__(key)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pandas/core/indexing.py", line 2371, in __getitem__
    return self.obj._get_value(*key, takeable=self._takeable)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pandas/core/frame.py", line 3882, in _get_value
    loc = engine.get_loc(index)
          ^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/index.pyx", line 822, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
KeyError: 4
This behavior would be fine if loc behaved the same way, which it doesn't:
>>> df.loc[4, 'B']
A
0    2
Name: B, dtype: int64
You can get around this by specifying all levels of the index of course...
>>> df.at[(4, 0), 'B']
2
but given that I have quite a number of MultiIndex levels, that does not seem like a feasible solution. And using loc and then appending .iloc[0] doesn't feel very pythonic either... Does anybody know how to make .at work without specifying more than the first level?
Solution
at is designed to select a single value in a DataFrame. From the documentation:
Access a single value for a row/column label pair.
Thus you must provide all indexers, i.e. a label for every level of the MultiIndex.
As you showed in your example, loc with an incomplete indexer yields a Series, not a value:
>>> df.loc[4, 'B']
A
0    2
Name: B, dtype: int64
This wouldn't be compatible with at's behavior of selecting a single value.
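To make the difference concrete, here is a small check on the two-level frame from the question. The variable names partial and full are only illustrative:

```python
import pandas as pd

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])
df = df.set_index("A", append=True)

# Partial key: loc drops the matched level and returns a Series.
partial = df.loc[4, 'B']
print(type(partial).__name__)  # Series

# Complete key (one label per index level): at returns a single value.
full = df.at[(4, 0), 'B']
print(full)  # 2
```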
The KeyError follows from how at is implemented: for a MultiIndex, the lookup goes through the index engine, which only accepts a complete key.
See the code in pandas/core/frame.py:
# For MultiIndex going through engine effectively restricts us to
# same-length tuples; see test_get_set_value_no_partial_indexing
loc = engine.get_loc(index)
return series._values[loc]
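If you really only want to pass the first level, one possible workaround is a small wrapper. This is a sketch, not something pandas provides: at_first_level is a hypothetical helper name, and it assumes the first-level label matches exactly one row. It uses DataFrame.xs to select on the first level and Series.item() to fail loudly if the match is not unique:

```python
import pandas as pd

def at_first_level(frame, key, column):
    """Hypothetical helper: scalar lookup using only the first index level."""
    # xs selects all rows whose first level equals key (KeyError if none).
    matches = frame.xs(key, level=0)[column]
    # item() raises ValueError unless exactly one row matched.
    return matches.item()

df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])
df = df.set_index("A", append=True)

print(at_first_level(df, 4, 'B'))  # 2
```

This goes through a regular selection rather than at's fast path, so it trades some speed for the convenience of a shorter key.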
Answered By - mozway