Issue
I am trying to understand why 5 in df['ids'] in the code below returns True, since the number 5 doesn't exist in that column of the pandas DataFrame.
In [82]: df = pd.DataFrame()
In [83]: df['ids'] = list(range(5)) + list(range(6,10))
In [84]: df
Out[84]:
   ids
0    0
1    1
2    2
3    3
4    4
5    6
6    7
7    8
8    9
In [85]: 5 in df['ids']
Out[85]: True
In [86]: df[df['ids'] == 5]
Out[86]:
Empty DataFrame
Columns: [ids]
Index: []
In [87]: 5 in list(df['ids'])
Out[87]: False
Solution
The trick here is to understand that pandas objects care about the .index a lot. Because of the .index, a Series can support near-dictionary-like behavior. From this perspective, just as the in operator on a dictionary checks for key existence, it makes some sense that the in operator on a pandas object checks for index existence.
So using the above, we can check:
>>> import pandas as pd
>>> s = pd.Series([1,2,3,4,6,7,8,9])
>>> s.index
RangeIndex(start=0, stop=8, step=1)
>>> 5 in s # implicitly check if 5 is in s.index
True
>>> 5 in s.index # explicitly check if 5 is in s.index
True
>>> 5 in s.values # explicitly check if 5 is in values
False
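The dictionary analogy is easier to see when the index labels and the values cannot be confused with each other. The following is a minimal sketch, assuming a made-up Series named prices with string labels (neither the name nor the data comes from the question):

```python
import pandas as pd

# Hypothetical Series: string labels, integer values.
prices = pd.Series([10, 20, 30], index=["a", "b", "c"])

label_hit = "a" in prices          # membership test against the index labels
value_via_in = 10 in prices        # 10 is a value, not a label
value_hit = 10 in prices.values    # membership test against the underlying array

# The same pattern with a plain dict: the in operator looks at keys only.
as_dict = prices.to_dict()
dict_key_hit = "a" in as_dict
dict_value_hit = 10 in as_dict.values()

print(label_hit, value_via_in, value_hit)  # True False True
print(dict_key_hit, dict_value_hit)        # True True
```

With the default RangeIndex from the question, the labels happen to be integers too, which is why 5 in df['ids'] looks like a value check even though it is a label check.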
As an alternative way to check whether a value is in a Series, you can use some boolean logic:
>>> (5 == s).any()
False
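Applied to the DataFrame from the question, the value-based checks might look like the sketch below. The variable names are illustrative only, and Series.isin and Series.to_numpy are shown as further options that the answer above does not itself mention:

```python
import pandas as pd

# Rebuild the DataFrame from the question: values 0-4 and 6-9, so 5 is missing.
df = pd.DataFrame({"ids": list(range(5)) + list(range(6, 10))})

by_index = 5 in df["ids"]                   # checks the index labels 0..8
by_eq = bool((df["ids"] == 5).any())        # element-wise comparison on values
by_isin = bool(df["ids"].isin([5]).any())   # isin also accepts several candidates
by_numpy = 5 in df["ids"].to_numpy()        # plain membership on the NumPy array

print(by_index, by_eq, by_isin, by_numpy)   # True False False False
```

Only the first check reports True, because 5 is a valid row label but not a stored value.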
Also see this answer.
Answered By - Cameron Riddell