Issue
i'm a beginner using pandas to look at a csv. i'm using .iterrows() to check whether a given record matches today's date, so far so good. however, when i call row.name on a csv with a column headed 'name', i get different output than if i rename the column and change the attribute access (row.<column heading>) to match. i can call the column anything but "name" and get the right output. i tried row.notthename, row.fish and row.thisisodd, which all worked fine, before coming here.
if the first column in birthdays.csv is "name" and i call print(row.name), it prints "2". if the first column is "notthename" and i call print(row.notthename), it prints the relevant name. what gives? i don't understand why renaming the column and the attribute i access yields different output.
eg case A: column named "name"
birthdays.csv:
name,email,year,month,day
a test name,[email protected],1961,12,21
testerito,[email protected],1985,02,23
testeroonie,[email protected],2022,01,17
import datetime as dt
import pandas

data = pandas.read_csv("birthdays.csv")
for (index, row) in data.iterrows():
    if (dt.datetime.now()).month == row.month and (dt.datetime.now()).day == row.day:
        print(row.name)
outputs "2"
whereas case B: column named "notthename"
data = pandas.read_csv("birthdays.csv")
for (index, row) in data.iterrows():
    if (dt.datetime.now()).month == row.month and (dt.datetime.now()).day == row.day:
        print(row.notthename)
outputs "testeroonie"
i'm missing something.... is there some special handling of "name" going on?
thanks for helping me learn!
Solution
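This happens because DataFrame.iterrows yields (index, Series) pairs, and every Series object has a built-in attribute called name. For a row produced by iterrows, that attribute holds the row's index label, so row.name returns the label (here 2, the third row) instead of the value in the "name" column. This is why using attribute access as a shortcut for column names, although convenient, can be dangerous. The dictionary notation doesn't have this issue: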
print(row['name'])
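To see the difference side by side, here is a minimal sketch. It assumes an in-memory copy of the CSV (the email addresses and the CSV_TEXT name are made up for illustration) so it runs without a birthdays.csv file on disk. It collects what row.name and row["name"] give for each row:

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for birthdays.csv; the addresses are invented.
CSV_TEXT = """name,email,year,month,day
a test name,a@example.com,1961,12,21
testerito,b@example.com,1985,02,23
testeroonie,c@example.com,2022,01,17
"""

data = pd.read_csv(io.StringIO(CSV_TEXT))

attribute_results = []  # what row.name gives: the Series' own name attribute
bracket_results = []    # what row["name"] gives: the value in the "name" column

for index, row in data.iterrows():
    attribute_results.append(row.name)
    bracket_results.append(row["name"])

print(attribute_results)  # the row index labels, not the names
print(bracket_results)    # the actual contents of the "name" column
```

With the default index, the first list holds the row positions 0, 1 and 2, which is where the "2" in the question comes from. Bracket access is the safer habit whenever a column name might collide with an existing Series attribute or method.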
Answered By - Tim Roberts