Issue
I have an xlsx file and the same data in csv format. I read them using pandas.read_excel and pandas.read_csv respectively and store the results in DataFrames. If I print the stored values, I see no difference, but when I sort and compare them, I see differences between the two. Can someone help?
df = fields_df.sort_values(['register',1], ascending=[False, False])
df1 = fields_df1.sort_values(['register',1], ascending=[False, False])
fields_df and fields_df1 are the two DataFrames read from the xlsx and csv files respectively.
Before sorting, fields_df and fields_df1 compare as exactly the same, but after sorting I see differences in the output.
Solution
You can investigate the differences between the two DataFrames with:
# compare() requires both frames to have identical row and column labels
out = fields_df.compare(fields_df1)
print(out)
Then inspect the rows in the out DataFrame, e.g. for different types, trailing spaces in string columns, or columns that are parsed as datetimes only in the Excel DataFrame. Checking the dtypes of both DataFrames helps here:
print(fields_df.dtypes)
print(fields_df1.dtypes)
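If the dtypes or string contents do differ, one possible follow-up is to normalize both DataFrames before sorting. The sketch below is only an illustration under assumptions: the two small frames are invented stand-ins for the real data, normalize is a hypothetical helper, and casting to float64 is one choice for getting a single numeric dtype on both sides. It shows how a numeric column read as text sorts differently, and how stripping whitespace and converting the column makes the sorted results agree.

```python
import pandas as pd

# Invented stand-ins for the frames read from xlsx and csv.
# Column labels follow the question; the values are made up.
fields_df = pd.DataFrame({"register": ["R1", "R1", "R1"], 1: [10, 9, 2]})
fields_df1 = pd.DataFrame({"register": ["R1", "R1 ", "R1"], 1: ["10", "9", "2"]})

# Printed, the two frames look alike, but the dtypes of column 1 differ.
dtype_diff = fields_df.dtypes.astype(str) != fields_df1.dtypes.astype(str)
print(dtype_diff[dtype_diff])

sort_cols = ["register", 1]
order = [False, False]

# Text sorts lexicographically ("9" > "2" > "10"), numbers sort by value,
# so the sorted frames no longer match.
raw_equal = fields_df.sort_values(sort_cols, ascending=order).equals(
    fields_df1.sort_values(sort_cols, ascending=order)
)
print("sorted frames equal before normalizing:", raw_equal)


def normalize(df, numeric_cols):
    """Hypothetical helper: strip strings and convert the given columns to float64."""
    out = df.copy()
    for col in out.columns:
        if col in numeric_cols:
            # float64 is an assumption; pick whatever dtype suits the data.
            out[col] = pd.to_numeric(out[col]).astype("float64")
        elif pd.api.types.is_string_dtype(out[col]):
            out[col] = out[col].str.strip()
    return out


sorted_xlsx = normalize(fields_df, [1]).sort_values(sort_cols, ascending=order)
sorted_csv = normalize(fields_df1, [1]).sort_values(sort_cols, ascending=order)
print("sorted frames equal after normalizing:", sorted_xlsx.equals(sorted_csv))
```

With real data, the list of numeric columns would come from comparing the two dtypes outputs above rather than being hard-coded.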
Answered By - jezrael