Wednesday, September 7, 2022

[FIXED] How to parse correctly to date in python from specific ISO format

September 07, 2022 date, datetime, iso, pandas, python-3.x No comments

Issue

I connected to a table on a db where there are two columns with dates. I had no problem to parse the column with values formatted as this: 2017-11-03
But I don't find a way to parse the other column with dates formatted as this: 2017-10-03 05:06:52.840 +02:00

My attempts
If I parse a single value through the strptime method

dt.datetime.strptime("2017-12-14 22:16:24.037 +02:00", "%Y-%m-%d %H:%M:%S.%f %z")

I get the correct output
datetime.datetime(2017, 12, 14, 22, 16, 24, 37000, tzinfo=datetime.timezone(datetime.timedelta(seconds=7200)))

but if I try to use the same code format while parsing the table to the dataframe, the column dtype is an object:

Licenze_FromGY = pd.read_sql(query, cnxn, parse_dates={"EndDate":"%Y-%m-%d", "LastUpd":"%Y-%m-%d %H:%M:%S.%f %z"})


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 10 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Tenant     1000 non-null   int64         
 1   IdService  1000 non-null   object        
 2   Code       1000 non-null   object        
 3   Aggregate  1000 non-null   object        
 4   Bundle     991 non-null    object        
 5   Status     1000 non-null   object        
 6   Value      1000 non-null   int64         
 7   EndDate    258 non-null    datetime64[ns]
 8   Trial      1000 non-null   bool          
 9   LastUpd    1000 non-null   object

I also tried to change the code format either in the read_sql method or in the pd.to_datetime() method, but then all the values become NaT:

Licenze_FromGY["LastUpd"] = pd.to_datetime(Licenze_FromGY["LastUpd"], format="%Y-%m-%d %H:%M:%S.%fZ", errors="coerce") 

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 10 columns):
 #   Column     Non-Null Count  Dtype         
---  ------     --------------  -----         
 0   Tenant     1000 non-null   int64         
 1   IdService  1000 non-null   object        
 2   Code       1000 non-null   object        
 3   Aggregate  1000 non-null   object        
 4   Bundle     991 non-null    object        
 5   Status     1000 non-null   object        
 6   Value      1000 non-null   int64         
 7   EndDate    258 non-null    datetime64[ns]
 8   Trial      1000 non-null   bool          
 9   LastUpd    0 non-null      datetime64[ns]
dtypes: bool(1), datetime64[ns](2), int64(2), object(5)
memory usage: 71.4+ KB
None

Anyone can help?

Solution

pandas cannot handle mixed UTC offsets in one Series (column). Assuming you have data like this

import pandas as pd

df = pd.DataFrame({"datetime": ["2017-12-14 22:16:24.037 +02:00", 
                                "2018-08-14 22:16:24.037 +03:00"]})

if you just parse to datetime,

df["datetime"] = pd.to_datetime(df["datetime"])

df["datetime"]
0    2017-12-14 22:16:24.037000+02:00
1    2018-08-14 22:16:24.037000+03:00
Name: datetime, dtype: object

you get dtype object. The elements of the series are of the Python datetime.datetime dtype. That limits the datetime functionality, compared to the pandas datetime dtype.

You can get that e.g. by parsing to UTC:

df["datetime"] = pd.to_datetime(df["datetime"], utc=True)

df["datetime"]
0   2017-12-14 20:16:24.037000+00:00
1   2018-08-14 19:16:24.037000+00:00
Name: datetime, dtype: datetime64[ns, UTC]

You might set an appropriate time zone to re-create the UTC offset:

df["datetime"] = pd.to_datetime(df["datetime"], utc=True).dt.tz_convert("Europe/Athens")

df["datetime"]
0   2017-12-14 22:16:24.037000+02:00
1   2018-08-14 22:16:24.037000+03:00
Name: datetime, dtype: datetime64[ns, Europe/Athens]

Answered By - FObersteiner

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Wednesday, September 7, 2022

[FIXED] How to parse correctly to date in python from specific ISO format

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels