Issue
I have a dataframe with a Date column consisting of strings in the format shown below. I need to strip the end of each string so that I can convert it to a datetime object.
"20231101 05:00:00 America/New_York" "20231101 06:00:00 America/New_York"
I have tried these approaches unsuccessfully.
df['Date'] = df['Date'].replace('^.*\]\s*', '', regex=True)
df['Date'] = df['Date'].str.strip(' America/New_York')
df['Date'] = df['Date'].map(lambda x: x.rstrip(' America/NewYork'))
as well as a couple of others based on my searches. Is there an easy way to do this, or should I write a function that slices the string by taking the first 17 characters and assigns the result back to the df? Note that the string could also be of the form '20231101 05:00:00 America/Central'. Thanks for any and all assistance.
Solution
- If you want to remove a particular suffix, then I recommend str.removesuffix rather than str.strip.
- Notice that you sometimes write New_York with an underscore and sometimes NewYork without an underscore. If you ask to remove 'NewYork' then 'New_York' won't be removed.
- After the edit in your question, the suffixes all start with ' America' but differ afterwards; in this case you could use str.split(' America').str[0] to keep everything before ' America'.
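To make the first two points concrete, here is a small sketch on plain Python strings. It relies on the fact that strip and rstrip treat their argument as a set of characters to drop, not as a literal suffix, while removesuffix (Python 3.9+) only removes an exact match. The variable names are made up for illustration.

```python
s = "20231101 05:00:00 America/New_York"
t = "20231101 05:00:00 America/Central"

# rstrip drops trailing characters that belong to the given SET.
# Without '_' in the set, it stops as soon as it reaches the underscore.
no_underscore = s.rstrip(" America/NewYork")
print(no_underscore)   # 20231101 05:00:00 America/New_

# With '_' in the set it happens to work here, because the trailing '0'
# of the time is not one of the listed characters.
with_underscore = s.rstrip(" America/New_York")
print(with_underscore)  # 20231101 05:00:00

# A different zone ends in 'l', which is not in the set: nothing is removed.
other_zone = t.rstrip(" America/New_York")
print(other_zone)       # 20231101 05:00:00 America/Central

# removesuffix removes the exact suffix, and only if it is present.
exact = s.removesuffix(" America/New_York")
print(exact)            # 20231101 05:00:00
```

The .str methods of the same names on a pandas Series behave the same way on each element, so the rstrip and strip attempts only succeed when the trailing characters happen to fall inside the given set.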
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Date': ["20231101 05:00:00 America/New_York",
             "20231101 06:00:00 America/New_York",
             "20231101 07:00:00 America/Central"]
})

# df['Date'] = df['Date'].str.removesuffix(' America/New_York')
df['Date'] = df['Date'].str.split(' America').str[0]

print(df)
#       Name               Date
# 0    Alice  20231101 05:00:00
# 1      Bob  20231101 06:00:00
# 2  Charlie  20231101 07:00:00
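Since the stated goal is a datetime column, the trimmed strings can then be parsed. The following is a minimal sketch, assuming every value follows the 'YYYYMMDD HH:MM:SS' layout from the question; the explicit format string is my assumption based on those sample values. Note that the resulting timestamps are naive: the zone name is discarded, not applied.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ["20231101 05:00:00 America/New_York",
             "20231101 06:00:00 America/New_York",
             "20231101 07:00:00 America/Central"]
})

# Keep everything before ' America', as in the snippet above.
df['Date'] = df['Date'].str.split(' America').str[0]

# Parse with an explicit format (assumed from the sample values) so that
# a malformed row raises an error instead of being guessed at.
df['Date'] = pd.to_datetime(df['Date'], format='%Y%m%d %H:%M:%S')

print(df.dtypes)
```

If some rows might not match the layout, passing errors='coerce' to pd.to_datetime turns them into NaT instead of raising.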
Answered By - Stef