Issue
How can I extract only the dates from a text file using regex in Python 3?
Below is my current code:
import datetime
from datetime import date
import re
s = "birthday on 20/12/2018 and wedding aniversry on 04/01/1997 and dob is on 09/07/1897"
match = re.search(r'\d{2}/\d{2}/\d{4}', s)
date = datetime.datetime.strptime(match.group(), '%Y-%m-%d').date()
print (date)
Expected Output is
20/12/2018
04/01/1997
09/07/1897
Solution
Your format string is wrong. You pass '%Y-%m-%d', but the dates in your text, such as 20/12/2018 in "birthday on 20/12/2018", are written as dd/mm/yyyy, so the format should be '%d/%m/%Y'.
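To see why the format string matters, here is a minimal sketch using only the standard library. The sample date comes from the question; the helper name try_parse is made up for illustration and is not part of any library.

```python
from datetime import datetime

def try_parse(text, fmt):
    """Return a date if `text` matches `fmt`, otherwise None (hypothetical helper)."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        # strptime raises ValueError when the text does not match the format
        return None

sample = "20/12/2018"
wrong = try_parse(sample, "%Y-%m-%d")  # format expects yyyy-mm-dd, so this fails
right = try_parse(sample, "%d/%m/%Y")  # format matches dd/mm/yyyy
print(wrong)
print(right)
```

The first call returns None because the text does not match the format, which is the same mismatch that raises ValueError in the original code.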
Change this:
date = datetime.datetime.strptime(match.group(), '%Y-%m-%d').date()
With this:
date = datetime.datetime.strptime(match.group(), '%d/%m/%Y').date()
Your Fix:
import datetime
import re
s = "birthday on 20/12/2018"
# re.search returns only the first match in the string
match = re.search(r'\d{2}/\d{2}/\d{4}', s)
date = datetime.datetime.strptime(match.group(), '%d/%m/%Y').date()
print(date)  # 2018-12-20
But:
Why go to all that trouble when there are easier, more elegant ways?
Using dateutil.parser from the third-party python-dateutil package (imported here as dparser):
import dateutil.parser as dparser
dt_1 = "birthday on 20/12/2018"
print("Date: {}".format(dparser.parse(dt_1,fuzzy=True).date()))
OUTPUT:
Date: 2018-12-20
EDIT:
Now that your edited question contains multiple dates, you can extract all of them with re.findall:
import re
s = "birthday on 20/12/2018 and wedding aniversry on 04/01/1997 and dob is on 09/07/1897"
pattern = r'\d{2}/\d{2}/\d{4}'
print("\n".join(re.findall(pattern,s)))
OUTPUT:
20/12/2018
04/01/1997
09/07/1897
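The pattern alone only checks the shape of the text, so something like 31/02/2018 would also match. As a sketch, assuming you want real dates only, you could pass each match through strptime and drop the ones that fail. The function name extract_dates, the \b word boundaries, and the extra invalid date in the sample string are additions for illustration, not part of the original question.

```python
import re
from datetime import datetime

# \b keeps the pattern from matching inside longer digit runs (an added assumption)
DATE_PATTERN = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")

def extract_dates(text):
    """Return datetime.date objects for every valid dd/mm/yyyy date in `text`."""
    dates = []
    for candidate in DATE_PATTERN.findall(text):
        try:
            dates.append(datetime.strptime(candidate, "%d/%m/%Y").date())
        except ValueError:
            # Has the right shape but is not a real date, e.g. 31/02/2018
            continue
    return dates

s = ("birthday on 20/12/2018 and wedding aniversry on 04/01/1997 "
     "and dob is on 09/07/1897, not 31/02/2018")
for d in extract_dates(s):
    # Format back to dd/mm/yyyy to match the expected output in the question
    print(d.strftime("%d/%m/%Y"))
```

Working with date objects instead of raw strings also lets you sort or compare the results afterwards.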
OR
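Using dateutil:
from dateutil.parser import parse
# s is the string defined in the regex example above
for token in s.split():
    try:
        # dayfirst=True reads 04/01/1997 as 4 January, matching dd/mm/yyyy
        print(parse(token, dayfirst=True))
    except ValueError:
        # token is not a date, so skip it
        pass
OUTPUT:
2018-12-20 00:00:00
1997-01-04 00:00:00
1897-07-09 00:00:00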
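The question mentions a text file, while the examples above work on a string. A minimal sketch for the file case, assuming a UTF-8 text file, could read it line by line and apply the same pattern. The function name dates_in_file and the file name notes.txt are hypothetical; the demo writes a temporary file so the snippet runs on its own.

```python
import re
import tempfile
from pathlib import Path

DATE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")

def dates_in_file(path):
    """Yield (line_number, date_string) for each dd/mm/yyyy match in the file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            for match in DATE_PATTERN.finditer(line):
                yield line_number, match.group()

# Demo: write a small sample file, then scan it
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.txt"  # hypothetical file name
    sample.write_text(
        "birthday on 20/12/2018\n"
        "wedding on 04/01/1997 and dob is on 09/07/1897\n",
        encoding="utf-8",
    )
    found = list(dates_in_file(sample))

for line_number, text in found:
    print(line_number, text)
```

Reading line by line keeps memory use low for large files and gives you the line number of each match for free.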
Answered By - DirtyBit