Issue
I am trying to match patterns between two strings. For example, I have
pattern_search = ['education four year']
string1 = 'It is mandatory to have at least of four years of professional education'
string2 = 'need to have education four years with professional degree'
I am looking for a way to get True when I check pattern_search against string1 and string2.
The regex library's match/search/findall don't help me here. string1 has all the required words, but not in order; string2 has them in order, but one word is plural ("years" instead of "year").
Currently I split the strings and, after preprocessing, compare each word in pattern_search with each word in string1 and string2. Is there a better way to find a match between the sentences?
Solution
Take a look at the difflib library, specifically the get_close_matches function, which returns the words that are "close enough" to a given word, so it handles words that do not match exactly. Be sure to adjust the threshold (cutoff=) accordingly.
from difflib import get_close_matches
from re import sub

pattern_search = 'education four year'
string1 = 'It is mandatory to have at least of four years of professional education'
string2 = 'need to have education four years with professional degree'
string3 = 'We have four years of military experience'

def match(string, pattern):
    pattern = pattern.lower().split()
    words = set(sub(r"[^a-z0-9 ]", "", string.lower()).split())  # Sanitize input
    return all(get_close_matches(word, words, cutoff=0.8) for word in pattern)

print(match(string1, pattern_search))  # True
print(match(string2, pattern_search))  # True
print(match(string3, pattern_search))  # False
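To see what the cutoff actually controls, here is a small sketch. get_close_matches ranks candidates by the similarity score from difflib.SequenceMatcher and keeps those scoring at least cutoff. The word pair below is taken from the example above; the 0.9 value is only there for comparison and is not a recommendation.

```python
from difflib import SequenceMatcher, get_close_matches

# Similarity score that get_close_matches compares against the cutoff:
# 2 * matching_chars / total_chars = 2 * 4 / (4 + 5)
score = SequenceMatcher(None, "year", "years").ratio()
print(round(score, 3))  # 0.889

# At cutoff=0.8 the plural form is accepted; at 0.9 it is rejected.
loose = get_close_matches("year", ["years"], cutoff=0.8)
strict = get_close_matches("year", ["years"], cutoff=0.9)
print(loose)   # ['years']
print(strict)  # []
```

A lower cutoff tolerates more variation (plurals, small typos) but also raises the chance of unrelated words matching, so it is worth checking it against a few of your own strings.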
If you want pattern_search to be a list of patterns, loop over the list and call the match function once per pattern.
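One possible way to do that is sketched below. The helper name match_any and the second pattern are made up for illustration, and it assumes a string should count as a hit when at least one pattern matches; swap any for all if every pattern must match.

```python
from difflib import get_close_matches
from re import sub

def match(string, pattern):
    # Same function as above: every pattern word needs a close match in the string.
    pattern = pattern.lower().split()
    words = set(sub(r"[^a-z0-9 ]", "", string.lower()).split())
    return all(get_close_matches(word, words, cutoff=0.8) for word in pattern)

def match_any(string, patterns):
    # Hypothetical helper: True if at least one pattern matches the string.
    return any(match(string, pattern) for pattern in patterns)

# The second pattern is an illustrative assumption, not from the question.
patterns = ['education four year', 'military experience']
string1 = 'It is mandatory to have at least of four years of professional education'
string3 = 'We have four years of military experience'

print(match_any(string1, patterns))                   # True
print(match_any(string3, patterns))                   # True
print(match_any('no relevant words here', patterns))  # False
```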
Answered By - Sunny Patel