Issue
Like Left formula we Used in Excel same as I want to extract no. of character from Policy no. column based on Insurer column like.....
if insurer is HDFC then Extract only 10 character form the sting and if insurer is tata then Extract only 7 character form the sting. and if insurer other than HDFC & tata then leave it as blank.
also see hdfc insure comes with lot of different name.
How I will achieve this in python
Insurer | Policy no. | Expected OutPut |
---|---|---|
Hdfc | 4509242332 | 4509242332 |
Tata | tatadigitNational | tatadig |
Hdfc ergo | 09082323ab12sd | 09082323ab |
HDFC | nolanheroman | nolanherom |
Tata | 97543007356 | 9754300 |
Tata | pqrsequence2o202 | pqrsequ |
Tata | 987654321 | 9876543 |
HDFC health ergo | nolanheroman | nolanherom |
Digit | 1733choola | |
star | naahiHonaar |
Solution
You can use the str
accessor per group with help of a dictionary:
num = {'hdfc': 10, 'tata': 7}
# identify insurer from first word (can be changed to another method)
group = df['Insurer'].str.extract('(\w+)', expand=False).str.lower()
df['Expected OutPut'] = (df.groupby(group)['Policy no.']
.transform(lambda g: g.str[:num[g.name]]
if g.name in num else '')
)
output:
Insurer Policy no. Expected OutPut
0 Hdfc 4509242332 4509242332
1 Tata tatadigitNational tatadig
2 Hdfc ergo 09082323ab12sd 09082323ab
3 HDFC nolanheroman nolanherom
4 Tata 97543007356 9754300
5 Tata pqrsequence2o202 pqrsequ
6 Tata 987654321 9876543
7 HDFC health ergo nolanheroman nolanherom
8 Digit 1733choola
9 star naahiHonaar
update: enforce a minimal length
num = {'hdfc': 10, 'tata': 7}
group = df['Insurer'].str.extract('(\w+)', expand=False).str.lower()
df['Expected OutPut'] = (df.groupby(group)['Policy no.']
.transform(lambda g: g.mask(g.str.len()<num[g.name], '')
.str[:num[g.name]]
if g.name in num else '')
)
output:
Insurer Policy no. Expected OutPut
0 Hdfc 4509242332 4509242332
1 Tata tatadigitNational tatadig
2 Hdfc ergo 09082323ab12sd 09082323ab
3 HDFC nolanh # example here
4 Tata 97543007356 9754300
5 Tata pqrsequence2o202 pqrsequ
6 Tata 987654321 9876543
7 HDFC health ergo nolanheroman nolanherom
8 Digit 1733choola
9 star naahiHonaar
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.