Issue
I am trying to use a regex in Pandas to remove the text before specific bracketed tags in a column of comma-separated values.
From this -
colA
My Company Ltd [CS], address, nbc [LV], state [NP], pc [SS], country
Business Plc [CS], address, abc [LV], state [NP], code [SS], country
Work Harder Inc [CS], address, xyz[CS], state [NP], code [SS], country
Company Business People [CS], address, typode [SS], country, nlp [CS]
The text before [CS] and [LV], together with the bracketed tag itself, has to be removed.
Expected result -
colA
address, state [NP], pc [SS], country
address, state [NP], code [SS], country
address, state [NP], code [SS], country
address, typode [SS], country
Solution
You can use the regex [^,]*\[(CS|LV)\],? to match and remove the patterns:
df.colA.str.replace(r'[^,]*\[(CS|LV)\],?', '', regex=True).str.strip(', ')
0 address, state [NP], pc [SS], country
1 address, state [NP], code [SS], country
2 address, state [NP], code [SS], country
3 address, typode [SS], country
Name: colA, dtype: object
where [^,]* matches the text between commas, \[(CS|LV)\] matches [CS] or [LV], and ,? matches an optional trailing comma.
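To see which pieces the pattern actually consumes, here is a minimal sketch that applies the same regex with Python's standard re module to two of the sample rows. The names PATTERN, rows and clean are illustrative only and are not part of the original answer; re.sub is used here as a stand-in that should behave like Series.str.replace with regex=True for this pattern.

```python
import re

# Same pattern as in the answer, written as a raw string.
PATTERN = re.compile(r'[^,]*\[(CS|LV)\],?')

# Two rows copied from the question's sample data.
rows = [
    'My Company Ltd [CS], address, nbc [LV], state [NP], pc [SS], country',
    'Company Business People [CS], address, typode [SS], country, nlp [CS]',
]

def clean(text):
    # Remove every item tagged [CS] or [LV], then trim leftover commas and spaces.
    return PATTERN.sub('', text).strip(', ')

# The exact substrings the pattern removes from the first row.
matches = [m.group(0) for m in PATTERN.finditer(rows[0])]
cleaned = [clean(r) for r in rows]

print(matches)  # ['My Company Ltd [CS],', ' nbc [LV],']
print(cleaned)  # ['address, state [NP], pc [SS], country', 'address, typode [SS], country']
```

The second row shows why the final strip(', ') is needed: removing the last item leaves a trailing comma behind, because the optional ,? only consumes a comma that follows the tag.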
Answered By - Psidom