Issue
I have a dataset with a "tags" column in which each row is a list of tags. For example, the first entry looks something like this:
df['tags'][0]
result = "[' Leisure Trip ', ' Couple ', ' Duplex Double Room ', ' Stayed 6 nights ']"
I have been able to remove the trailing whitespace from all elements and only the leading whitespace from the first element (so I get something like the below).
['Leisure trip', ' Couple', ' Duplex Double Room', ' Stayed 6 nights']
Does anyone know how to remove the leading whitespace from all but the first element in these lists? They are not of uniform length or anything. Below is the code I have used to get the final result above:
clean_tags_list = []
for item in reviews['Tags']:
    string = item.replace("[", "")
    string2 = string.replace("'", "")
    string3 = string2.replace("]", "")
    string4 = string3.replace(",", "")
    string5 = string4.strip()
    string6 = string5.lstrip()
    #clean_tags_list.append(string4.split(" "))
    clean_tags_list.append(string6.split(" "))
clean_tags_list[0]
['Leisure trip', ' Couple', ' Duplex Double Room', ' Stayed 6 nights']
Solution
IIUC you want to apply strip for the first element and right strip for the other ones. Then, first convert your 'string list' to an actual list with ast.literal_eval and apply strip and rstrip:
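from ast import literal_eval
# Parse each string into a real list, then strip the first item fully and right-strip the rest.
# enumerate gives the position directly; x.index(item) would misfire on duplicate tags.
df.tags.agg(literal_eval).apply(lambda x: [item.strip() if i == 0 else item.rstrip() for i, item in enumerate(x)])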
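If the goal is instead to remove whitespace on both sides of every tag, which is another way to read the question, a minimal sketch could look like the following. The helper name clean_tags and the sample rows are made up for illustration (the first row is modelled on the example in the question), and it assumes each cell is a string holding a valid Python list literal.

```python
from ast import literal_eval

# Hypothetical sample rows; each one is a string that looks like a list.
raw_rows = [
    "[' Leisure trip ', ' Couple ', ' Duplex Double Room ', ' Stayed 6 nights ']",
    "[' Business trip ', ' Solo traveler ']",
]

def clean_tags(raw):
    """Parse a stringified list and strip whitespace from every element."""
    return [tag.strip() for tag in literal_eval(raw)]

# Lists of different lengths are handled the same way.
cleaned = [clean_tags(row) for row in raw_rows]
print(cleaned[0])
```

On a DataFrame, the same helper could be applied with df['tags'].apply(clean_tags), assuming the column is named as in the question.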
Answered By - Nuri Taş