Issue
here is the sample data for my question
pid | date | task | language |
---|---|---|---|
8522 | 17-09-2022 | listen | english |
7851 | 17-09-2022 | read | hindi |
7546 | 17-09-2022 | telugu | |
3584 | write | ||
8522 | 18-09-2022 | read | |
7851 | 19-09-2022 | hindi | |
8522 | speak | hindi | |
7546 | 19-09-2022 | speak | english |
3584 | write | tamil | |
8522 | 25-09-2022 |
i tried to code and got the output
data=df.groupby('pid')[['date','task','language']].transform(lambda x: '|'.join(x))
and i got this output
and i want the output to be with no empty/null values to be added so my desired final output would look like this but i have not idea how to filter or remove the values and get the output
any answer or suggestion is helpful
Solution
You can use pandas.Series.dropna
in the lambda expression before joining.
Here is what works for me:
out = (
df
.groupby('pid')[['date','task','language']]
.agg(lambda x: '|'.join(x
.replace(r'^\s*$', np.nan, regex=True)
.dropna()
.astype(str)
)
)
)
print(out)
date task language
pid
3584 write|write tamil
7546 17-09-2022|19-09-2022 speak telugu|english
7851 17-09-2022|19-09-2022 read hindi|hindi
8522 17-09-2022|18-09-2022|25-09-2022 listen|read|speak english|hindi
Answered By - Rabinzel
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.