Monday, June 6, 2022

[FIXED] Pandas - Repeat row if found in list and count

June 06, 2022 dataframe, pandas, python No comments

Issue

Need help in repeating rows if found in list.

If found value in list increment count
If more than one instance found then repeat the row by incrementing count

data = {'First_Name':['Tom', 'Nick', 'Daniel', 'Jack'],
        'oter_col':['other_value1', 'other_value2', 'other_value3', 'other_value4']}
df = pd.DataFrame(data)
df["repeation"] = 0

DF looks like

  First_Name      oter_col  count
0        Tom  other_value1      0
1       Nick  other_value2      0
2     Daniel  other_value3      0
3       Jack  other_value4      0

Input list:

Full_Name = ["Tom Cruise", "Tom Holland", "Tom Hardy", "Jack black", "Chris Hemsworth"]

In need output like:

  First_Name    Full_Name      oter_col  count
0        Tom   Tom Cruise  other_value1      1
1        Tom  Tom Holland  other_value1      2
2        Tom    Tom Hardy  other_value1      3
3       Jack   Jack black  other_value4      1
4       Nick               other_value2      1
5     Daniel               other_value3      0

Tried something like this to get first matching index so I can repeat the row but not sure how to do it.

for name in Full_Name:
    m = df.First_Name.str.contains(name.split()[0])
    first_index_found = m.idxmax() if m.any() else None
    if type(first_index_found) == int:
        print(first_index_found)

Solution

You can use:

# Create a regex pattern to extract First_Name from Full_Name
pat = fr"\b({'|'.join(df['First_Name'])})\b"

# Create a dataframe from Full_Name
df1 = pd.DataFrame({'Full_Name': Full_Name})
df1['First_Name'] = df1['Full_Name'].str.extract(pat)

# Merge them on First_Name column
out = df.merge(df1, on='First_Name', how='left')

# Count (choose one)
out['repeation'] = out.groupby('First_Name').cumcount().add(1)
# OR
out['repeation2'] = (out.dropna().groupby('First_Name').cumcount().add(1)
                        .reindex(out.index, fill_value=0))

Output:

>>> out
  First_Name      oter_col  repeation    Full_Name  repeation2
0        Tom  other_value1          1   Tom Cruise           1
1        Tom  other_value1          2  Tom Holland           2
2        Tom  other_value1          3    Tom Hardy           3
3       Nick  other_value2          1          NaN           0
4     Daniel  other_value3          1          NaN           0
5       Jack  other_value4          1   Jack black           1

Answered By - Corralien

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Monday, June 6, 2022

[FIXED] Pandas - Repeat row if found in list and count

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels