Friday, December 10, 2021

[FIXED] change multiple dataframe columns based on list of users

December 10, 2021 dataframe, numpy, pandas, python No comments

Issue

Let's say I have this dataframe in pandas:

data = {'user':['Apple', 'One', 'Tom'], 
        'col_1':[1, 2, 3],
        'col_2':[4, 5, 6],
        'col_3':[7, 8, 9]
}

df = pd.DataFrame(data)

Which outputs:

user	col_1	col_2	col_3
Apple	1	2	3
One	4	5	6
Tom	7	8	9

Now, here is the deal: I want to modify the value of columns col_1, col_2 and col_3 if the user is within a specific list of users. Otherwise, I have to keep the current value of the columns.

My proposal:

First, I created a dictionary, where the keys show the users that need modifications. Also, for each user, I created a list of the new values (col_1, col_2 and col_3, respectively) This way:

my_dict = {
    'one'  : ['col_1_value_one', 'col_2_value_one', 'col_3_value_one'],
    'two'  : ['col_1_value_two', 'col_2_value_two', 'col_3_value_two'],
    'apple': ['col_1_value_apple', 'col_2_value_apple', 'col_3_value_apple']
}

Then, I created a list based on the dict keys:

list_users = list(my_dict.keys())

Finally, I was able to verify whether the user was in fact within the list of keys or not by using np.where, containsand some regex (that is why I joined the list with "|" and used lower()). It works properly when keeping the "old" value if the user is not present in the list.

df['col_1'] = np.where(df['user'].str.lower().str.contains('|'.join(list_users), na=False, regex=True), 
                                        "<I need to replace this!>", 
                                        df['col_1'])
df['col_2'] = np.where(df['user'].str.lower().str.contains('|'.join(list_users), na=False, regex=True), 
                                        "<I need to replace this!>", 
                                        df['col_2'])
df['col_3'] = np.where(df['user'].str.lower().str.contains('|'.join(list_users), na=False, regex=True), 
                                        "<I need to replace this!>", 
                                        df['col_3'])

However, as you can see I couldn't find a way to replace the values for the ones present in the dictionary values if the user is in fact within the list. For example, if the user is "Apple", then the new values for col_1, col_2 and col_3 are:

col_1_value_apple
col_2_value_apple
col_3_value_apple

Expected output based on example:

user	col_1	col_2	col_3
Apple	col_1_value_apple	col_2_value_apple	col_3_value_apple
One	col_1_value_one	col_2_value_one	col_3_value_one
Tom	7	8	9

Here, both "Apple" and "One" users are modified, whilst "Tom" is not in the list, therefore keeping their original values.

I am sure this is not the optimal way to solve this, but any help would be highly appreciated!! Thanks!

Solution

This works:

for name, cols in my_dict.items():
    df.loc[df['user'].str.lower() == name, ['col_1', 'col_2', 'col_3']] = cols

Answered By - politinsa

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Friday, December 10, 2021

[FIXED] change multiple dataframe columns based on list of users

Issue

My proposal:

Solution

0 comments:

Post a Comment

Popular Posts

Labels