Issue
I am trying to do 2 things in Python:
- Select the names of specific columns using a regex
- Rename these selected columns using a list of names (the names are unfortunately stored in their own weird dataframe)
I am new to Python and pandas but did a bunch of googling and am getting the TypeError: Index does not support mutable operations error. Here's what I am doing.
import pandas as pd
import numpy as np
df=pd.DataFrame(data=np.array([
[1, 3, 3, 4, 5,9,5],
[1, 2, 4, 4, 5,8,4],
[1, 2, 3, 'a', 5,7,3],
[1, 2, 3, 4, 'e',6,2],
['f', 2, 3, 4, 5,6,1]
]),
columns=[
'a',
'car-b',
'car-c',
'car-d',
'car-e',
'car-f',
'car-g'])
#Select the NAMES of the columns that contain 'car' in them as I want to change these column names
names_to_change = df.columns[df.columns.str.contains("car")]
names_to_change
#Here is the dataset that has the names that I want to use to replace these
#This is just how the names are stored in the workflow
new_names=pd.DataFrame(data=np.array([
['new_1','new_3','new_5'],
['new_2','new_4','new_6']
]))
new_names
#My approach is to transform the new names into a list
new_names_list=pd.melt(new_names).iloc[:,1].tolist()
new_names_list
#Now I figure I would use .columns to do the replacement
#But this returns the mutability error
df.columns[df.columns.str.contains("car")]=new_names_list
#This also fails (the combined list has 12 names for 7 columns, so pandas raises a length-mismatch ValueError)
df.columns = df.columns[df.columns.str.contains("car")].tolist()+new_names_list
Traceback (most recent call last):
File "C:\Users\zsg876\AppData\Local\Temp/ipykernel_1340/261138782.py", line 44, in <module>
df.columns[df.columns.str.contains("car")]=new_names_list
File "C:\Users\zsg876\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 4585, in __setitem__
raise TypeError("Index does not support mutable operations")
TypeError: Index does not support mutable operations
I tried a bunch of different methods (this was no help: how to rename columns in pandas using a list) and haven't had much luck. I am coming over from R, where renaming columns was a lot simpler: you'd just pass a vector using names().
I take it the workflow is different here? Appreciate any suggestions!
UPDATE:
This seems to do the trick, but I am not sure why exactly. I figured replacing one list with another of equal length would work, but that does not seem to be the case. Can anyone educate me here?
col_rename_dict=dict(zip(names_to_change,new_names_list))
df.rename(columns=col_rename_dict, inplace=True)
Solution
You can use df.filter(like='car').columns to get the names of the columns containing car, and new_names.to_numpy().T.ravel() to efficiently convert the new_names dataframe into an array of the new names. Then use zip and dict to convert the two arrays into a dict whose keys are the old column names and whose values are the new column names. Finally, simply pass that dict to df.rename with axis=1:
old_names = df.filter(like='car').columns
new_names = new_names.to_numpy().T.ravel()
df = df.rename(dict(zip(old_names, new_names)), axis=1)
Output:
>>> df
   a  new_1  new_2  new_3  new_4  new_5  new_6
0  1      3      3      4      5      9      5
1  1      2      4      4      5      8      4
2  1      2      3      a      5      7      3
3  1      2      3      4      e      6      2
4  f      2      3      4      5      6      1
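As for why the original attempt failed: a pandas Index is immutable, so you cannot assign into a slice of df.columns, but you can replace df.columns as a whole. The sketch below is one way to get closer to the R-style names() workflow. It uses a small stand-in frame (the np.arange data and the variable names are placeholders for illustration, not the data from the question), copies the column labels into a NumPy array, edits the copy with the boolean mask, and assigns the result back.

```python
import numpy as np
import pandas as pd

# Stand-in frame with the same column layout as in the question (placeholder data).
df = pd.DataFrame(
    np.arange(14).reshape(2, 7),
    columns=['a', 'car-b', 'car-c', 'car-d', 'car-e', 'car-f', 'car-g'],
)
new_names_list = ['new_1', 'new_2', 'new_3', 'new_4', 'new_5', 'new_6']

# Boolean mask marking the columns whose name contains "car".
mask = df.columns.str.contains('car')

# Assigning into the Index itself is not allowed, because an Index is immutable.
raised = False
try:
    df.columns[mask] = new_names_list
except TypeError:
    raised = True

# A copy of the labels as a NumPy array is mutable, so edit that instead...
cols = df.columns.to_numpy(copy=True)
cols[mask] = new_names_list

# ...and replace the whole Index in one assignment.
df.columns = cols

print(list(df.columns))
# ['a', 'new_1', 'new_2', 'new_3', 'new_4', 'new_5', 'new_6']
```

This version replaces names by position, so the new names must be in the same order as the masked columns. df.rename with a dict maps by label instead, which is why it works regardless of column order. If you need a real regex rather than a substring match, df.filter(regex=...) should work in place of like=.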
Answered By - richardec