Issue
I have a pandas DataFrame with a MultiIndex. It has 31 columns, and the rows carry an index level, File, naming the file each row comes from.
I want to divide the values in certain columns by a constant (they are in pixels, and I want to convert them to mm by dividing by a scale factor).
The data in each of these columns are floats, and px_to_mm is an int.
Instead of returning floats as I would expect, the division returns NaN for every entry of the column.
My code is as follows:
unique_animals = df.index.get_level_values('File').unique()
px_to_mm = 791
columns_in_px = ['mouth_x', 'mouth_y', 'stomach_centre_x',
                 'stomach_centre_y', 'aboral_organ_x',
                 'aboral_organ_y', 'tentacle_1_x',
                 'tentacle_1_y', 'tentacle_2_x', 'tentacle_2_y',
                 'cilia_1_x', 'cilia_1_y',
                 'cilia_2_x', 'cilia_2_y', 'X_diff_stomach',
                 'Y_diff_stomach']
for animal in unique_animals:
    for column in columns_in_px:
        df.loc[animal, column] = df.loc[animal, column] / px_to_mm
This is what the df index looks like:
MultiIndex([('CtenoEgg230801_', 0),
('CtenoEgg230801_', 1),
('CtenoEgg230801_', 2),
('CtenoEgg230801_', 3),
('CtenoEgg230801_', 4),
('CtenoEgg230801_', 5),
('CtenoEgg230801_', 6),
('CtenoEgg230801_', 7),
('CtenoEgg230801_', 8),
('CtenoEgg230801_', 9),
...
('CtenoEgg230802_', 66240),
('CtenoEgg230802_', 66241),
('CtenoEgg230802_', 66242),
('CtenoEgg230802_', 66243),
('CtenoEgg230802_', 66244),
('CtenoEgg230802_', 66245),
('CtenoEgg230802_', 66246),
('CtenoEgg230802_', 66247),
('CtenoEgg230802_', 66248),
('CtenoEgg230802_', 66249)],
names=['File', None], length=106632)
and a sample of the first few rows:
mouth_x mouth_y mouth_likelihood stomach_centre_x stomach_centre_y stomach_centre_likelihood aboral_organ_x aboral_organ_y aboral_organ_likelihood tentacle_1_x ... X_diff_stomach Y_diff_stomach Velocity_stomach Acceleration_stomach Theta_mouth_stomach Theta_Velocity_mouth_stomach Theta_Acceleration_mouth_stomach Theta_deg_mouth_stomach Height_Index Frame
File
0 231.626724 233.873352 0.999196 200.364288 191.369202 0.998929 168.946747 140.374954 0.996564 202.717392 ... NaN NaN NaN NaN 0.936630 NaN NaN 53.692184 86.915467 0
1 230.637405 234.197998 0.999158 200.186630 191.611725 0.998900 169.261520 140.385788 0.997088 203.156342 ... -0.177658 0.242523 NaN NaN 0.950049 0.013419 NaN 54.461426 87.094083 1
2 230.883316 233.928162 0.999064 200.056335 191.886490 0.999025 169.505844 139.894012 0.997208 205.199158 ... -0.130295 0.274765 0.304093 NaN 0.938103 -0.011946 -0.025365 53.776596 87.748322 2
3 229.841034 234.385590 0.999249 199.935638 191.638977 0.999073 170.242477 139.233582 0.995122 203.273712 ... -0.120697 -0.247513 0.275373 -0.028720 0.960341 0.022238 0.034185 55.051394 87.569965 3
4 229.045685 234.782135 0.999314 200.159286 191.692688 0.999104 169.480316 138.349838 0.994976 203.242462 ... 0.223648 0.053711 0.230007 -0.045366 0.980226 0.019885 -0.002353 56.191290 88.285329 4
Solution
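The outer loop is unnecessary here: it visits every value of File in turn, so between them its iterations cover all rows:
for animal in unique_animals:  # together, these cover all rows
    for column in columns_in_px:
        df.loc[animal, column] = df.loc[animal, column] / px_to_mm
So just divide the columns directly:
df[columns_in_px] /= px_to_mm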
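As a minimal sketch of the one-liner above, the following builds a small toy frame with the same kind of index and checks that dividing the columns directly yields floats with no NaN. The values and the three-column layout are invented for illustration and are not taken from the question. The NaN in the original loop most likely comes from index alignment: df.loc[animal, column] returns a Series indexed by the inner level only, and assigning that Series back makes pandas align it against the full (File, frame) index, where nothing matches. That explanation is an assumption about the cause, not something the answer states.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data: two files, three frames each.
# All values below are made up for illustration.
index = pd.MultiIndex.from_tuples(
    [
        ("CtenoEgg230801_", 0), ("CtenoEgg230801_", 1), ("CtenoEgg230801_", 2),
        ("CtenoEgg230802_", 0), ("CtenoEgg230802_", 1), ("CtenoEgg230802_", 2),
    ],
    names=["File", None],
)
df = pd.DataFrame(
    {
        "mouth_x": [791.0, 1582.0, 395.5, 79.1, 0.0, 2373.0],
        "mouth_y": [233.9, 234.2, 233.9, 234.4, 234.8, 235.0],
        "mouth_likelihood": [0.99, 0.98, 0.97, 0.99, 0.96, 0.95],
    },
    index=index,
)

px_to_mm = 791
columns_in_px = ["mouth_x", "mouth_y"]  # only the pixel-valued columns

# Reference result computed on plain NumPy arrays, where no index alignment applies.
expected = df[columns_in_px].to_numpy() / px_to_mm

# Vectorised division over all rows at once; no loop over files is needed.
df[columns_in_px] /= px_to_mm

print(df)
```

Columns that are not listed in columns_in_px (here mouth_likelihood) are left untouched, which is why the list only needs to name the pixel-valued columns.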
Answered By - Corralien