Issue
I have a function computeLeft which receives an index and returns four numbers. Something like this:
def computeLeft(i):
    return np.array([i*2, i*3, i*4, i*5])
# edited to correct it
Right now in my code I use it like this:
import numpy as np
import pandas as pd
results=["val1","val2","val3","val4"]
df[results] = np.vectorize(computeLeft, signature="()->(4)")(range(len(df)))
where df is some dataframe.
This obviously applies the function to all rows of df.
I want to apply this function to only some indexes of df.
So for example I have a list [2, 5, 7, 8, 10].
I want to compute computeLeft only for the indices in the list, so that the columns in results have values only for those rows (the rest being NaN).
How can I apply computeLeft selectively like this?
Solution
First you need to fix your function and its vectorization:
def computeLeft(i):
    return np.array([i*2, i*3, i*4, i*5])
computeLeftVectorized = np.vectorize(computeLeft, signature="()->(n)")
However, as mentioned by @jared in his answer, it is much faster to write the function as a pure NumPy function in the first place.
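A minimal sketch of what such a pure NumPy version might look like, assuming the function really is just the four multiplications shown in the question. The names compute_left_numpy and MULTIPLIERS are hypothetical and only serve the illustration; this is not @jared's code.

```python
import numpy as np

# Multipliers taken from the question's computeLeft (i*2, i*3, i*4, i*5)
MULTIPLIERS = np.array([2, 3, 4, 5])

def compute_left_numpy(indices):
    # Hypothetical replacement for computeLeft + np.vectorize:
    # reshape the indices to a column of shape (n, 1) and let
    # broadcasting against the (4,) multipliers produce an (n, 4) array.
    idx = np.asarray(indices).reshape(-1, 1)
    return idx * MULTIPLIERS

values = compute_left_numpy([2, 5, 7, 8, 10])
print(values.shape)  # (5, 4): one row per index, one column per result
```

The returned array has the same shape as the output of the vectorized function, so it should be usable wherever that output is used, without a Python-level loop over the indices.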
To achieve your goal, you can use loc[]:
results=["val1","val2","val3","val4"]
# Add new columns and fill the columns with NaN
df[results] = np.nan
indices_to_change = [2,5,7,8,10]
# change all rows in results with the provided indices
df.loc[indices_to_change, results] = computeLeftVectorized(indices_to_change)
This assumes that the index labels of your dataframe are the given integers, i.e. that the given indices are labels and not row positions in a dataframe with different index labels.
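If the given integers are row positions instead, one possible approach is to translate the positions into labels first. The sketch below assumes a dataframe with unique, non-integer index labels; the example dataframe, the column "a", and the name positions are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative dataframe whose index labels are letters, not integers
df = pd.DataFrame({"a": range(12)}, index=list("abcdefghijkl"))

results = ["val1", "val2", "val3", "val4"]
df[results] = np.nan  # new columns start out as NaN

positions = [2, 5, 7, 8, 10]  # row positions, not index labels
# Same arithmetic as computeLeft, computed for all positions at once
values = np.asarray(positions).reshape(-1, 1) * np.array([2, 3, 4, 5])

# df.index[positions] converts row positions into the matching labels,
# so loc[] can be used exactly as above (assumes the labels are unique)
df.loc[df.index[positions], results] = values
```

Rows at any other position keep NaN in the four result columns.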
Answered By - Oskar Hofmann