Issue
I have some table data (based on a pandas DataFrame) in the following form:
Index | Name | Region 1 | ... | Region n |
---|---|---|---|---|
Index data | Name data | Region 1 data | ... | Region n data |
Now I want to loop through the rows and, for each row, separate the data of column Name into a string variable and the data of columns Region i for all 1≤i≤n into some kind of array or list.
The way I know is as follows:
for index, row in data.iterrows():
    name = row.values[0]
    regions = row.filter(regex='^Region').values
    # body of loop
In the body of the for loop I never need the variable row again, only name and regions. So for me the code feels a little bit overloaded.
My question now is:
Is there some way to make this a little simpler, maybe a for loop of this kind:
for index, name, regions in data():
    # body of loop
Solution
First of all, when using pandas, it is generally better to avoid explicit for-loops where you can. Built-in pandas methods are usually faster, and there is one for most things you would otherwise do with a for loop.
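As a minimal sketch of what "using pandas methods" can look like, assume the loop body only needs a per-row total of the region columns. The sample frame, its values, and the `Total` column name are made up for illustration and are not part of the question.

```python
import pandas as pd

# Hypothetical data in the shape described in the question.
df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob"],
        "Region 1": [1, 4],
        "Region 2": [2, 5],
        "Region 3": [3, 6],
    }
)

# Select every column whose label starts with "Region" in one step.
region_columns = df.filter(regex="^Region")

# Row-wise sum over the region columns, computed without an explicit loop.
df["Total"] = region_columns.sum(axis=1)

print(df[["Name", "Total"]])
```

Whether a loop-free version like this exists depends on what the body of your loop actually does; when it does not, the apply() approach is a reasonable fallback.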
For your case, you can define what you want to do in a function and pass it to the apply() method of pandas data frames. For example:
def body_for_loop(row):
    name = row["Name"]
    regions = row.filter(regex='^Region').values
    # body of loop
Now when you want to use it, you will just call:
df.apply(body_for_loop, axis=1)
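To see the whole thing run end to end, here is a small self-contained sketch. The sample frame and the loop body (a summary string built from the name and the sum of its regions) are invented for illustration; replace them with your own logic.

```python
import pandas as pd

# Hypothetical sample data; the index values 10 and 11 are arbitrary.
df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob"],
        "Region 1": [1, 4],
        "Region 2": [2, 5],
    },
    index=[10, 11],
)


def body_for_loop(row):
    # With axis=1, each row arrives as a Series keyed by column label.
    name = row["Name"]
    regions = row.filter(regex="^Region").values
    # Placeholder body: build a short summary string for this row.
    return f"{name}: {int(regions.sum())}"


# The return values are collected into a Series that keeps the frame's index.
result = df.apply(body_for_loop, axis=1)
print(result)
```

Here `name` and `regions` are the only things the body touches, which is what the question asked for; the row itself stays hidden inside the function.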
Answered By - Ahmed Elashry