Issue
I ran into this problem when I was trying to share a pandas dataframe between several objects.
Basically, there are 3 different classes that maintain different aspects of a single dataframe:
- Class A creates the dataframe
- Class B adds and maintains feature B in the dataframe
- Class C adds some analysis and summarization information to the dataframe
So, overall, my classes look like this:
import pandas as pd

class AClass:
    def __init__(self):
        self.df_original = pd.DataFrame({'A': [1, 2, 3]})

    def feature(self):
        """
        Add a new row to the dataframe every time it is called.
        """
        i = len(self.df_original) + 1
        # (DataFrame.append was deprecated in pandas 1.4 and removed in 2.0)
        self.df_original = self.df_original.append(pd.DataFrame({'A': [i]}, index=[i]))

class BClass:
    def __init__(self, df_in):
        # self.df_featureB = df_in.copy(deep=False) doesn't work
        self.df_featureB = df_in

    def feature(self):
        # process column 'A', write the outcome to 'B'
        self.df_featureB['B'] = self.df_featureB['A'] + 1

class CClass:
    def __init__(self, df_in):
        # self.df_featureC = df_in.copy(deep=False) doesn't work
        self.df_featureC = df_in

    def feature(self):
        # process columns 'A' and 'B', write the outcome to 'C'
        self.df_featureC['C'] = self.df_featureC['A'] + self.df_featureC['B']

# create objects a, b and c from the classes defined above
# and share one dataframe between the 3 objects
a = AClass()
b = BClass(a.df_original)
c = CClass(a.df_original)
Here is what I want to do: call the lines below repeatedly as a function:
def do_step_process():
    a.feature()
    b.feature()
    c.feature()

def debug_print():
    print(a.df_original)
    print(b.df_featureB)
    print(c.df_featureC)

# call the step function 3 times, then print
do_step_process()
do_step_process()
do_step_process()
debug_print()
I expect all three printouts to be the same no matter how many times I call it, but after 3 calls I get the output below. The rows I have added in AClass are not visible from BClass and CClass:
   A
0  1
1  2
2  3
4  4
5  5
6  6
   A  B  C
0  1  2  3
1  2  3  5
2  3  4  7
   A  B  C
0  1  2  3
1  2  3  5
2  3  4  7
I have tried copy(deep=False), but it does not work either.
Solution
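I just figured it out.
The problem was that 'append' creates a new dataframe. After the assignment in AClass.feature, a.df_original refers to that new object, while b and c still hold the old one. In my case I can use loc to add new items to a.df_original instead.
So my solution is to change AClass as below: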
class AClass:
    def __init__(self):
        self.df_original = pd.DataFrame({'A': [1, 2, 3]})

    def feature(self):
        i = len(self.df_original) + 1
        self.df_original.loc[i, 'A'] = i
This works on my side: assigning through loc enlarges the existing dataframe in place instead of creating a new one, so b and c keep seeing the same object.
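The difference between rebinding and in-place growth can be checked directly with a small sketch. It is only an illustration, not part of the original classes: the names alias and grown are made up here, and pd.concat stands in for append (which newer pandas versions no longer provide), on the assumption that both return a new dataframe rather than modifying the old one.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})
alias = df  # a second reference to the same object, like b.df_featureB

# Growing via pd.concat returns a NEW dataframe; the shared one is untouched.
grown = pd.concat([df, pd.DataFrame({'A': [4]}, index=[4])])
rows_seen_after_concat = len(alias)  # alias still refers to the 3-row frame

# Growing via .loc enlarges the existing object, so every reference sees it.
df.loc[4, 'A'] = 4
rows_seen_after_loc = len(alias)

print(grown is alias)           # the concat result is a different object
print(rows_seen_after_concat)   # row count visible through alias before .loc
print(rows_seen_after_loc)      # row count visible through alias after .loc
```

If the rows really must be added with a call that returns a new dataframe, the other objects would need to be handed the new dataframe again after each step, since their old reference cannot follow the rebinding.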
Answered By - yunfei