Issue
I have a dataframe of about 2 million rows. After splitting a merged column by a delimiter, it turns out the merged values did not all contain the same number of fields, so some rows split into fewer columns than others.
To remedy this, I'm trying to build a conditional new column C: where a condition is true, C should equal column A; where it is false, C should equal column B.
EDIT: I tried a suggested solution (code listed below), but it did not update any rows. Here is a better example of the dataset I'm working with:
  Scenario  meteorology  time of day
0      xxx           D7   Bus. Hours
1      yyy           F3     Offshift
2      zzz   Bus. Hours          NaN
3      aaa     Offshift          NaN
4      bbb     Offshift          NaN
The first two rows are well-formed: Scenario, meteorology, and time of day were split out of the merged column correctly. In the other rows, however, the merged column had no meteorology value, so the 'time of day' data landed in 'meteorology' and 'time of day' was left as NaN.
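To make the failure mode concrete, here is a minimal sketch of how a split can produce this shape. The "|" delimiter and the `merged` column name are assumptions for illustration; the post does not say what the real delimiter or source column was.

```python
import pandas as pd

# Hypothetical merged column; the "|" delimiter and the name "merged" are assumptions.
raw = pd.DataFrame({"merged": [
    "xxx|D7|Bus. Hours",
    "yyy|F3|Offshift",
    "zzz|Bus. Hours",
    "aaa|Offshift",
    "bbb|Offshift",
]})

# expand=True pads short rows on the right with missing values,
# so a two-field row leaves the last column empty.
df = raw["merged"].str.split("|", expand=True)
df.columns = ["Scenario", "met", "time"]

# Rows where the time-of-day value ended up in 'met'.
shifted = df["time"].isna() & df["met"].isin(["Bus. Hours", "Offshift"])
print(df[shifted])
```

The `shifted` mask picks out rows 2 to 4, which are the ones that need their values moved one column to the right.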
Here was the suggested approach (in the code, the meteorology and time of day columns are named met and time):
from dask import dataframe as dd
ddf = dd.from_pandas(df, npartitions=10)
ddf[(ddf.met=='Bus. Hours') | (ddf.met == 'Offshift')]['time'] = ddf['met']
ddf[(ddf.met=='Bus. Hours') | (ddf.met == 'Offshift')]['met'] = np.nan
This does not update the appropriate rows in 'time' or 'met'.
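A likely reason is chained indexing: `frame[mask]` builds a new object, so assigning into `frame[mask]['time']` writes to a temporary and never reaches the original. The sketch below shows this in plain pandas on the five sample rows, then one column-level alternative using `Series.mask`. dask's Series is documented to offer `mask` and `where` as well, but that is not exercised here, so treat the dask equivalence as an assumption.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Scenario": ["xxx", "yyy", "zzz", "aaa", "bbb"],
    "met": ["D7", "F3", "Bus. Hours", "Offshift", "Offshift"],
    "time": ["Bus. Hours", "Offshift", np.nan, np.nan, np.nan],
})
mask = df["met"].isin(["Bus. Hours", "Offshift"])

# Chained indexing: df[mask] is a new object, so this assignment lands on
# a temporary and df itself is left untouched (pandas warns about it).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df[mask]["time"] = df["met"]
still_missing = int(df["time"].isna().sum())

# Column-level alternative: both new columns are computed from the
# original values before either one is replaced.
fixed = df.assign(
    time=df["time"].mask(mask, df["met"]),  # take 'met' where mask is True
    met=df["met"].mask(mask),               # missing value where mask is True
)
print(fixed)
```

Because both arguments to `assign` are evaluated before any column is overwritten, the order in which they are listed does not matter here.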
I have also tried doing this in pandas:
df.loc[(df.met == 'Bus. Hours') | (df.met == 'Offshift'), 'time'] = df['met']
df.loc[(df.met == 'Bus. Hours') | (df.met == 'Offshift'), 'met'] = np.nan
This approach runs, but appears to hang indefinitely.
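The post does not establish why this hangs, so the following is only a sketch of a tidier `.loc` variant, not a confirmed fix. It computes the mask once and restricts the right-hand side to the same rows, so pandas only has to align the selected rows. It assumes a unique index, and the string literals have to match the data exactly, including the space in 'Bus. Hours'.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Scenario": ["xxx", "yyy", "zzz", "aaa", "bbb"],
    "met": ["D7", "F3", "Bus. Hours", "Offshift", "Offshift"],
    "time": ["Bus. Hours", "Offshift", np.nan, np.nan, np.nan],
})

# Compute the boolean mask once and reuse it for both assignments.
mask = df["met"].isin(["Bus. Hours", "Offshift"])

# Move the misplaced values first, then blank the source column.
df.loc[mask, "time"] = df.loc[mask, "met"]
df.loc[mask, "met"] = np.nan
print(df)
```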
Solution
The working solution was adapted from the comments, and ended up as follows:
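import numpy as np
cond = df.met.isin(['Bus. Hours', 'Offshift'])
# Copy the misplaced values into 'time' first, while 'met' still holds them
df['time'] = np.where(cond, df['met'], df['time'])
# Then blank out 'met' on those rows
df['met'] = np.where(cond, np.nan, df['met'])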
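For anyone who wants to check this on the sample rows, here is a self-contained sketch of the `np.where` approach. The order of the two assignments matters: 'time' has to be filled from 'met' before 'met' is blanked, otherwise 'time' just copies the NaN values that were written a line earlier.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Scenario": ["xxx", "yyy", "zzz", "aaa", "bbb"],
    "met": ["D7", "F3", "Bus. Hours", "Offshift", "Offshift"],
    "time": ["Bus. Hours", "Offshift", np.nan, np.nan, np.nan],
})

cond = df["met"].isin(["Bus. Hours", "Offshift"])

# Fill 'time' from 'met' while 'met' still holds the misplaced values...
df["time"] = np.where(cond, df["met"], df["time"])
# ...and only then blank 'met' on those rows.
df["met"] = np.where(cond, np.nan, df["met"])
print(df)
```

`np.where` evaluates the whole column at once and returns a plain array, so there is no index alignment and no chained indexing involved.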
I later came across another situation where this was needed, this time with a condition on a string that should not contain a given substring:
df1 = dataset.copy(deep=True)
df1['F_adj'] = 0
# Keep F where Type is exactly 'Delayed Ignition' or does not contain 'Delayed'
cond = (df1['Type'] == 'Delayed Ignition') | ~(df1['Type'].str.contains('Delayed'))
df1['F_adj'] = np.where(cond, df1['F'], 0)
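A runnable sketch of that second case is below. Only the column names 'Type' and 'F' come from the snippet; the rows are invented for illustration. Passing `na=False` to `str.contains` is an addition of mine so that a missing 'Type' counts as not containing the substring, rather than producing a missing value that `~` may not handle cleanly.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names 'Type' and 'F' come from the post.
dataset = pd.DataFrame({
    "Type": ["Delayed Ignition", "Delayed Explosion", "Jet Fire", "Immediate Ignition"],
    "F": [1.0, 2.0, 3.0, 4.0],
})
df1 = dataset.copy(deep=True)

# Keep F when Type is exactly 'Delayed Ignition' or has no 'Delayed' substring;
# every other 'Delayed ...' type gets 0.
cond = (df1["Type"] == "Delayed Ignition") | ~df1["Type"].str.contains("Delayed", na=False)
df1["F_adj"] = np.where(cond, df1["F"], 0)
print(df1)
```

In this sample only the 'Delayed Explosion' row is zeroed; the other three rows keep their 'F' value.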
Answered By - Michael James