Issue
I currently have a data frame with three columns:
colc | fpc | lpc |
---|---|---|
1 | 3 | 5 |
4 | 7 | 8 |
and so on. fpc and lpc can be NaN, and colc is always an integer. I need to turn this into the form
a | b |
---|---|
1 | 3 |
1 | 4 |
1 | 5 |
4 | 7 |
4 | 8 |
as those are the row and column indices of a matrix.
I currently use the following to transform the original dataframe.
def unstack_pixels(pix, fpc, lpc, colc='col_2'):
    a = pix[[colc, fpc, lpc]].dropna(how='any')
    return np.column_stack([
        np.repeat(a[colc], a[lpc] - a[fpc] + 1),
        np.concatenate([
            np.arange(start, stop + 1) for start, stop
            in zip(a[fpc], a[lpc])
        ])
    ])
However, I am not sure whether this is the most efficient way to do it.
Solution
You can use repeat and numpy.r_, combined with zip and itertools.starmap, to generate an indexer:
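from itertools import starmap
import numpy as np
import pandas as pd

# if NaNs: drop them and cast back to int (repeat needs integer counts)
df = df.dropna().astype(int)
out = pd.DataFrame({'a': df['colc'].repeat(df['lpc']-df['fpc']+1),
                    'b': np.r_[tuple(starmap(slice, zip(df['fpc'], df['lpc']+1)))],
                    })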
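To see what the indexer does on its own, here is a minimal sketch with made-up start and stop lists (the names starts and stops are only for illustration). Each (start, stop + 1) pair becomes a slice, and numpy.r_ expands every slice into a range and concatenates the results:

```python
from itertools import starmap

import numpy as np

# Hypothetical inclusive ranges, mirroring fpc and lpc from the question.
starts = [3, 7]
stops = [5, 8]

# slice(3, 6) and slice(7, 9): the stop is shifted by one because slices
# exclude their end point.
slices = tuple(starmap(slice, zip(starts, (s + 1 for s in stops))))

# np.r_ turns each slice into np.arange(start, stop) and concatenates them.
b = np.r_[slices]
print(b)  # [3 4 5 7 8]
```

Note that this still builds one slice object per row in Python, so it is compact rather than loop-free.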
Or with repeat and groupby.cumcount:
# integer repeat counts; rows with NaN get 0 repeats and are dropped
n = df['lpc'].sub(df['fpc']).add(1).fillna(0).astype(int)
out = pd.DataFrame({'a': df['colc'].repeat(n),
                    'b': df['fpc'].repeat(n)
                                  .pipe(lambda s: s+s.groupby(level=0).cumcount()),
                    })
Output:
a b
0 1 3
0 1 4
0 1 5
1 4 7
1 4 8
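Since the question is about efficiency, one further option is to avoid any per-row Python objects by computing the offsets within each range with cumsum. This is only a sketch and not part of the original answer: the helper name expand_ranges and the sample frame are assumptions, and it assumes lpc >= fpc in every row that has no NaN.

```python
import numpy as np
import pandas as pd


def expand_ranges(df, colc="colc", fpc="fpc", lpc="lpc"):
    """Hypothetical helper: one output row per integer in [fpc, lpc]."""
    # Drop incomplete rows and restore integer dtype (NaN forces floats).
    a = df[[colc, fpc, lpc]].dropna().astype(int)
    n = (a[lpc] - a[fpc] + 1).to_numpy()  # length of each range

    # Position where each range starts in the output: 0, n0, n0 + n1, ...
    range_starts = np.cumsum(n) - n
    # Offset of every output row inside its own range: 0, 1, ..., n_i - 1
    offsets = np.arange(n.sum()) - np.repeat(range_starts, n)

    return pd.DataFrame({
        "a": np.repeat(a[colc].to_numpy(), n),
        "b": np.repeat(a[fpc].to_numpy(), n) + offsets,
    })


# Sample data; the third row has a NaN and is dropped.
df = pd.DataFrame({"colc": [1, 4, 6], "fpc": [3, 7, np.nan], "lpc": [5, 8, 9]})
out = expand_ranges(df)
print(out)
```

Unlike the outputs above, the result here has a fresh 0..n-1 index rather than the repeated original index. Whether it is actually faster depends on the number of rows and the range lengths, so it is worth timing on real data.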
Answered By - mozway