Issue
This is my DataFrame:
import pandas as pd
df = pd.DataFrame(
    {
        'a': [101, 90, 11, 120, 1]
    }
)
And this is the output that I want. I want to create column y:
     a      y
0  101  101.0
1   90  101.0
2   11   90.0
3  120  120.0
4    1  120.0
Basically, each value in a is compared with the value before it, and the greater one is selected.
For example, for row 1, 90 is compared with 101. 101 is greater, so it is selected.
I have done it in this way:
df['x'] = df.a.shift(1)
df['y'] = df[['a', 'x']].max(axis=1)
Is there a cleaner or some kind of built-in way to do it?
Solution
You can use np.fmax to get the maxima without creating an additional column:
import numpy as np
df["y"] = np.fmax(df["a"], df["a"].shift(1))
This outputs:
     a      y
0  101  101.0
1   90  101.0
2   11   90.0
3  120  120.0
4    1  120.0
We use np.fmax() because it ignores the NaN that shifting df["a"] creates in the first row, so row 0 keeps its own value.
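To make the NaN handling concrete, here is a minimal sketch that compares np.fmax with np.maximum on the same data, and checks the result against the two-column approach from the question. The variable names (prev, y_fmax, y_maximum, y_question) are only illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [101, 90, 11, 120, 1]})

# shift(1) moves every value down one row, leaving NaN in row 0
prev = df["a"].shift(1)

# np.fmax returns the non-NaN operand when only one side is NaN
y_fmax = np.fmax(df["a"], prev)

# np.maximum propagates NaN instead, so row 0 would become NaN
y_maximum = np.maximum(df["a"], prev)

# the approach from the question: row-wise max over two columns
# (DataFrame.max skips NaN by default)
y_question = pd.concat([df["a"], prev.rename("x")], axis=1).max(axis=1)

print(y_fmax.tolist())
print(y_maximum.tolist())
print(y_question.tolist())
```

On this data, y_fmax and y_question should agree, while y_maximum should differ only in row 0, where it is NaN.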
Answered By - BrokenBenchmark