Issue
I have a pyspark dataframe with a column label:
label
0
1
2
3
0
I want to create a new column, new_label, in which every value that is not 3 is changed to 0, so that only two classes remain: 0 and 3.
I am pretty new to pyspark. How can I do this?
Solution
Assuming df is your dataframe:
from pyspark.sql import functions as F
df = df.withColumn("new_label", F.when(F.col("label") == 3, 3).otherwise(0))
Answered By - Steven