Issue
I have the following dataframe (nodes):
                     nodeType           subType
Supplier 1           Supplier           Supplier
Supplier 2           Supplier           Supplier
Supplier 3           Supplier           Supplier of another type
System Integrator    System Integrator  System Integrator
Availability Zone 1  Availability Zone  Server
Availability Zone 2  Availability Zone  Warehouse
Availability Zone 3  Availability Zone  Warehouse
Availability Zone 4  Availability Zone  Warehouse
I would like to add a new column that assigns a number to each "subType" within its nodeType, so that rows sharing the same nodeType and subType get the same number.
Expected result:
                     nodeType           subType                   enumeration
Supplier 1           Supplier           Supplier                  0
Supplier 2           Supplier           Supplier                  0
Supplier 3           Supplier           Supplier of another type  1
System Integrator    System Integrator  System Integrator         0
Availability Zone 1  Availability Zone  Server                    0
Availability Zone 2  Availability Zone  Warehouse                 1
Availability Zone 3  Availability Zone  Warehouse                 1
Availability Zone 4  Availability Zone  Warehouse                 1
So far, my best attempt has been
nodes["enumeration"] = nodes.groupby("nodeType").subType.cumcount()
but this doesn't yield what I am expecting.
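For reference, a minimal sketch of what that attempt produces. The frame is rebuilt by hand from the table above, and the variable name row_counter is only illustrative. cumcount() numbers the rows inside each nodeType group one after another, regardless of their subType values, which is why it does not match the expected column.

```python
import pandas as pd

# Rebuild the example frame from the question (node names as the index).
nodes = pd.DataFrame(
    {
        "nodeType": ["Supplier"] * 3 + ["System Integrator"] + ["Availability Zone"] * 4,
        "subType": [
            "Supplier", "Supplier", "Supplier of another type",
            "System Integrator",
            "Server", "Warehouse", "Warehouse", "Warehouse",
        ],
    },
    index=[
        "Supplier 1", "Supplier 2", "Supplier 3", "System Integrator",
        "Availability Zone 1", "Availability Zone 2",
        "Availability Zone 3", "Availability Zone 4",
    ],
)

# cumcount() is a running row counter per nodeType group; subType is ignored.
row_counter = nodes.groupby("nodeType")["subType"].cumcount()
print(row_counter.tolist())  # [0, 1, 2, 0, 0, 1, 2, 3]
```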
Thanks in advance
Solution
The following command gives the expected result:
nodes["nodeType_enum"] = nodes.groupby("nodeType", group_keys=False).apply(lambda x: x.groupby("subType").ngroup())
I had first tried this same command without setting "group_keys" to False. Once I set it, I got what I was expecting.
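As a rough, self-contained check, the command can be run against the example data. The frame below is rebuilt by hand from the question, and the column is named "enumeration" here only to match the expected result shown there. The sketch assumes the node names in the index are unique, since the assignment aligns on the index.

```python
import pandas as pd

# Rebuild the example frame from the question (node names as the index).
nodes = pd.DataFrame(
    {
        "nodeType": ["Supplier"] * 3 + ["System Integrator"] + ["Availability Zone"] * 4,
        "subType": [
            "Supplier", "Supplier", "Supplier of another type",
            "System Integrator",
            "Server", "Warehouse", "Warehouse", "Warehouse",
        ],
    },
    index=[
        "Supplier 1", "Supplier 2", "Supplier 3", "System Integrator",
        "Availability Zone 1", "Availability Zone 2",
        "Availability Zone 3", "Availability Zone 4",
    ],
)

# For each nodeType group, ngroup() labels every row with the number of the
# subType group it falls into. group_keys=False keeps the original row index,
# so the result lines up with the rows of nodes on assignment.
nodes["enumeration"] = nodes.groupby("nodeType", group_keys=False).apply(
    lambda g: g.groupby("subType").ngroup()
)
print(nodes["enumeration"].tolist())  # [0, 0, 1, 0, 0, 1, 1, 1]
```

Note that with the default sort=True, ngroup() numbers the subType groups in sorted key order rather than in order of first appearance. The two happen to coincide for this data; if order of appearance matters, passing sort=False to the inner groupby is one option worth trying.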
Answered By - Lorenzo Gutiérrez