Issue
I have a pandas DataFrame that looks like this:
Single Client Index | Nation | Status | Client Name |
---|---|---|---|
1 | UK | Active | abc |
1 | Sweden | Inactive | aab |
1 | Germany | Inactive | bba |
2 | Poland | Active | rge |
2 | Australia | Active | erg |
2 | Denmark | Inactive | asj |
3 | Norway | Inactive | fjw |
3 | Sweden | Inactive | hjs |
4 | Bahrain | Inactive | isg |
4 | Bahrain | Inactive | ejs |
5 | USA | Active | mt4 |
5 | USA | Active | whw |
And I need to turn it into something that looks like this:
Single Client Index | Nation/Status Mix |
---|---|
1 | One Active Nation, Rest Inactive |
2 | Multiple Active Nations |
3 | All Nations Inactive |
4 | All Clients Inactive, Same Nation |
5 | All Clients Active, Same Nation |
I have no idea how to approach this. I can pivot on the single client index, but how do I express the conditional "if exists" logic over the nation/status combinations within each client?
I am only a beginner with Python, but any help is greatly appreciated.
Solution
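See if this gets you started. The idea is to collect, for each client index, the set of nations seen with each status, and then classify the client from those two sets. The loop assumes the rows are already ordered by client index, and the labels are worded slightly differently from the ones in your example.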
import pandas as pd

data = [
    [1,'UK','Active','abc'],
    [1,'Sweden','Inactive','aab'],
    [1,'Germany','Inactive','bba'],
    [2,'Poland','Active','rge'],
    [2,'Australia','Active','erg'],
    [2,'Denmark','Inactive','asj'],
    [3,'Norway','Inactive','fjw'],
    [3,'Sweden','Inactive','hjs'],
    [4,'Bahrain','Inactive','isg'],
    [4,'Bahrain','Inactive','ejs'],
    [5,'USA','Active','mt4'],
    [5,'USA','Active','whw']
]
df = pd.DataFrame( data, columns=['Client Index','Nation','Status','Client Name'] )
print(df)

def classify(info):
    # info maps 'Active' and 'Inactive' to the set of nations seen with that status.
    if not info['Active']:
        # No active nation at all.
        if len(info['Inactive']) > 1:
            return 'All Nations Inactive'
        else:
            return 'One Nation, All Inactive'
    elif len(info['Active']) == 1:
        # Exactly one active nation.
        if not info['Inactive']:
            return 'One Nation, All Active'
        else:
            return 'One Active Nation, Rest Inactive'
    else:
        # Two or more active nations.
        if not info['Inactive']:
            return 'All Nations Active'
        else:
            return 'Multiple Nations Active'

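active = {'Active': set(), 'Inactive': set()}
last = 0
result = []
# Walk the rows in order; this relies on rows for the same client being adjacent.
for _index, row in df.iterrows():
    if row['Client Index'] != last:
        # A new client starts: classify the previous one (if any) and reset the sets.
        if last:
            result.append( [last, classify(active)] )
        last = row['Client Index']
        active = {'Active': set(), 'Inactive': set()}
    active[row['Status']].add( row['Nation'] )
# Classify the final client, which the loop never flushes.
result.append( [last, classify(active)] )
df1 = pd.DataFrame( result, columns=['Client Index','Status'])
print(df1)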
Output:
    Client Index     Nation    Status  Client Name
0              1         UK    Active          abc
1              1     Sweden  Inactive          aab
2              1    Germany  Inactive          bba
3              2     Poland    Active          rge
4              2  Australia    Active          erg
5              2    Denmark  Inactive          asj
6              3     Norway  Inactive          fjw
7              3     Sweden  Inactive          hjs
8              4    Bahrain  Inactive          isg
9              4    Bahrain  Inactive          ejs
10             5        USA    Active          mt4
11             5        USA    Active          whw
   Client Index                            Status
0             1  One Active Nation, Rest Inactive
1             2           Multiple Nations Active
2             3              All Nations Inactive
3             4          One Nation, All Inactive
4             5            One Nation, All Active
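If the rows are not guaranteed to be sorted by client index, the same classification could probably be done with `groupby` instead of a manual loop. This is only a sketch under that assumption, not part of the original answer: the helper name `classify_group` and the result name `summary` are made up for illustration, and the labels are copied from the code above.

```python
import pandas as pd

df = pd.DataFrame(
    [
        [1, "UK", "Active", "abc"],
        [1, "Sweden", "Inactive", "aab"],
        [1, "Germany", "Inactive", "bba"],
        [2, "Poland", "Active", "rge"],
        [2, "Australia", "Active", "erg"],
        [2, "Denmark", "Inactive", "asj"],
        [3, "Norway", "Inactive", "fjw"],
        [3, "Sweden", "Inactive", "hjs"],
        [4, "Bahrain", "Inactive", "isg"],
        [4, "Bahrain", "Inactive", "ejs"],
        [5, "USA", "Active", "mt4"],
        [5, "USA", "Active", "whw"],
    ],
    columns=["Client Index", "Nation", "Status", "Client Name"],
)


def classify_group(group: pd.DataFrame) -> str:
    """Label one client's rows (hypothetical helper, same rules as classify)."""
    # Distinct nations seen with each status for this client.
    active = set(group.loc[group["Status"] == "Active", "Nation"])
    inactive = set(group.loc[group["Status"] == "Inactive", "Nation"])
    if not active:
        return "All Nations Inactive" if len(inactive) > 1 else "One Nation, All Inactive"
    if len(active) == 1:
        return "One Active Nation, Rest Inactive" if inactive else "One Nation, All Active"
    return "Multiple Nations Active" if inactive else "All Nations Active"


# Iterating over the groupby yields (client index, sub-frame) pairs,
# sorted by client index regardless of the original row order.
labels = [
    (client, classify_group(group))
    for client, group in df.groupby("Client Index")
]
summary = pd.DataFrame(labels, columns=["Client Index", "Status"])
print(summary)
```

Grouping first means each client's rows are gathered no matter where they sit in the frame, so the "last client seen" bookkeeping is no longer needed.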
Answered By - Tim Roberts