Issue
I have a dataframe which I want to iterate to get a specific result.
Portion of the dataframe df (the columns are named guid1 and guid2):
DataSet
guid1 guid2
A865 OR4
A875 OR4
OR4 JN1397
JN1400 JN1401
JN1402 DEL131KS-
Unset AND1
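To make the example reproducible, the sample above can be rebuilt as a DataFrame. This is only a sketch: it assumes every value is a plain string, including `Unset` and the trailing hyphen in `DEL131KS-`, exactly as shown in the post.

```python
import pandas as pd

# Sample rows copied from the table above; all values are kept as plain strings.
df = pd.DataFrame(
    {
        "guid1": ["A865", "A875", "OR4", "JN1400", "JN1402", "Unset"],
        "guid2": ["OR4", "OR4", "JN1397", "JN1401", "DEL131KS-", "AND1"],
    }
)
print(df)
```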
Interpretation: both A865 and A875 are connected to OR4, and OR4 is then connected to JN1397 (it's a representation of a logic diagram).
My goal here is to iterate through the dataframe, check whether there are any connections, and build a string result (as follows) that lists the connected nodes:
A865, A875, OR4, JN1397
I already posted this question and it has been closed for lack of clarity, I really hope it's clear enough now because I really need some help here..
Solution
You could use networkx to handle your graph.
[Figure: the sample data drawn as a directed graph; the code below uses an undirected graph.]
You can create the graph using:
import networkx as nx
G = nx.from_pandas_edgelist(df, source='guid1', target='guid2')
Then keep the subgraph that contains your initial node:
start = df['guid1'].iloc[0] # 'A865'
keep = nx.node_connected_component(G, start)
# {'A865', 'A875', 'JN1397', 'OR4'}
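For intuition, the connected-component lookup can be sketched without networkx as a breadth-first search over an undirected adjacency map. This is an illustrative stand-in, not networkx's own implementation; the `connected_component` helper and the `edges` list are names made up for this example.

```python
from collections import defaultdict, deque

# Edges copied from the sample data, as (guid1, guid2) pairs.
edges = [
    ("A865", "OR4"),
    ("A875", "OR4"),
    ("OR4", "JN1397"),
    ("JN1400", "JN1401"),
    ("JN1402", "DEL131KS-"),
    ("Unset", "AND1"),
]


def connected_component(edges, start):
    """Return the set of nodes reachable from start, ignoring edge direction."""
    neighbours = defaultdict(set)
    for a, b in edges:
        # Register the edge in both directions so the graph is undirected.
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Visit only neighbours that have not been reached yet.
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen


component = connected_component(edges, "A865")
print(sorted(component))  # ['A865', 'A875', 'JN1397', 'OR4']
```

Starting from any other node of the same group, such as `'JN1397'`, returns the same set, because edge direction is ignored.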
If you really want the pseudo-path in original order of the DataFrame:
# Rank each node by its first appearance in guid1
key = {v: k for k, v in enumerate(df['guid1'].drop_duplicates())}
# {'A865': 0, 'A875': 1, 'OR4': 2, 'JN1400': 3, 'JN1402': 4, 'Unset': 5}
# Nodes that never appear in guid1 (here 'JN1397') get rank inf and sort last
path = ' -> '.join(sorted(keep, key=lambda x: key.get(x, float('inf'))))
# 'A865 -> A875 -> OR4 -> JN1397'
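The question asked for a comma-separated string, and several nodes of a larger component may appear only in guid2, where ranking by guid1 alone leaves their relative order undefined. A possible variant, sketched under the assumption that ranking by first appearance in guid1 and then in guid2 is acceptable, builds one string per connected component. The names `order` and `results` are introduced here for illustration only.

```python
import networkx as nx
import pandas as pd

# Sample data from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame(
    {
        "guid1": ["A865", "A875", "OR4", "JN1400", "JN1402", "Unset"],
        "guid2": ["OR4", "OR4", "JN1397", "JN1401", "DEL131KS-", "AND1"],
    }
)

G = nx.from_pandas_edgelist(df, source="guid1", target="guid2")

# Rank every node: first by position in guid1, then by position in guid2.
all_nodes = pd.concat([df["guid1"], df["guid2"]]).drop_duplicates()
order = {node: rank for rank, node in enumerate(all_nodes)}

# One comma-separated string per connected component.
results = [
    ", ".join(sorted(component, key=order.get))
    for component in nx.connected_components(G)
]

# The component containing the first node of the DataFrame.
start = df["guid1"].iloc[0]
first = ", ".join(sorted(nx.node_connected_component(G, start), key=order.get))
print(first)  # A865, A875, OR4, JN1397
```

Because every node gets its own rank, the sort no longer depends on set iteration order.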
Answered By - mozway