Issue
I use the Plotly code below to create a Sankey chart.
import plotly.graph_objects as go
import pandas as pd

dataset = pd.read_csv('/Users/i073341/Library/CloudStorage/OneDrive-SAPSE/Data Analysis/result/Opportunity/20220121-170612/omp_cleanSankey.csv')

# Collect every label, then drop duplicates while keeping order:
# a label that is both a source and a target must map to a single node.
labelListTemp1 = list(set(dataset.source.values))
labelListTemp2 = list(set(dataset.target.values))
labelList = labelListTemp1 + labelListTemp2
sankey_node = list(dict.fromkeys(labelList))

fig = go.Figure(data=[go.Sankey(
    node=dict(pad=15, thickness=20, line=dict(color="black", width=0.5),
              label=sankey_node, color="blue"),
    link=dict(
        # Links refer to nodes by their index in the label list.
        source=dataset.source.apply(lambda x: sankey_node.index(x)),
        target=dataset.target.apply(lambda x: sankey_node.index(x)),
        value=dataset.value,
    ),
)])
fig.update_layout(autosize=False, width=3000, height=1000, hovermode='x',
                  title="performance Goal user behavior monitor", font=dict(size=16, color='black'))
fig.write_html('/Users/i073341/Library/CloudStorage/OneDrive-SAPSE/test/perfUXRGoal.html', auto_open=False)
The chart created by Plotly looks like the one below.
I found that Matplotlib can create Sankey charts as well, but the style it produces looks like the one below.
Is it possible to make Matplotlib create a Sankey chart styled like the one Plotly produces?
Solution
For lack of good alternatives, I bit the bullet and tried my hand at creating my own Sankey plot that looks more like Plotly and SankeyMATIC. It uses purely Matplotlib and produces flows like the one below. I don't see the Plotly image in your post though, so I don't know exactly what you want it to look like.
Full code at bottom. You can install this with python -m pip install sankeyflow. The basic workflow is simply
from sankeyflow import Sankey
import matplotlib.pyplot as plt

plt.figure()
s = Sankey(flows=flows, nodes=nodes)
s.draw()
plt.show()
Note that pySankey also uses Matplotlib, but it only allows a single level of bijective flow. SankeyFlow is much more flexible: it supports multiple levels and the flows don't have to be bijective, but it requires you to define the nodes.
from sankeyflow import Sankey
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 10), dpi=144)

# One inner list per level, left to right; each node is (name, value).
nodes = [
    [('Product', 20779), ('Service\nand other', 30949)],
    [('Total revenue', 51728)],
    [('Gross margin', 34768), ('Cost of revenue', 16960)],
    [('Operating income', 22247), ('Other income, net', 268), ('Research and\ndevelopment', 5758), ('Sales and marketing', 5379), ('General and\nadministrative', 1384)],
    [('Income before\nincome taxes', 22515)],
    [('Net income', 18765), ('Provision for\nincome taxes', 3750)]
]

# Each flow is (source, target, value), optionally followed by a dict of options.
flows = [
    ('Product', 'Total revenue', 20779, {'flow_color_mode': 'source'}),
    ('Service\nand other', 'Total revenue', 30949, {'flow_color_mode': 'source'}),
    ('Total revenue', 'Gross margin', 34768),
    ('Total revenue', 'Cost of revenue', 16960),
    ('Gross margin', 'Operating income', 22247),
    ('Gross margin', 'Research and\ndevelopment', 5758),
    ('Gross margin', 'Sales and marketing', 5379),
    ('Gross margin', 'General and\nadministrative', 1384),
    ('Operating income', 'Income before\nincome taxes', 22247),
    ('Other income, net', 'Income before\nincome taxes', 268, {'flow_color_mode': 'source'}),
    ('Income before\nincome taxes', 'Net income', 18765),
    ('Income before\nincome taxes', 'Provision for\nincome taxes', 3750),
]

s = Sankey(
    flows=flows,
    nodes=nodes,
)
s.draw()
plt.show()
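Since your data is a source/target/value table rather than hand-written lists, here is a rough sketch of how the flows and nodes could be derived from such rows. This is not part of SankeyFlow: build_flows_and_nodes is a hypothetical helper name, and I'm assuming the flows contain no cycles, that each node sits one level past its deepest source (longest-path layering), and that a node's value is the larger of its total inflow and total outflow. It only produces the (name, value) per-level node lists and (source, target, value) flows used above.

```python
from collections import defaultdict


def build_flows_and_nodes(rows):
    """Hypothetical helper: turn (source, target, value) rows into
    a flat `flows` list and a per-level `nodes` list."""
    flows = [(str(s), str(t), float(v)) for s, t, v in rows]

    inflow = defaultdict(float)
    outflow = defaultdict(float)
    level = {}  # node name -> level index, in first-seen order
    for s, t, v in flows:
        outflow[s] += v
        inflow[t] += v
        level.setdefault(s, 0)
        level.setdefault(t, 0)

    # Assumption: longest-path layering. Push every target at least one
    # level past its source until nothing moves. An acyclic graph settles
    # within len(level) passes, so one extra pass detects a cycle.
    changed = False
    for _ in range(len(level) + 1):
        changed = False
        for s, t, _v in flows:
            if level[t] < level[s] + 1:
                level[t] = level[s] + 1
                changed = True
        if not changed:
            break
    if changed:
        raise ValueError("flows contain a cycle; levels are undefined")

    # Assumption: node size is the larger of total inflow and total outflow.
    n_levels = max(level.values(), default=-1) + 1
    nodes = [[] for _ in range(n_levels)]
    for name, lvl in level.items():
        nodes[lvl].append((name, max(inflow[name], outflow[name])))
    return flows, nodes


example_flows, example_nodes = build_flows_and_nodes([
    ("A", "C", 3),
    ("B", "C", 2),
    ("C", "D", 5),
])
```

With the DataFrame from the question, the rows could come from dataset[['source', 'target', 'value']].itertuples(index=False), and the two results can then be passed as Sankey(flows=..., nodes=...) as in the code above. Whether the default layering looks good for your data is something you would have to check by eye.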
Answered By - rileyx