Issue
I have four 5x5 matrices that I want to compare to one "original" matrix. For each of them I calculated the error, which gives a new 5x5 matrix holding the percentage error of each entry.
e.g. Matrix_W:
      Ac    Bc    Cc    Dc    Ec
Ar    0.04  0.03  0.02  0.05  0.06
Br    0.01  0.02  0.04  0.02  0.01
Cr    0.03  0.05  0.09  0.08  0.01
Dr    0.07  0.09  0.05  0.03  0.01
Er    0.01  0.03  0.05  0.05  0.08
(r for row, c for column)
I stored four of these matrices in a pandas DataFrame:
Ac    Bc    Cc    Dc    Ec    rowValue  mType
0.04  0.03  0.02  0.05  0.06  Ar        W
0.01  0.02  0.04  0.02  0.01  Br        W
0.03  0.05  0.09  0.08  0.01  Cr        W
0.07  0.09  0.05  0.03  0.01  Dr        W
0.01  0.03  0.05  0.05  0.08  Er        W
0.04  0.04  0.03  0.01  0.02  Ar        X
0.09  0.07  0.05  0.04  0.01  Br        X
0.01  0.02  0.06  0.05  0.07  Cr        X
……
0.06  0.08  0.04  0.03  0.09  Er        Z
Now I want to create a 5x5 scatterplot matrix using seaborn to plot the error for each of the four matrices. It should look like a typical scatterplot matrix, just with 5 columns and 5 rows.
So cell 0,0 (upper left corner) of the scatterplot matrix should show a plot of the error in position Ac, Ar of the four matrices. The x and y axes of the scatterplot matrix are independent: x_vars = Ac up to Ec; y_vars = Ar up to Er. The hue should depend on the mType variable.
The following code did not lead to the desired output:
import seaborn as sns
g = sns.PairGrid(df, x_vars=df.columns[:-2], y_vars=df['rowValue'], hue=df['mType'])
g.map(sns.scatterplot)
The result I get is a 20x5 matrix which does not seem to have an independent x and y-axis. I am not sure if the issue is how I store the data in the DataFrame, or if there are other things that I have to do beforehand to achieve the desired result. Any help is much appreciated!
Solution
What you describe isn't possible as stated: a scatterplot needs two variables, one for the x axis and one for the y axis. With your data you would get a 5x5 grid of subplots, each containing 4 points (one for each mType), but which other variable determines the position of these 4 points within the subplot?
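To make the point concrete, here is a minimal sketch that reshapes a DataFrame laid out like yours into long form and counts the values per grid cell. The random numbers are placeholders for your real errors, and the names long_df, colValue, error and points_per_cell are my own choices, not something from your code.

```python
import numpy as np
import pandas as pd

# Placeholder data: random values stand in for the real error matrices.
rng = np.random.default_rng(42)
matrices = ['W', 'X', 'Y', 'Z']
row_labels = ['Ar', 'Br', 'Cr', 'Dr', 'Er']
col_labels = ['Ac', 'Bc', 'Cc', 'Dc', 'Ec']

n_rows = len(matrices) * len(row_labels)
df = pd.DataFrame(0.05 + 0.01 * rng.standard_normal((n_rows, len(col_labels))),
                  columns=col_labels)
df['rowValue'] = row_labels * len(matrices)
df['mType'] = np.repeat(matrices, len(row_labels))

# Long form: one line per (mType, rowValue, colValue) with a single error value.
long_df = df.melt(id_vars=['mType', 'rowValue'],
                  var_name='colValue', value_name='error')

# Count how many values fall into each cell of the 5x5 grid.
points_per_cell = long_df.groupby(['rowValue', 'colValue']).size()
print(points_per_cell.unique())
```

Each cell ends up with one number per mType and nothing else, so there is only one dimension to plot in every subplot.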
I suggest a different approach.
You can plot four 5x5 heatmaps, one for each mType. Each cell of a heatmap is colored according to its value.
Complete Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from string import ascii_uppercase

# user input
np.random.seed(42)
matrix_size = 5
matrices = ['W', 'X', 'Y', 'Z']

# dataframe creation (random values stand in for the real errors)
n_matrix = len(matrices)
row_labels = [f'{letter}r' for letter in ascii_uppercase[:matrix_size]]
col_labels = [f'{letter}c' for letter in ascii_uppercase[:matrix_size]]
df = pd.DataFrame({'mType': np.repeat(matrices, matrix_size),
                   'rowValue': n_matrix*row_labels})
for col in col_labels:
    df[col] = 0.05 + 0.01*np.random.randn(matrix_size*n_matrix)

# plotting: one heatmap per matrix, side by side
fig, ax = plt.subplots(1, n_matrix, figsize = (5*n_matrix, 5))
for i, mtype in enumerate(df['mType'].unique()):
    # draw the colorbar only next to the last heatmap
    cbar = (i == n_matrix - 1)
    sns.heatmap(ax = ax[i],
                data = df.loc[df['mType'] == mtype, col_labels],
                vmin = 0.02,
                vmax = 0.08,
                annot = True,
                cbar = cbar,
                cmap = 'Spectral_r')
    ax[i].tick_params(left = False, bottom = False)
    ax[i].set_yticklabels(df['rowValue'].unique())
    ax[i].set_title(mtype)
plt.show()
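The code above is generalized for any number of matrices and any matrix size.
In your case (4 matrices, size 5) it draws four annotated 5x5 heatmaps side by side on a common color scale, with a single colorbar next to the last one.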
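If you would rather keep the 5x5 grid layout, one possible variation (not a true scatterplot) is to use mType itself as the x position in each subplot. The following is only a sketch under that assumption: the data is random placeholder data, and the names long_df, colValue and error are chosen here for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Placeholder data: random values stand in for the real error matrices.
rng = np.random.default_rng(42)
matrices = ['W', 'X', 'Y', 'Z']
row_labels = ['Ar', 'Br', 'Cr', 'Dr', 'Er']
col_labels = ['Ac', 'Bc', 'Cc', 'Dc', 'Ec']

n_rows = len(matrices) * len(row_labels)
df = pd.DataFrame(0.05 + 0.01 * rng.standard_normal((n_rows, len(col_labels))),
                  columns=col_labels)
df['rowValue'] = row_labels * len(matrices)
df['mType'] = np.repeat(matrices, len(row_labels))

# Long form: one line per (mType, rowValue, colValue) with a single error value.
long_df = df.melt(id_vars=['mType', 'rowValue'],
                  var_name='colValue', value_name='error')

# One subplot per (rowValue, colValue) pair; inside each subplot the four
# matrices are spread along x and colored by mType.
g = sns.catplot(data=long_df, x='mType', y='error', hue='mType',
                row='rowValue', col='colValue',
                kind='strip', height=1.6, aspect=1.0)
g.set_titles('{row_name}, {col_name}')
# Call matplotlib.pyplot.show() or g.savefig(...) to view the figure.
```

All subplots share the y axis by default, so the errors stay comparable across cells, although the heatmaps are more compact for the same information.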
Answered By - Zephyr