Issue
I was wondering how to highlight diagonal elements of a pandas DataFrame using df.style methods.
I already found out how to do it for the main diagonal, but I can't manage to highlight the diagonal that starts from the second column. For example:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a':[1,2,3,4],'b':[1,3,5,7],'c':[1,4,7,10],'d':[1,5,9,11]})
def style_diag(data):
    diag_mask = pd.DataFrame("", index=data.index, columns=data.columns)
    min_axis = min(diag_mask.shape)
    # .iloc[range(n), range(n)] would select the whole n x n block,
    # so set each diagonal cell individually
    for i in range(min_axis):
        diag_mask.iat[i, i] = 'background-color: yellow'
    return diag_mask
df.style.apply(style_diag, axis=None)
This highlights the main diagonal (although I don't really understand the magic in this function).
And I'd like to have a yellow highlight across the diagonal elements 1, 4, 9.
How can I do that?
Solution
There are several options here, depending on the exact needs. One approach is to create a Boolean mask of the same shape as your DataFrame, with the diagonal at the desired offset filled with True, and use it to apply styles conditionally.
The approach and usage
def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
    # Create empty styles DataFrame
    style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)
    # Create a 2D False mask
    mask = np.zeros(df_.shape, dtype=bool)
    # Find diagonal indices at an offset and replace values with True
    rows, cols = np.indices(mask.shape)
    mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
    # Set diagonal styles using mask
    style_df[mask] = 'background-color:yellow'
    return style_df
This can be used like:
df.style.apply(style_diag, offset=1, axis=None)
Which highlights the cells holding 1, 4 and 9, the diagonal that starts at the second column.
Similarly this can be used without an offset to produce the original output:
df.style.apply(style_diag, axis=None)
Or even with negative offsets:
df.style.apply(style_diag, offset=-2, axis=None)
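Since the rendered tables are not visible here, the following is a minimal sketch for checking which cells each offset selects without rendering anything. It calls style_diag directly on the question's DataFrame; highlighted_values is a hypothetical helper introduced only for this illustration, not part of pandas or of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [1, 3, 5, 7],
                   'c': [1, 4, 7, 10], 'd': [1, 5, 9, 11]})


def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
    # Same logic as the function above
    style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)
    mask = np.zeros(df_.shape, dtype=bool)
    rows, cols = np.indices(mask.shape)
    mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
    style_df[mask] = 'background-color:yellow'
    return style_df


def highlighted_values(frame: pd.DataFrame, offset: int = 0) -> list:
    # Hypothetical helper: return the data values whose style string is non-empty
    styled = style_diag(frame, offset=offset).ne('').to_numpy()
    return frame.to_numpy()[styled].tolist()


print(highlighted_values(df, offset=1))   # [1, 4, 9]
print(highlighted_values(df))             # [1, 3, 7, 11]
print(highlighted_values(df, offset=-2))  # [3, 7]
```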
How it works
We start with an all-False mask of the same shape as our DataFrame:
mask = np.zeros(df_.shape, dtype=bool)
# array([[False, False, False, False],
#        [False, False, False, False],
#        [False, False, False, False],
#        [False, False, False, False]])
From here we need to find the diagonal indices in order to replace the values on the diagonal with True. There is a function, np.diag_indices_from, but unfortunately it does not directly support offset diagonals.
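As a small illustration of that limitation, the sketch below shows what np.diag_indices_from returns for a 4x4 mask. It has no offset parameter, and it also rejects arrays whose dimensions differ in length, so it would not cover non-square DataFrames either. The variable names here are chosen only for this example.

```python
import numpy as np

mask = np.zeros((4, 4), dtype=bool)

# Index arrays for the main diagonal only; there is no offset argument
main_rows, main_cols = np.diag_indices_from(mask)
print(main_rows, main_cols)  # [0 1 2 3] [0 1 2 3]

# Non-square input is rejected with a ValueError
try:
    np.diag_indices_from(np.zeros((3, 5), dtype=bool))
    square_only = False
except ValueError:
    square_only = True
print(square_only)  # True
```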
Let's instead grab the indices for this mask using np.indices:
rows, cols = np.indices(mask.shape)
# rows
# array([[0, 0, 0, 0],
#        [1, 1, 1, 1],
#        [2, 2, 2, 2],
#        [3, 3, 3, 3]])
# cols
# array([[0, 1, 2, 3],
#        [0, 1, 2, 3],
#        [0, 1, 2, 3],
#        [0, 1, 2, 3]])
We can now use the np.diag function, which natively supports offsets (k), on both rows and cols. (For this example, offset is 1.)
np.diag(rows, k=offset)
# array([0, 1, 2])
np.diag(cols, k=offset)
# array([1, 2, 3])
We can use the results from diag as indexers to update our mask:
mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
# array([[False,  True, False, False],
#        [False, False,  True, False],
#        [False, False, False,  True],
#        [False, False, False, False]])
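As a side note, the same mask can also be produced in one step with np.eye, which accepts a diagonal offset k and a dtype. This is an alternative to the np.indices / np.diag steps above, not what the answer's function uses; the sketch below only checks that the two constructions agree for a 4x4 shape and offset 1.

```python
import numpy as np

shape, offset = (4, 4), 1

# Mask built with the np.indices / np.diag steps described above
mask = np.zeros(shape, dtype=bool)
rows, cols = np.indices(shape)
mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True

# Alternative: a Boolean "identity" matrix shifted by k
eye_mask = np.eye(*shape, k=offset, dtype=bool)

print(np.array_equal(mask, eye_mask))  # True
```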
Now we have a well-formed mask that can easily be used to apply style strings:
style_df[mask] = 'background-color:yellow'
#    a                        b                        c                        d
# 0     background-color:yellow
# 1                              background-color:yellow
# 2                                                       background-color:yellow
# 3
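If you prefer to build the styles DataFrame in a single expression rather than creating an empty one and assigning into it, np.where can choose between the style string and an empty string based on the mask. The following is a sketch of that variation, assuming the question's DataFrame and an offset of 1; it is not part of the original function.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [1, 3, 5, 7],
                   'c': [1, 4, 7, 10], 'd': [1, 5, 9, 11]})
offset = 1

# Build the offset-diagonal mask as in the steps above
mask = np.zeros(df.shape, dtype=bool)
rows, cols = np.indices(mask.shape)
mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True

# Use the style string where the mask is True and '' everywhere else
style_df = pd.DataFrame(
    np.where(mask, 'background-color:yellow', ''),
    index=df.index,
    columns=df.columns,
)
```

The result has the same index and columns as df, which is what Styler.apply expects from a function applied with axis=None.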
Complete working example with imports and version numbers used
import numpy as np # v1.26.2
import pandas as pd # v2.1.4
df = pd.DataFrame({
    'a': [1, 2, 3, 4],
    'b': [1, 3, 5, 7],
    'c': [1, 4, 7, 10],
    'd': [1, 5, 9, 11]
})
def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
    style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)
    mask = np.zeros(df_.shape, dtype=bool)
    rows, cols = np.indices(mask.shape)
    mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
    style_df[mask] = 'background-color:yellow'
    return style_df
df.style.apply(style_diag, offset=1, axis=None)
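Because np.indices and np.diag work on any 2D shape, the same function should also handle DataFrames that are not square. The sketch below tries it on a made-up 3x5 frame (wide is a hypothetical example, not from the question) and also on an offset that lies outside the frame, in which case np.diag returns empty index arrays and nothing is highlighted.

```python
import numpy as np
import pandas as pd


def style_diag(df_: pd.DataFrame, offset: int = 0) -> pd.DataFrame:
    # Same function as in the complete example above
    style_df = pd.DataFrame('', index=df_.index, columns=df_.columns)
    mask = np.zeros(df_.shape, dtype=bool)
    rows, cols = np.indices(mask.shape)
    mask[np.diag(rows, k=offset), np.diag(cols, k=offset)] = True
    style_df[mask] = 'background-color:yellow'
    return style_df


# Hypothetical 3x5 frame holding the values 0..14 row by row
wide = pd.DataFrame(np.arange(15).reshape(3, 5), columns=list('abcde'))

# Offset 2 selects positions (0, 2), (1, 3) and (2, 4)
styles = style_diag(wide, offset=2)
highlighted = wide.to_numpy()[styles.ne('').to_numpy()].tolist()
print(highlighted)  # [2, 8, 14]

# An offset beyond the last column selects no cells at all
out_of_range = style_diag(wide, offset=10)
print(bool(out_of_range.ne('').to_numpy().any()))  # False
```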
Answered By - Henry Ecker