Issue
I have the following code, plotting a function on a grid, where the function happens to have a very large integer value:
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter, FuncFormatter
import numpy as np # thanks to user @simon for pointing out I had forgotten this
p = 13
counts = [[0 for x in range(p)] for y in range(p)]
counts[0][0] = 1000000000
unique_counts = np.unique(counts)
plt.imshow(counts, cmap='viridis', origin='lower', extent=[0, p-1, 0, p-1])
cbar = plt.colorbar(ticks=unique_counts, format=ScalarFormatter(useOffset=False))
cbar.ax.yaxis.set_major_formatter(FuncFormatter(lambda x, _: format(int(x), ','))) # Format tick labels with commas
plt.show()
Running this in Google Colab works perfectly fine and gives a nice plot.
However, if I bump it up to, say,
counts[0][0] = 1000000000000000000000
then I get the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-12-0ec4c2551685> in <cell line: 8>()
6 counts[0][0] = 100000000000000000000
7 unique_counts = np.unique(counts)
----> 8 plt.imshow(counts, cmap='viridis', origin='lower', extent=[0, p-1, 0, p-1])
9 cbar = plt.colorbar(ticks=unique_counts, format=ScalarFormatter(useOffset=False))
10 cbar.ax.yaxis.set_major_formatter(FuncFormatter(lambda x, _: format(int(x), ','))) # Format tick labels with commas
3 frames
/usr/local/lib/python3.10/dist-packages/matplotlib/image.py in set_data(self, A)
699 if (self._A.dtype != np.uint8 and
700 not np.can_cast(self._A.dtype, float, "same_kind")):
--> 701 raise TypeError("Image data of dtype {} cannot be converted to "
702 "float".format(self._A.dtype))
703
TypeError: Image data of dtype object cannot be converted to float
I would like very much to be able to plot functions that take very large integer values with exact precision (so rounding/using floats would not be good). Is this possible?
EDIT: someone was understandably confused by this seemingly useless level of precision in a plot; I have clarified that what actually matters to me is being able to read the exact value off the colorbar labels (for number theory applications, I need an exact count of the number of points on some varieties mod p). So I'm OK with the plot itself being slightly off, but I do really want the colorbar labels to be exact.
Solution
New answer
(For my original answer, see the section below.)
Based on the question's update, which made clear that the essential information to retain is the precise integer values on the colorbar tick labels, here is my updated answer. Its crucial idea is:
- For the colorbar tick positions and image data, use floating-point values (these are the only ones Matplotlib can deal with internally; see the original answer below).
- For the colorbar tick labels, use the given integer values: provide them to Matplotlib as a list of already formatted strings (following this approach).
Here is the corresponding code:
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
import numpy as np
p = 13
counts = [[0 for x in range(p)] for y in range(p)]
# Provide some huge ints for demonstration purposes
counts[ 0][ 0] = 100000000000000000008
counts[ 0][-1] = counts[ 0][ 0] // 2
counts[-1][ 0] = counts[ 0][-1] // 2
counts[-1][-1] = counts[-1][ 0] // 2
# Get the unique values (without Numpy, just to be sure)
unique_counts = sorted(set(val for row in counts for val in row))
# Provide the image and tick *positions* as float values to avoid casting error
counts_img = np.array(counts, dtype=float)
counts_ticks = [float(val) for val in unique_counts]
# Provide the tick *labels* as strings generated from the original integer vals
counts_ticks_labels = [f'{val:,}' for val in unique_counts]
# Display everything
plt.imshow(counts_img, cmap='viridis', origin='lower', extent=[0, p-1, 0, p-1])
cbar = plt.colorbar(format=ScalarFormatter(useOffset=False))
cbar.set_ticks(ticks=counts_ticks, labels=counts_ticks_labels)
plt.show()
In older versions of Matplotlib, you might need to adjust the last three lines as follows:
cbar = plt.colorbar(ticks=counts_ticks, format=ScalarFormatter(useOffset=False))
cbar.ax.set_yticklabels(counts_ticks_labels)
plt.show()
And here is the resulting plot:
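As a side note on why the labels are built from the original Python ints rather than from the float tick positions: a double has only 53 bits of mantissa, so huge integers get rounded during the float conversion, while a string formatted from the Python int keeps every digit. A quick sketch, using the demo value from the code above:

```python
val = 100000000000000000008  # the demo value from the code above

# The float tick position has lost the trailing 8: doubles near 1e20 are
# spaced 16384 apart, so the conversion rounds to the nearest representable one
print(int(float(val)) == val)  # False: precision lost in the float conversion

# ... while the label string, built from the Python int, stays exact
print(f'{val:,}')  # 100,000,000,000,000,000,008
```

This is why the plot colors (computed from floats) may be slightly off, while the colorbar labels remain exact.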
Original answer
Short answer
I currently do not see a way to pass huge integers exactly to imshow(), due to the inner workings of Matplotlib relying on Numpy arrays for holding the image data. If you can live with approximate values, use
counts[0][0] = float(100000000000000000000)
Long answer
The reason for the error that you see is that your nested list of image data is internally converted to a Numpy array by Matplotlib before displaying it. In Matplotlib's current version, this happens in cbook.safe_masked_invalid(), which is called by _ImageBase._normalize_image_array(), which is called by _ImageBase.set_data(), which in turn is called by Axes.imshow().
The chain of problems here is the following:
1. Huge integers (i.e. integers that cannot be represented by Numpy's int_ data type, I assume) are converted to Numpy's object data type by default. This happens for your data with counts[0][0] = 100000000000000000000, but not with counts[0][0] = 1000000000. You can easily check the corresponding Numpy behavior as follows:
str(np.array([100000000000000000000]).dtype)  # >>> 'object'
str(np.array([1000000000]).dtype)  # >>> 'int64'
2. In Matplotlib, as already mentioned, this conversion happens in cbook.safe_masked_invalid(); more precisely, it happens in the line x = np.array(x, subok=True, copy=copy), where x refers to your nested list counts.
3. After that, _ImageBase._normalize_image_array() checks whether the resulting array's data type is either uint8 or can be cast to the float data type. Neither is true for Numpy's object data type, so the error is raised.
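The dtype fallback and the cast check described above can be verified directly with Numpy. A small sketch (note: on Numpy >= 2.0, converting an oversized integer raises OverflowError instead of silently producing an object array):

```python
import numpy as np

# Numpy's int64 tops out at 2**63 - 1 (about 9.2e18), so 10**20 does not fit
print(np.iinfo(np.int64).max)

# A fitting integer becomes a native integer dtype (int64 on most platforms)
print(np.array([10**9]).dtype)

try:
    huge = np.array([10**20])
    print(huge.dtype)  # 'object' on Numpy 1.x
except OverflowError:
    print("Numpy >= 2.0 refuses the conversion outright")

# The check that _ImageBase._normalize_image_array() performs internally:
print(np.can_cast(np.int64, float, "same_kind"))          # True  -> accepted
print(np.can_cast(np.dtype(object), float, "same_kind"))  # False -> TypeError
```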
To avoid this chain of problems, the only possibility that I see is converting your data to float values or to a float array yourself, once the values become too big, before passing them to imshow().
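For completeness, here is a minimal sketch of that workaround: converting every entry to float before building the array keeps the dtype at float64, which passes imshow()'s cast check, at the cost of rounding the huge values.

```python
import numpy as np

p = 13
counts = [[0 for _ in range(p)] for _ in range(p)]
counts[0][0] = 100000000000000000000

# Convert every entry to float before handing the data to Matplotlib;
# the resulting array is float64 instead of object, so imshow() accepts it
counts_float = [[float(v) for v in row] for row in counts]
img = np.array(counts_float)

print(img.dtype)                                   # float64 -> no TypeError
print(np.can_cast(img.dtype, float, "same_kind"))  # True
```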
Answered By - simon