Issue
Quick summary: matplotlib's savefig is too slow when writing PNG. I'm looking for ideas on how to speed it up, or for alternative libraries (chaco? cairo?).
Updated: Added some (very rough and ready) code to illustrate at the bottom.
I'm using matplotlib (Python 3.x, latest Anaconda, on a quad-core MacBook) to plot a single 1024x1024 NumPy array (of int16s) via imshow(). My goal is simply to produce an annotated image file on disk (no interactive display needed).
The axes is set to fill the figure completely (so no spines, ticks, etc.), and the dpi/size combination is set to match the size of the array, so there is no scaling or interpolation.
On top of that single axes, I'm displaying 3 text areas and a few (~6) rectangle patches.
...so nothing fancy and pretty much as simple as you can get from a plotting perspective.
However, when I save the figure (with savefig) to PNG, it takes around 1.8 seconds (!!!).
Saving as raw or JPG each comes in at around 0.7 sec.
I tried switching the backend to Agg, but that increased the savefig() time to about 2.1 sec.
Am I wrong in thinking this is too slow? I would prefer to save as PNG, not JPG, but I can't understand why PNG is that much slower than JPG. My goal is to deploy on AWS, so I'm concerned about speed here.
Are there any faster libraries around? (I don't want interactive UI plotting, just basic save-to-file plotting)
Some rough and ready code that approximately illustrates this is below. The output on my machine is:
current backend: MacOSX
default save: 0.4048
default save - float64: 0.3446
full size figure: 0.8105
full size figure - with text/rect: 0.9023
jpg: full size figure - with text/rect: 0.7468
current backend: agg
AGG: full size figure - with text/rect: 1.3511
AGG: jpg: full size figure - with text/rect: 1.1689
I couldn't (even after repeated attempts) get the sample code to reproduce the ~1.7 sec (process time) savefig() that I'm seeing in my app, but I think the code below still illustrates that (a) JPG is faster than PNG (or conversely, PNG seems slow) and (b) it still seems slow (imo).
So should I not be expecting anything faster than this? ...is that just the speed it is? Are there any faster backends available? When I deploy on AWS (linux) what is the best/fastest backend to use there?
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon, Rectangle
import time
def some_text(ax):
    pm = u'\u00b1'
    string = f'blah\nblah {pm}blah\nblah blah blah'
    ax.text(10, 10, string, color='red', ha='left')
    ax.text(990, 990, string, color='green', ha='right')
    ax.text(500, 500, string, color='green', ha='center')
    ax.text(500, 500, string, color='green', ha='center', va='top', fontsize=10)
    ax.text(800, 500, string, color='green', ha='center', multialignment='center', fontsize=16)

def some_rect(ax):
    rect = Rectangle((10,10),width=100, height=100, color='red', fill=False)
    ax.add_patch(rect)
    rect = Rectangle((300,10),width=100, height=100, color='yellow', fill=False)
    ax.add_patch(rect)
    rect = Rectangle((300,600),width=50, height=50, color='yellow', fill=False)
    ax.add_patch(rect)
    rect = Rectangle((800,600),width=50, height=50, color='yellow', fill=False)
    ax.add_patch(rect)
dim = 1024
test = np.arange(dim*dim).reshape((dim, dim))
dpi = 150
inches = test.shape[1]/dpi, test.shape[0]/dpi
print('current backend:', matplotlib.get_backend())
plt.imshow(test)
c0 = time.process_time()
plt.savefig('test.png')
print(f'default save: {(time.process_time()-c0):.4f}')
plt.close()
fig, ax = plt.subplots(figsize=inches, dpi=dpi)
fig.subplots_adjust(left=0, right=1, top=1, bottom=0, wspace=0, hspace=0)
ax.imshow(test)
c0 = time.process_time()
plt.savefig('test3.png')
print(f'full size figure: {(time.process_time()-c0):.4f}')
fig, ax = plt.subplots(figsize=inches, dpi=dpi)
fig.subplots_adjust(left=0, right=1, top=1, bottom=0, wspace=0, hspace=0)
ax.imshow(test)
some_text(ax)
some_rect(ax)
c0 = time.process_time()
plt.savefig('test4.png')
print(f'full size figure - with text/rect: {(time.process_time()-c0):.4f}')
fig, ax = plt.subplots(figsize=inches, dpi=dpi)
fig.subplots_adjust(left=0, right=1, top=1, bottom=0, wspace=0, hspace=0)
ax.imshow(test)
some_text(ax)
some_rect(ax)
c0 = time.process_time()
plt.savefig('test5.jpg')
print(f'jpg: full size figure - with text/rect: {(time.process_time()-c0):.4f}')
backend = 'agg'
matplotlib.use(backend, force=True)
import matplotlib.pyplot as plt
print('current backend: ', matplotlib.get_backend())
fig, ax = plt.subplots(figsize=inches, dpi=dpi)
fig.subplots_adjust(left=0, right=1, top=1, bottom=0, wspace=0, hspace=0)
ax.imshow(test)
some_text(ax)
some_rect(ax)
c0 = time.process_time()
plt.savefig('test6.png')
print(f'AGG: full size figure - with text/rect: {(time.process_time()-c0):.4f}')
fig, ax = plt.subplots(figsize=inches, dpi=dpi)
fig.subplots_adjust(left=0, right=1, top=1, bottom=0, wspace=0, hspace=0)
ax.imshow(test)
some_text(ax)
some_rect(ax)
c0 = time.process_time()
plt.savefig('test7.jpg')
print(f'AGG: jpg: full size figure - with text/rect: {(time.process_time()-c0):.4f}')
Solution
Try making a PIL image object; for me it's more than 100 times faster than matplotlib:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
data = np.random.random((100, 100))
cm = plt.get_cmap('viridis')
img = Image.fromarray((cm(data)[:, :, :3] * 255).astype(np.uint8))
img.save('image.png')
If you just want greyscale, you can skip the get_cmap business; just scale your array to the range 0 to 255.
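A minimal sketch of that greyscale route. The helper name to_uint8, the stand-in data, and the output filename are made up for illustration, and a plain linear min/max rescale is only one way to do the scaling:

```python
import numpy as np
from PIL import Image

def to_uint8(data):
    """Linearly rescale an array to 0..255 and cast to uint8 (hypothetical helper)."""
    data = np.asarray(data, dtype=np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # Constant image: avoid dividing by zero and return all zeros.
        return np.zeros(data.shape, dtype=np.uint8)
    scaled = (data - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Stand-in for the 1024x1024 int16 array from the question.
data = (np.arange(1024 * 1024).reshape(1024, 1024) % 4096).astype(np.int16)

grey = to_uint8(data)
img = Image.fromarray(grey)  # a 2-D uint8 array becomes an 8-bit greyscale ("L") image
img.save('grey.png')
```

If the real data has a fixed known range, scaling against that range instead of the per-image min/max keeps brightness comparable between images.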
The annotations would have to be added in PIL.
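As a rough sketch of what adding annotations in PIL could look like, using ImageDraw with Pillow's built-in default font. The coordinates, colours, text, and filename are placeholders loosely modelled on the question's code, not a drop-in replacement for it:

```python
import numpy as np
from PIL import Image, ImageDraw

# Stand-in greyscale image; convert to RGB so coloured annotations are possible.
grey = (np.arange(1024 * 1024).reshape(1024, 1024) % 256).astype(np.uint8)
img = Image.fromarray(grey).convert('RGB')

draw = ImageDraw.Draw(img)

# Rectangle outlines, given as (left, top, right, bottom) in pixel coordinates.
draw.rectangle((10, 10, 110, 110), outline='red')
draw.rectangle((300, 600, 350, 650), outline='yellow')

# Multi-line text drawn with the default font; (x, y) is the top-left corner.
draw.text((10, 120), 'blah\nblah blah', fill='red')

img.save('annotated.png')
```

Note that ImageDraw works in pixel coordinates with the origin at the top-left, so there is no equivalent of matplotlib's ha/va alignment arguments here; text positions would need to be computed by hand.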
One important difference from using matplotlib is that it's pixel-for-pixel. So if you want to apply some scaling, you'll have to interpolate first. You could use scipy.ndimage.zoom for that.
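A small sketch of that interpolation step, assuming SciPy is installed. The 2x factor, the stand-in array, and the filename are arbitrary choices for illustration:

```python
import numpy as np
from scipy.ndimage import zoom
from PIL import Image

# Stand-in 100x100 greyscale array.
small = (np.arange(100 * 100).reshape(100, 100) % 256).astype(np.uint8)

# Upsample by a factor of 2 along both axes; order=1 selects linear interpolation.
big = zoom(small, 2, order=1)

Image.fromarray(big).save('zoomed.png')
```

Pillow's own Image.resize is another option if you would rather resample after building the image.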
Answered By - Matt Hall