Issue
I am currently unpacking an encrypted file from the software I use to get an image (2048x2048) from a file (along with other information). I'm currently able to do this, but it takes about 1.7 seconds per load. Normally this would be fine, but I'm loading 40-ish images at each iteration, and my next step in this simulation is to add more iterations. I've been trying to use JIT compilers like PyPy and Numba. The code below is just one function in a larger object, but it's where I'm having the most time lag.
PyPy works, but when I call my numpy functions, it takes twice as long. So I tried using Numba, but it doesn't seem to like unpack. I tried using Numba within PyPy, but that also doesn't seem to work. My code goes a bit like this:
from struct import unpack
import numpy as np

def read_file(filename: str, nx: int, ny: int) -> tuple:
    f = open(filename, "rb")
    raw = [unpack('d', f.read(8))[0] for _ in range(2*nx*ny)]  # Creates 1D list
    real_image = np.asarray(raw[0::2]).reshape(nx, ny)       # Every other point is the real part of the image
    imaginary_image = np.asarray(raw[1::2]).reshape(nx, ny)  # Every other point +1 is the imaginary part of the image
    return real_image, imaginary_image
In my normal Python interpreter, the raw line takes about 1.7 seconds and the rest take <0.5 seconds.
If I comment out the numpy lines and just unpack in PyPy, the raw operation takes about 0.3 seconds. However, if I perform the reshaping operations, it takes a lot longer (I know it has to do with the fact that numpy is optimized in C, and converting takes longer).
So I just discovered Numba and thought I'd give it a try by going back to my normal Python interpreter (CPython). If I add the @njit or @vectorize decorators to the function, I get the following error message:
File c:\Users\MyName\Anaconda3\envs\myenv\Lib\site-packages\numba\core\dispatcher.py:468, in _DispatcherBase._compile_for_args(self, *args, **kws)
464 msg = (f"{str(e).rstrip()} \n\nThis error may have been caused "
465 f"by the following argument(s):\n{args_str}\n")
466 e.patch_message(msg)
--> 468 error_rewrite(e, 'typing')
469 except errors.UnsupportedError as e:
470 # Something unsupported is present in the user code, add help info
471 error_rewrite(e, 'unsupported_error')
File c:\Users\MyName\Anaconda3\envs\myenv\Lib\site-packages\numba\core\dispatcher.py:409, in _DispatcherBase._compile_for_args.<locals>.error_rewrite(e, issue_type)
407 raise e
408 else:
--> 409 raise e.with_traceback(None)
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'unpack': Cannot determine Numba type of <class 'builtin_function_or_method'>
I may be reading this error message wrong, but it seems that Numba does not like built-in functions? I haven't looked into any of the other options like Cython. Is there some way to make Numba or PyPy work? I'm mostly interested in speeding this operation up, so I'd be very interested to know what people think is the best option. I'd be willing to explore optimizing in C++, but I'm not aware of how to link the two.
Solution
Issuing tons of .read(8) calls and many small unpacks dramatically increases your overhead, with limited benefit. If you weren't using numpy already, I'd point you to preconstructing an instance of struct.Struct and/or using .iter_unpack to dramatically reduce the cost of looking up the Struct to use for unpacking, and to replacing a bunch of tiny read calls with a single bulk read (you need all the data in memory anyway). But since you're using numpy, you can have it do all the work for you much more easily:
import numpy as np

def read_file(filename: str, nx: int, ny: int) -> tuple:
    data_needed = 2 * 8 * nx * ny
    with open(filename, "rb") as f:  # Use a with statement so you don't risk leaking file handles
        raw = f.read(data_needed)   # Perform a single bulk read
    if len(raw) != data_needed:
        raise ValueError(f"{filename!r} is too small to contain a {nx}x{ny} image")
    arr = np.frombuffer(raw)  # Convert the raw buffer to a single numpy array, with no copying
    real_image = arr[::2].reshape(nx, ny)       # Slice and reshape to desired format
    imaginary_image = arr[1::2].reshape(nx, ny)
    return real_image, imaginary_image
That replaces a bunch of relatively slow Python-level manipulation with a very fast:
- Bulk read of all the data
- Bulk conversion of the data to a single numpy array (it doesn't even unpack it properly; it just interprets the buffer in-place as being of the expected type, which defaults to float64, i.e. C doubles)
- Slicing and reshaping appropriately
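As an aside, since the interleaved real/imaginary double pairs match numpy's complex128 memory layout exactly, the same buffer could also be interpreted directly as a complex image. This is a sketch assuming the file stores little-endian doubles, using a small in-memory payload in place of a real file:

```python
import struct
import numpy as np

# Build a tiny 2x2 "file" payload: interleaved (real, imag) double pairs
pairs = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0), (7.0, 8.0)]
raw = b"".join(struct.pack("<dd", re, im) for re, im in pairs)

# Interleaved double pairs are exactly the memory layout of complex128,
# so no slicing is needed; .real and .imag are views, not copies
image = np.frombuffer(raw, dtype=np.complex128).reshape(2, 2)
real_image = image.real
imaginary_image = image.imag
```

This keeps the whole image in one complex array, which can be convenient if later steps (e.g. FFTs) want complex input anyway.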
No need for numba; on my local box, for a 2048x2048 call, your code took ~1.75 seconds, while this version took ~10 milliseconds.
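For comparison, the struct.Struct/.iter_unpack route mentioned above might look roughly like this (a sketch; it avoids the per-value Struct lookups and tiny reads, but still creates a Python float per value, so it remains much slower than the frombuffer version):

```python
import struct
import numpy as np

def read_file_struct(filename: str, nx: int, ny: int) -> tuple:
    with open(filename, "rb") as f:
        raw = f.read(2 * 8 * nx * ny)  # single bulk read
    one_double = struct.Struct("<d")   # preconstructed Struct: compiled once, reused
    # iter_unpack walks the buffer in one pass, yielding one (value,) tuple per double
    values = [v for (v,) in one_double.iter_unpack(raw)]
    real_image = np.asarray(values[0::2]).reshape(nx, ny)
    imaginary_image = np.asarray(values[1::2]).reshape(nx, ny)
    return real_image, imaginary_image
```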
Answered By - ShadowRanger