Tuesday, February 8, 2022

[FIXED] How to get bytes from interleave of numpy arrays with different dtype?

Labels: numpy, python

Issue

Say I have two NumPy arrays with different dtypes:

import numpy as np

a = np.array([1, 3], dtype=np.int32)
b = np.array([[1.1, 4.4, 1.7],
              [1.1, 7.5, 8.2]], dtype=np.float32)

Using a.tobytes() and b.tobytes() gives me the expected output:

# a.tobytes() is
b'\x01\x00\x00\x00\x03\x00\x00\x00'
# b.tobytes() is
b'\xcd\xcc\x8c?\xcd\xcc\x8c@\x9a\x99\xd9?\xcd\xcc\x8c?\x00\x00\xf0@33\x03A'

I would like to interleave the two arrays to obtain bytes like this:

# The order I want is a[0] b[0] a[1] b[1].
b'\x01\x00\x00\x00\xcd\xcc\x8c?\xcd\xcc\x8c@\x9a\x99\xd9?\x03\x00\x00\x00\xcd\xcc\x8c?\x00\x00\xf0@33\x03A'
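For reference, this layout can be produced directly with a plain loop. The question does not show the author's loop, so the following is only a sketch of what such a baseline might look like; the function name interleave_loop is hypothetical.

```python
import numpy as np

a = np.array([1, 3], dtype=np.int32)
b = np.array([[1.1, 4.4, 1.7],
              [1.1, 7.5, 8.2]], dtype=np.float32)


def interleave_loop(a, b):
    """Hypothetical baseline: join the bytes of a[i] and b[i] row by row."""
    chunks = []
    for a_i, b_i in zip(a, b):
        chunks.append(a_i.tobytes())  # 4 bytes for one int32 value
        chunks.append(b_i.tobytes())  # 12 bytes for one row of three float32 values
    return b"".join(chunks)


result = interleave_loop(a, b)
```

Each row contributes 4 + 12 = 16 bytes, so the two rows give 32 bytes in the order a[0] b[0] a[1] b[1].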

I tried to do this with NumPy because the code is less verbose and runs faster than my for-loop implementation, which works but is slow. Here is my attempt:

ab = np.empty((2, 4), dtype=object)
ab[:, 0] = a
ab[:, 1:] = b

# ab will then be
array([[1, 1.100000023841858, 4.400000095367432, 1.7000000476837158],
       [3, 1.100000023841858, 7.5, 8.199999809265137]], dtype=object)

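However, I can't use ab.tobytes() anymore, because an object array no longer holds 32-bit values. Each element is a reference to a Python object, so the bytes are memory addresses rather than the numbers themselves. The output is: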

b"0'b\xfb\xfb\x7f\x00\x00p.\xaf`\x87\x01\x00\x00\xf0-\xaf`\x87\x01\x00\x00\x10&\xaf`\x87\x01\x00\x00p'b\xfb\xfb\x7f\x00\x00P,\xaf`\x87\x01\x00\x00P%\xaf`\x87\x01\x00\x00P)\xaf`\x87\x01\x00\x00"
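A quick way to see what went wrong is to inspect the item size of the object array. This is a small sketch using the same values as above; the exact size of a reference depends on the platform (typically 8 bytes on a 64-bit build), so the check below avoids hard-coding it.

```python
import numpy as np

ab = np.empty((2, 4), dtype=object)
ab[:, 0] = np.array([1, 3], dtype=np.int32)
ab[:, 1:] = np.array([[1.1, 4.4, 1.7],
                      [1.1, 7.5, 8.2]], dtype=np.float32)

pointer_size = ab.itemsize  # size of one object reference, not of an int32 or float32
raw = ab.tobytes()          # the serialized references, which differ from run to run

print(pointer_size, len(raw))
```

The length of raw is the number of elements times the reference size, which is why it cannot match the 32 bytes of numeric data wanted here.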

Ultimately I want to write the bytes to a file.

Solution

I found a slightly outrageous way to let NumPy handle all of it. arr.tobytes() gives the bytes of a NumPy array arr, and np.frombuffer does the reverse: it interprets a buffer of bytes as an array of a given dtype.

Hence I can pretend that my integer array is also made of floats, reinterpreting its bits without converting the values:

ab = np.empty((2, 1 + 3), dtype=np.float32)
ab[:,0] = np.frombuffer(a.tobytes(), dtype=np.float32)
ab[:,1:] = b
ab.tobytes()
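To check that the trick really yields the requested layout, the result can be compared with a row-by-row concatenation and the integers can be read back by viewing the bytes as int32. This is a verification sketch only; the names expected and decoded_a are introduced here for illustration.

```python
import numpy as np

a = np.array([1, 3], dtype=np.int32)
b = np.array([[1.1, 4.4, 1.7],
              [1.1, 7.5, 8.2]], dtype=np.float32)

ab = np.empty((2, 1 + 3), dtype=np.float32)
ab[:, 0] = np.frombuffer(a.tobytes(), dtype=np.float32)  # same bits, labeled as float32
ab[:, 1:] = b

# Reference layout: a[0] b[0] a[1] b[1], built one row at a time.
expected = b"".join(a[i].tobytes() + b[i].tobytes() for i in range(len(a)))
assert ab.tobytes() == expected

# Reading the first column back as int32 recovers the original integers.
decoded_a = np.frombuffer(ab.tobytes(), dtype=np.int32).reshape(2, 4)[:, 0]
assert decoded_a.tolist() == [1, 3]
```

The first column of ab is meaningless when printed as floats; only its byte pattern matters.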

In my tests this was more than 7 times faster than Psidom's first solution, which was more than twice as fast as Psidom's second solution. Disclaimer: these timings include writing the bytes to a file and some overhead.

EDIT: Using numpy.ndarray.data instead of numpy.ndarray.tobytes is even faster. .tobytes() copies the data into a new bytes object, whereas .data is a memoryview that points to the array's existing memory. So finally:

ab = np.empty((2, 1 + 3), dtype=np.float32)
ab[:,0] = np.frombuffer(a.data, dtype=np.float32)
ab[:,1:] = b
ab.data
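Since the goal is to write the bytes to a file, here is a sketch of that last step, with a read-back to confirm the layout. It assumes a temporary file in place of the real output path, and the structured dtype named record is introduced here only to decode the result; neither is part of the original answer.

```python
import tempfile

import numpy as np

a = np.array([1, 3], dtype=np.int32)
b = np.array([[1.1, 4.4, 1.7],
              [1.1, 7.5, 8.2]], dtype=np.float32)

ab = np.empty((2, 1 + 3), dtype=np.float32)
ab[:, 0] = np.frombuffer(a.data, dtype=np.float32)
ab[:, 1:] = b

# Assumption: a temporary file stands in for the real output file.
with tempfile.TemporaryFile() as f:
    f.write(ab.data)  # memoryview of the C-contiguous array, no intermediate bytes copy
    f.seek(0)
    raw = f.read()

# Decode each 16-byte record as one int32 followed by three float32 values.
record = np.dtype([("a", np.int32), ("b", np.float32, (3,))])
decoded = np.frombuffer(raw, dtype=record)
```

Here decoded["a"] holds the integers and decoded["b"] the float rows. Writing ab.data this way relies on ab being C-contiguous, which holds for an array freshly created with np.empty.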

Answered By - ty.

This answer, collected from Stack Overflow and tested by PythonFixing community admins, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
