Issue
I have a 2D Numpy array of integers like so:
a = np.array([[  3,   0,   2,  -1],
              [  1, 255,   1,   2],
              [  0,   3,   2,   2]])
and I have a dictionary with integer keys and values that I would like to use to replace the values of a with new values. The dict might look like this:
d = {0: 1, 1: 2, 2: 3, 3: 4, -1: 0, 255: 0}
I want to replace the values of a that match a key in d with the corresponding value in d. In other words, d defines a map between old (current) and new (desired) values in a. The outcome for the toy example above would be this:
a_new = np.array([[4, 1, 3, 0],
                  [2, 0, 2, 3],
                  [1, 4, 3, 3]])
What would be an efficient way to implement this?
This is a toy example. In practice the array will be large, with a shape of e.g. (1024, 2048), and the dictionary will have on the order of dozens of elements (34 in my case). The keys are integers, but they are not necessarily consecutive and they can be negative (like in the example above).
I need to perform this replacement on hundreds of thousands of such arrays, so it needs to be fast. However, the dictionary is known in advance and remains constant, so asymptotically, any time used to modify the dictionary or transform it into a more appropriate data structure doesn't matter.
I'm currently looping over the array entries in two nested for loops (over the rows and columns of a), but there has got to be a better way.
If the map didn't contain negative values (e.g. -1 like in the example), I would just create a list or an array from the dictionary once where the keys are the array indices and then use that for an efficient Numpy fancy indexing routine. But since there are negative values, too, this won't work.
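For illustration, the non-negative case described above might look like the following sketch. The names d_nonneg, table and a_nonneg are hypothetical, the -1 entry is left out of the map, and slots without a key are assumed to keep their own value. The last line shows why negative values break the approach: a negative index silently counts from the end of the table.

```python
import numpy as np

# Hypothetical map with non-negative keys only (the -1 entry is left out).
d_nonneg = {0: 1, 1: 2, 2: 3, 3: 4, 255: 0}

# Index = old value, entry = new value.
# Assumption: slots without a key keep their own value.
table = np.arange(max(d_nonneg) + 1)
for old, new in d_nonneg.items():
    table[old] = new

a_nonneg = np.array([[3, 0, 2, 1],
                     [1, 255, 1, 2]])

# One fancy-indexing pass, no Python loop over the array entries.
result = table[a_nonneg]

# A negative value does not raise; it reads from the end of the table,
# so -1 picks up the entry stored for key 255.
wrapped = table[np.array([-1])]
```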
Solution
Here's one way, which may be more efficient provided the dictionary is small and the range between the array's min and max values is small. You work around the negative index by subtracting the array min from every value:
In [11]: indexer = np.array([d.get(i, -1) for i in range(a.min(), a.max() + 1)])
In [12]: indexer[(a - a.min())]
Out[12]:
array([[4, 1, 3, 0],
       [2, 0, 2, 3],
       [1, 4, 3, 3]])
Note: This moves the for loop to the construction of the lookup table, but if the table is significantly smaller than the actual array, this could be a lot faster.
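Since the dictionary is known in advance and stays constant, the table could also be built once from the dictionary's own keys and reused for every array, instead of being rebuilt from each array's min and max. The following is a minimal sketch of that variation, not the answer's exact method. It assumes every value in the array lies between min(d) and max(d) inclusive. build_lut and apply_lut are hypothetical helper names, and values inside that range that are not keys of d are mapped to themselves here rather than to -1.

```python
import numpy as np

def build_lut(d):
    """Build an offset lookup table once from a constant dict."""
    offset = min(d)                       # smallest key, may be negative
    # Assumption: values without a key in d stay unchanged.
    lut = np.arange(offset, max(d) + 1)
    for old, new in d.items():
        lut[old - offset] = new           # shift keys so indices start at 0
    return lut, offset

def apply_lut(a, lut, offset):
    """Replace all values of a in a single fancy-indexing pass."""
    return lut[a - offset]

d = {0: 1, 1: 2, 2: 3, 3: 4, -1: 0, 255: 0}
a = np.array([[3, 0, 2, -1],
              [1, 255, 1, 2],
              [0, 3, 2, 2]])

lut, offset = build_lut(d)                # done once, reused for every array
a_new = apply_lut(a, lut, offset)
```

Values outside the key range are not handled in this sketch: they would either raise an IndexError or silently wrap around to the wrong table entry, so they would need to be checked or clipped first.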
Answered By - Andy Hayden