Issue
There is now a way to search for the index of a submatrix in the main matrix, but both approaches are slow. Can anyone think of a faster way?
import numpy as np

a = np.array([[5, 4, 5, 9, 3, 4, 6, 2, 5, 3],
              [8, 3, 5, 4, 3, 4, 5, 8, 4, 4],
              [5, 7, 8, 5, 2, 3, 3, 6, 8, 8],
              [4, 5, 6, 2, 6, 5, 6, 7, 9, 3],
              [3, 6, 8, 2, 8, 7, 3, 8, 8, 8]])
b = np.array([[2, 3, 3],
              [6, 5, 6]])
def search_image(kernel, array):
    array_H, array_W = array.shape
    kernel_H, kernel_W = kernel.shape
    for x in range(0, array_W - kernel_W + 1):
        for y in range(0, array_H - kernel_H + 1):
            array_subsection = array[y:y+kernel_H, x:x+kernel_W]
            if (array_subsection == kernel).all():
                print(f"kernel was found at x={x}, y={y}.")
                return [x, y]
    print(f"kernel was not found in array.")
Solution
Assuming the values of the small matrix are not repeated many times in the big matrix, and the small matrix is far smaller than the big one, you can search for the first item of the small matrix in the big one, and then check for the actual small matrix only at the candidate locations found. This should reduce the number of sub-searches by a large factor. Here is an example:
def search_image_faster(kernel, array):
    h, w = kernel.shape
    assert w > 0 and h > 0
    # Candidate locations: where the first kernel value appears in the array
    locs = np.where(array == kernel[0, 0])
    for i in range(len(locs[0])):
        y, x = locs[0][i], locs[1][i]
        # Full comparison only at the candidate locations
        if np.array_equal(array[y:y+h, x:x+w], kernel):
            #print(f"kernel was found at x={x}, y={y}.")
            return [x, y]
    #print(f"kernel was not found in array.")
This strategy can be enhanced by also checking other values, such as the last item of the small matrix (see the sketch below). Still, there are pathological cases where this algorithm is no more efficient than the initial one.
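For instance, here is a rough NumPy sketch of that enhancement, filtering candidate locations on both the first and the last kernel values before doing the full comparison (the function name is illustrative):

def search_image_faster2(kernel, array):
    kh, kw = kernel.shape
    ah, aw = array.shape
    if kh > ah or kw > aw:
        return None
    # Top-left corners where the first kernel value matches...
    cand = array[:ah-kh+1, :aw-kw+1] == kernel[0, 0]
    # ...and where the corresponding bottom-right value also matches
    cand &= array[kh-1:, kw-1:] == kernel[-1, -1]
    for y, x in zip(*np.nonzero(cand)):
        # Full comparison only at the remaining candidates
        if np.array_equal(array[y:y+kh, x:x+kw], kernel):
            return [x, y]
    return None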
The thing is, Numpy alone provides no way to do this efficiently as far as I know. One very efficient solution is to use Numba:
import numba as nb

@nb.njit(inline="always")
def match_subimage(kernel, array, ay, ax):
    kh, kw = kernel.shape
    ah, aw = array.shape
    if ay + kh > ah or ax + kw > aw:
        return False
    for ky in range(kh):
        for kx in range(kw):
            if array[ay+ky, ax+kx] != kernel[ky, kx]:
                return False
    return True
@nb.njit('(uint8[:,::1], uint8[:,::1])')
def search_image_fastest(kernel, array):
    kh, kw = kernel.shape
    ah, aw = array.shape
    assert kh > 0 and kw > 0
    kv = kernel[0, 0]
    # Note: it is faster to iterate over the x locations in the nested loop
    for y in range(ah):
        for x in range(aw):
            # Fast-fail optimization
            if array[y, x] != kv:
                continue
            if match_subimage(kernel, array, y, x):
                #print(f"kernel was found at x={x}, y={y}.")
                return (x, y)
    #print(f"kernel was not found in array.")
    return None
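Note that the signature above compiles the function for C-contiguous uint8 arrays, so the inputs need to be converted accordingly. A minimal usage sketch for the example matrices (the names a8 and b8 are just illustrative):

# Convert to C-contiguous uint8 arrays to match the compiled signature
a8 = np.ascontiguousarray(a, dtype=np.uint8)
b8 = np.ascontiguousarray(b, dtype=np.uint8)
print(search_image_fastest(b8, a8))  # should print (4, 2), i.e. x=4, y=2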
Here are the timings for the input matrices on my i5-9600KF processor:
search_image: 55.2 µs
search_image_faster: 14.7 µs
search_image_fastest: 0.5 µs
The Numba implementation is about 100 times faster than the initial implementation. In fact, it is so fast that more than 60% of its execution time is just the overhead of calling the Numba function from CPython and not the actual computation.
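For reference, here is a rough way to reproduce such timings with timeit (the numbers will of course differ between machines; this reuses the a8/b8 arrays from the snippet above, and skips search_image because it prints on every call):

import timeit

for name, fn in [("search_image_faster", search_image_faster),
                 ("search_image_fastest", search_image_fastest)]:
    # Best of 5 runs of 1000 calls each, reported per call in microseconds
    t = min(timeit.repeat(lambda: fn(b8, a8), number=1000, repeat=5)) / 1000
    print(f"{name}: {t * 1e6:.2f} µs")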
If this is not enough, you can use multiple Numba threads, but using multiple threads can make the analysis of small input matrices slower due to the time needed to start the threads (a sketch follows).
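Here is a rough sketch of such a multi-threaded version; it reuses the match_subimage helper defined above and parallelizes over the rows of the big matrix (the function name is illustrative, and the thread startup overhead mentioned above still applies):

@nb.njit('(uint8[:,::1], uint8[:,::1])', parallel=True)
def search_image_parallel(kernel, array):
    kh, kw = kernel.shape
    ah, aw = array.shape
    kv = kernel[0, 0]
    # For each row, record the column of the first match (or -1 if none)
    found_x = np.full(ah, -1, dtype=np.int64)
    for y in nb.prange(ah - kh + 1):
        for x in range(aw - kw + 1):
            if array[y, x] == kv and match_subimage(kernel, array, y, x):
                found_x[y] = x
                break
    # Pick the first matching row to mimic the scan order of the sequential version
    for y in range(ah - kh + 1):
        if found_x[y] >= 0:
            return (found_x[y], y)
    return None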
If this is still not enough, then you can pre-compute a checksum of the small matrix and iterate over the big matrix, computing the checksum of each window and comparing it to the precomputed one. Like the first solution, this strategy aims to reduce the number of expensive full comparisons, except that the probability of two checksums being equal should be even smaller, so it can be faster in pathological cases (a rough sketch follows).
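Here is a rough NumPy sketch of that idea, using a plain window sum as the checksum (the function name is illustrative; a real rolling hash would be more collision-resistant):

def search_image_checksum(kernel, array):
    kh, kw = kernel.shape
    ah, aw = array.shape
    if kh > ah or kw > aw:
        return None
    target = int(kernel.astype(np.int64).sum())
    # 2D prefix sums give the sum of every kh-by-kw window in O(1) per window
    ps = np.zeros((ah + 1, aw + 1), dtype=np.int64)
    ps[1:, 1:] = array.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    window = ps[kh:, kw:] - ps[:-kh, kw:] - ps[kh:, :-kw] + ps[:-kh, :-kw]
    # Do the expensive full comparison only where the checksum matches
    for y, x in zip(*np.where(window == target)):
        if np.array_equal(array[y:y+kh, x:x+kw], kernel):
            return [x, y]
    return None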
Answered By - Jérôme Richard