Issue
Assume grid_sheet is a NumPy array of shape (1000, 1000, 3) and array2 is a NumPy array of shape (13k-ish, 3).
We're basically treating array2 like a list of RGB value combinations. Each combination is unique.
grid_sheet should be treated like a screenshot, as if you'd used the Snipping Tool to create the image.
import numpy as np

blank_sheet = np.zeros((grid_sheet.shape[0], grid_sheet.shape[1]))
for data in array2:
    # add 1 wherever all three channels of a pixel equal the current color
    blank_sheet = np.where(((grid_sheet[:,:,2] == data[2]) & (grid_sheet[:,:,1] == data[1]) & (grid_sheet[:,:,0] == data[0])), blank_sheet+1, blank_sheet)
The output would be like a boolean array with the same height and width as grid_sheet. I don't want to use a for loop over array2 because it's just too slow.
I've tried splitting the channels and comparing each to its corresponding column of array2, but after dstacking and summing it all back together, nearly the entire grid ends up marked with 1s. The results are the same if I merge the values together, flatten, compare, and then reshape back into an image-like layout. I've tried a number of other ideas, and plenty of Stack Overflow solutions that I tried to merge with others. I hardly see any point in nditer. Someone suggested itertools, but I don't think the two mesh well together.
Solution
We can use broadcasting for this purpose. But first we have to add extra axes and slightly reorganize the data.
To apply the comparison along the third axis, we transpose the array of colors so that the color components run along its first axis and each color is addressed by the second index. Then we put two additional dimensions at the beginning to line up with the image plane:
colors = array2.T[None,None,:,:]
Now colors has 4 dimensions and its shape is (1, 1, 3, len(array2)). The next step is to add a fourth dimension to the image, which will correspond to the index of each color in array2:
image = grid_sheet[:,:,:,None]
Now image also has 4 dimensions and its shape is (1000, 1000, 3, 1). If we compare image and colors, the two are broadcast to the shape (1000, 1000, 3, len(array2)), so every pixel is compared with every color, channel by channel along the third axis. To find out whether all channels of a color match the color at an image point, we apply all(2), where 2 addresses the third axis. Then we apply any along the last dimension to find out whether any of the given colors matches the color at that image point:
result = (image == colors).all(2).any(2)
Note that after the all method the number of dimensions has been reduced by 1, so the index of the last dimension is now 2. That's why we pass 2 as the parameter of any.
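The steps above can be gathered into one small function and checked against the loop from the question. This is only a sketch: the function name match_any_color and the tiny random test data are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def match_any_color(grid_sheet, array2):
    """Return a boolean mask that is True where a pixel equals any row of array2.

    Hypothetical helper name; it only wraps the broadcasting steps described above.
    """
    image = grid_sheet[:, :, :, None]        # shape (H, W, 3, 1)
    colors = array2.T[None, None, :, :]      # shape (1, 1, 3, N)
    # all(2): every channel matches; any(2): at least one color matches
    return (image == colors).all(2).any(2)   # shape (H, W)

# Small made-up data so the intermediate (H, W, 3, N) array stays tiny
rng = np.random.default_rng(0)
grid_sheet = rng.integers(0, 4, size=(6, 5, 3))
array2 = np.unique(rng.integers(0, 4, size=(10, 3)), axis=0)  # unique colors

mask = match_any_color(grid_sheet, array2)

# Reference result computed with the loop from the question
blank_sheet = np.zeros(grid_sheet.shape[:2])
for data in array2:
    blank_sheet = np.where((grid_sheet == data).all(axis=2), blank_sheet + 1, blank_sheet)

assert mask.shape == grid_sheet.shape[:2]
assert np.array_equal(mask, blank_sheet.astype(bool))
```

Because the colors in array2 are unique, the loop never adds more than 1 to a pixel, so converting its output to bool gives the same mask.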
Test case
from numpy import arange, array, newaxis
image = arange(3*3*2).reshape(3,3,2)
colors = array([[0,1], [2,3], [4,5], [8,9]])
expected = array([
[ True, True, True],
[False, True, False],
[False, False, False]
])
image = image[:, :, :, newaxis]
colors = colors.T[newaxis, newaxis, :, :]
assert colors.shape == (1,1,2,4)
assert image.shape == (3,3,2,1)
result = (image == colors).all(2).any(2)
assert (result == expected).all()
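One caveat with the sizes from the question: the comparison image == colors materializes a boolean array of shape (1000, 1000, 3, N), which for roughly 13,000 colors is about 39 GB at one byte per element. A possible workaround in plain NumPy is to walk through the colors in batches and OR the partial masks together. The sketch below is an assumption of mine rather than part of the original answer; the function name match_any_color_batched and the default batch size are made up for illustration and should be tuned to the available memory.

```python
import numpy as np

def match_any_color_batched(image, colors, batch=256):
    """Same result as (image == colors).all(2).any(2), computed batch by batch.

    image  : array of shape (H, W, C)
    colors : array of shape (N, C)
    batch  : number of colors compared at once (hypothetical default)
    """
    image4 = image[:, :, :, None]                       # (H, W, C, 1)
    result = np.zeros(image.shape[:2], dtype=bool)      # (H, W)
    for start in range(0, len(colors), batch):
        part = colors[start:start + batch].T[None, None, :, :]  # (1, 1, C, <=batch)
        # the intermediate array is only (H, W, C, <=batch) per iteration
        result |= (image4 == part).all(2).any(2)
    return result

# Same data as in the test case above
image = np.arange(3 * 3 * 2).reshape(3, 3, 2)
colors = np.array([[0, 1], [2, 3], [4, 5], [8, 9]])
expected = np.array([
    [True, True, True],
    [False, True, False],
    [False, False, False],
])

result = match_any_color_batched(image, colors, batch=3)
assert (result == expected).all()
```

This still loops in Python, but only len(colors) / batch times instead of once per color, and each iteration stays fully vectorized.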
An example with Dask to process big pictures
import numpy as np
import dask.array as da
from dask.distributed import Client
client = Client(n_workers=4)
display(client)
# X, Y : a size of an image
# N : a number of colors to check
X, Y, N = 1080, 1920, 13921
# dX, dY, dN : dimensions of chunks for dask.array
# values vary by computer
dX, dY, dN = 360, 640, 500
image = np.arange(X*Y*3).reshape(X, Y, 3)
colors = np.arange(N*3).reshape(N, 3)
image = image[:, :, :, None]
colors = colors.T[None, None, :, :]
# a shape of chunks should resemble the logic of broadcasting
im = da.from_array(image, chunks=(dX, dY, 3, 1))
co = da.from_array(colors, chunks=(1, 1, 3, dN))
im, co = da.broadcast_arrays(im, co)
re = (im == co).all(2).any(2)
result = re.compute()
assert result.shape == (X, Y)
assert result.sum() == N
Answered By - Vitalizzare