Issue
I'm trying to write code that converts an RGB image array (H, W, C) into a 2D array of strings in the format "RGB(r, g, b)", which is then converted into a pandas DataFrame. My code almost works, except that some of the blue values are missing from the final result, and I cannot figure out why. How do I convert the data without losing any information?
image = io.imread('px.jpg')
# Convert the 3D array to a 2D array of RGB strings
rgb_strings_array = np.apply_along_axis(lambda row: f"RGB({int(row[0])}, {int(row[1])}, {int(row[2])})", axis=2, arr=image)
# Create a DataFrame with the RGB strings array
df = pd.DataFrame(rgb_strings_array, columns=range(rgb_strings_array.shape[1]))
Output (in Excel):
You can see that some of the blue values are missing from the saved data (and it is not an Excel issue since the pandas DataFrame is missing the information as well).
Solution
The issue is that np.apply_along_axis is producing an array of dtype="<U12", i.e. a Unicode string type limited to 12 characters. It sizes the output from the result of the first call, and some of the later RGB strings are longer than that, so they are cut off. Really, you shouldn't be using numpy to work with strings. The parsing can easily be done in native Python and the result can still be converted to a pandas DataFrame.
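import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Random 4x4 test image; the upper bound is exclusive, so values fall in 0..254.
image = rng.integers(0, 255, size=(4, 4, 3))

# Build the strings in plain Python so none of them are truncated.
rgb_strings_array = [[f"RGB({int(pixel[0])}, {int(pixel[1])}, {int(pixel[2])})"
                      for pixel in row]
                     for row in image]

# One column per pixel in a row (the image width), not one per row.
df = pd.DataFrame(rgb_strings_array, columns=range(len(rgb_strings_array[0])))
print(df)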
Result:
0 | 1 | 2 | 3 |
---|---|---|---|
0 | RGB(22, 197, 166) | RGB(111, 110, 218) | RGB(21, 177, 51) |
1 | RGB(187, 194, 182) | RGB(200, 130, 32) | RGB(214, 114, 127) |
2 | RGB(199, 164, 102) | RGB(209, 139, 113) | RGB(114, 57, 23) |
3 | RGB(218, 211, 70) | RGB(161, 42, 193) | RGB(178, 90, 17) |
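To see the truncation for yourself, here is a minimal sketch that is not part of the original answer. It assumes a hypothetical 1x2 test image whose first pixel produces a 12-character string, and the helper name format_pixel is made up for illustration. It runs np.apply_along_axis and the plain-Python approach on the same input so the two results can be compared.

```python
import numpy as np
import pandas as pd


def format_pixel(pixel):
    """Format one (r, g, b) pixel as an 'RGB(r, g, b)' string."""
    return f"RGB({int(pixel[0])}, {int(pixel[1])}, {int(pixel[2])})"


# Hypothetical 1x2 test image: a short first pixel followed by a long one.
image = np.array([[[1, 2, 3], [255, 255, 255]]], dtype=np.uint8)

# apply_along_axis allocates its output from the first result
# ("RGB(1, 2, 3)", 12 characters), so the longer second string is cut to fit.
truncated = np.apply_along_axis(format_pixel, axis=2, arr=image)
print(truncated.dtype.itemsize // 4)  # characters per element: 12
print(truncated[0, 1])                # RGB(255, 255

# Plain Python strings have no fixed width, so nothing is lost.
rows = [[format_pixel(pixel) for pixel in row] for row in image]
df = pd.DataFrame(rows)
print(df.iloc[0, 1])                  # RGB(255, 255, 255)
```

Leaving out the columns argument lets pandas number the columns itself, which also works for images that are not square.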
Answered By - jared