Issue
I have a dataset of grayscale images, like this one below:
Now, I open my dataset with the following class:
import numpy as np
import pandas as pd
import PIL.Image
import torch
from torch.utils.data import Dataset
from torchvision import transforms

class ImageDataset(Dataset):  # class name assumed; the original header line is missing
    """Tabular and Image dataset."""

    def __init__(self, excel_file, image_dir):
        self.image_dir = image_dir
        self.excel_file = excel_file
        self.tabular = pd.read_excel(excel_file)

    def __len__(self):
        return len(self.tabular)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()
        tabular = self.tabular.iloc[idx, 0:]
        y = tabular["Prognosis"]
        image = PIL.Image.open(f"{self.image_dir}/{tabular['ImageFile']}")
        image = np.array(image)
        #image = image[..., :3]
        image = transforms.functional.to_tensor(image)
        return image, y
If I check the tensors of the image, then I have this:
tensor([[[160, 160, 192, ..., 52, 40, 40],
[176, 208, 320, ..., 96, 80, 16],
[176, 240, 368, ..., 160, 160, 52],
...,
[576, 608, 560, ..., 16, 16, 16],
[624, 592, 544, ..., 16, 16, 16],
[624, 624, 576, ..., 16, 16, 16]]], dtype=torch.int32)
Now, they should be between 0 and 1, right? Because it is grayscale, or 0-255 in RGB, but there are those big values and I have no idea where they are coming from (indeed, imshow shows images with strange distorted colors like yellow and blue rather than grayscale).
Also, the size of the images is torch.Size([1, 2350, 2866]); I want to resize them to (1, 224, 224), for example.
This is my function:
from typing import List
import PIL.Image

# data_path and df are defined elsewhere in the asker's code
def resize_images(images: List[str]):
    for i in images:
        image = PIL.Image.open(f"{data_path}TrainSet/{i}")
        new_image = image.resize((224, 224))
        new_image.save(f"{data_path}TrainImgs/{i}")

resize_images(list(df["ImageFile"]))
However, this code returns images that are all 224x224, but they are completely black!
Solution
You have either shared a different file from the one your code opens, or imgur has changed your file. In any case, the most expedient way to examine the content of your file on Linux/Unix/macOS without installing any software is to use the file command. So, checking your file we can see that it is an 8-bit PNG with alpha channel:
file pZS4l.png
pZS4l.png: PNG image data, 896 x 732, 8-bit/color RGBA, non-interlaced
That immediately tells me it is not the exact image you opened in your code, because there are values exceeding 255 in your pixel dump, and that is not possible in an 8-bit file. So, the next best way to check the contents of an image is with exiftool, and that works for Windows users too. That looks like this:
exiftool pZS4l.png
ExifTool Version Number : 12.30
File Name : pZS4l.png
Directory : .
File Size : 327 KiB
File Modification Date/Time : 2022:11:05 17:15:22+00:00
File Access Date/Time : 2022:11:05 17:15:23+00:00
File Inode Change Date/Time : 2022:11:05 17:15:22+00:00
File Permissions : -rw-r--r--
File Type : PNG
File Type Extension : png
MIME Type : image/png
Image Width : 896
Image Height : 732
Bit Depth : 8
Color Type : RGB with Alpha
Compression : Deflate/Inflate
Filter : Adaptive
Interlace : Noninterlaced
Profile Name : ICC Profile
Profile CMM Type : Apple Computer Inc.
Profile Version : 2.1.0
Profile Class : Display Device Profile
Color Space Data : RGB
Profile Connection Space : XYZ
Profile Date Time : 2022:07:06 14:13:59
Profile File Signature : acsp
Primary Platform : Apple Computer Inc.
CMM Flags : Not Embedded, Independent
Device Manufacturer : Apple Computer Inc.
Device Model :
Device Attributes : Reflective, Glossy, Positive, Color
Rendering Intent : Perceptual
Connection Space Illuminant : 0.9642 1 0.82491
Profile Creator : Apple Computer Inc.
Profile ID : 0
Profile Description : Display
...
...
So, now I see it is a screen-grab made on a Mac. A screen-grab is not the same as the image you are displaying! If you display a 16-bit image on an 8-bit display, the screen-grab will be 8-bit. You should share your original image, not a screen-grab. If imgur is changing your images, you should share them with Dropbox or Google Drive or similar.
Right, on to your question. Assuming you actually open a PNG in your code (which we can't tell, because the code is incomplete), the data should not be float, because PNG cannot store floats; it can only store integers. PNG can store integer samples with bit depths of 1, 2, 4, 8 or 16 bits. If you read about a 24-bit PNG, that is RGB888. If you read about a 32-bit PNG, that is RGBA8888. If you read about a 48-bit PNG, that is RGB with 16 bits/sample. If you read about a 64-bit PNG, that is RGBA with 16 bits/sample.
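If you would rather check this from Python, here is a minimal sketch that reads the bit depth and colour type straight out of the PNG header (the IHDR chunk) and multiplies them out into bits per pixel. The names describe_png_header and read_png_header are my own, not part of any library, and the demo header is synthetic: its dimensions are borrowed from the tensor size in the question and the 16-bit greyscale setting is only an assumption about what the real file might contain.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Samples per pixel for each colour type defined by the PNG specification.
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}
COLOUR_NAMES = {0: "greyscale", 2: "RGB", 3: "palette",
                4: "greyscale+alpha", 6: "RGBA"}


def describe_png_header(header: bytes) -> dict:
    """Decode width, height, bit depth and colour type from the first 26 bytes."""
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # IHDR data starts at byte 16: width, height (4 bytes each, big-endian),
    # then bit depth and colour type (1 byte each).
    width, height, bit_depth, colour_type = struct.unpack(">IIBB", header[16:26])
    return {
        "width": width,
        "height": height,
        "bit_depth": bit_depth,
        "colour": COLOUR_NAMES.get(colour_type, "unknown"),
        "bits_per_pixel": bit_depth * CHANNELS.get(colour_type, 0),
    }


def read_png_header(path: str) -> dict:
    """Hypothetical helper: read only the header bytes of a file on disk."""
    with open(path, "rb") as f:
        return describe_png_header(f.read(26))


# Synthetic header for illustration only: 2866 x 2350, 16-bit greyscale (assumed).
demo = PNG_SIGNATURE + struct.pack(">I4sIIBB", 13, b"IHDR", 2866, 2350, 16, 0)
print(describe_png_header(demo))
```

For your own file you would call read_png_header on its path. A bit depth of 16 with a greyscale colour type would account for sample values above 255.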
So, the short answer is to run:
file YOURACTUALIMAGE.PNG
and/or:
exiftool YOURACTUALIMAGE.PNG
So, my suspicion is that you have a 16-bit greyscale PNG, which is perfectly able to store samples in the range 0..65535.
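If that suspicion turns out to be right, one possible way to get values between 0 and 1 is to divide by the 16-bit maximum yourself. As far as I know, to_tensor only rescales uint8 input, so an int32 array passes through unchanged. This is a sketch under that assumption; scale_to_unit_range and to_uint8 are names I made up, and the sample values are a few of those from your pixel dump plus one full-scale sample.

```python
import numpy as np


def scale_to_unit_range(pixels, max_value=65535):
    """Map integer samples in 0..max_value to float32 values in 0..1."""
    return np.asarray(pixels, dtype=np.float32) / float(max_value)


def to_uint8(unit_pixels):
    """Convert 0..1 floats to 8-bit samples, e.g. for viewing or saving."""
    return np.clip(np.round(unit_pixels * 255.0), 0, 255).astype(np.uint8)


# A few values from the question's dump, plus one full-scale 16-bit sample.
sample = np.array([[160, 624], [16, 65535]], dtype=np.int32)
scaled = scale_to_unit_range(sample)
print(scaled)
print(to_uint8(scaled))
```

Note that a value such as 624 lands below 0.01 on the full 16-bit scale, which is nearly black. That may be why the resized files look black, though I can't confirm it without the original image. If your data only uses part of the 16-bit range, passing a smaller max_value (for instance the dataset maximum) would spread the values out more.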
Note: If you actually do want to store floats, you probably need to use TIFF, or PFM (Portable Float Map) or EXR format.
Answered By - Mark Setchell