Issue
I was trying to use pytesseract
to find the box positions of each letter in an image. I tried to use an image, and cropping it with Pillow and it worked, but when I tried with a lower character size image (example), the program may recognize the characters, but cropping the image with the box coordinates give me images like this. I also tried to double up the size of the original image, but it changed nothing.
img = Image.open('imgtest.png')
data=pytesseract.image_to_boxes(img)
dati= data.splitlines()
corde=[]
for i in dati[0].split()[1:5]: #just trying with the first character
corde.append(int(i))
im=img.crop(tuple(corde))
im.save('cimg.png')
Solution
If we stick to the source code of image_to_boxes
, we see, that the returned coordinates are in the following order:
left bottom right top
From the documentation on Image.crop
, we see, that the expected order of coordinates is:
left upper right lower
Now, it also seems, that pytesseract
iterates images from bottom to top. Therefore, we also need to further convert the top
/upper
and bottom
/lower
coordinates.
That'd be the reworked code:
from PIL import Image
import pytesseract
img = Image.open('MJwQi9f.png')
data = pytesseract.image_to_boxes(img)
dati = data.splitlines()
corde = []
for i in dati[0].split()[1:5]:
corde.append(int(i))
corde = tuple([corde[0], img.size[1]-corde[3], corde[2], img.size[1]-corde[1]])
im = img.crop(tuple(corde))
im.save('cimg.png')
You see, left
and right
are in the same place, but top
/upper
and bottom
/lower
switched places, and where also altered w.r.t. the image height.
And, that's the updated output:
The result isn't optimal, but I assume, that's due to the font.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
Pillow: 8.1.0
pytesseract: 4.00.00alpha
----------------------------------------
Answered By - HansHirse
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.