Sunday, March 27, 2022

[FIXED] pytesseract image_to_string not accurate enough

opencv-python, python, python-tesseract

Issue

I want to read the digits from a cropped sudoku picture in a loop using Python (I'm new to the language), and a Google search recommended pytesseract.

First I tried using PIL to read the picture:

from PIL import Image
import pytesseract

image = Image.open('./test.png')

width, height = image.size
left = 0
top = 0
i = 0
j = 0
while (top < height):
    while (left < width):
        crop_img = image.crop((left, top, left + width / 9,  top + height / 9))
        print(i, j, pytesseract.image_to_string(crop_img, config='--psm 6'))
        left += width / 9
        j += 1
    top += height / 9
    i += 1
    left = 0
    j = 0

The output of print (shown as an image in the original post) was not accurate enough, but not too bad.

So my second attempt used cv2 instead of PIL and, as suggested in other answers, I converted the picture to black text on a white background (this may be a bit messy and not best practice; tips are welcome :) ):

import pytesseract
import cv2

image = cv2.imread('./test.png', 0)
height, width = image.shape
left = 0
top = 0
i = 0
j = 0
while (top < height):
    while (left < width):
        crop_img = image[int(top):int(top + height/9),
                         int(left):int(left + width/9)]
        thresh = cv2.threshold(
            crop_img, 155, 255, cv2.THRESH_BINARY_INV)[1]
        result = cv2.GaussianBlur(thresh, (5, 5), 0)
        result = 255 - result
        print(i, j, pytesseract.image_to_string(result, config='--psm 6'))
        left += width / 9
        j += 1
    top += height / 9
    i += 1
    left = 0
    j = 0

The result still had errors.

In both cases I saved the cropped images for debugging (.save() for PIL and imwrite for cv2), and the pictures are actually pretty clear. For example, with cv2 the cell at { 2, 2 }, which is evaluated as an empty spot, crops to a clean image.

[Images in the original post: output of the second attempt, the cropped cell, and the full sudoku image]

Thanks in advance!

Solution

For this, I used OpenCV to load the image and stored the board in a NumPy array. The main change was adding a config argument to the image_to_string() call that restricts the output to digits. It does take a while, though, since it runs a separate prediction for each cell, as I think your original code did.
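As a rough sketch of what that config string contains, the three options can be assembled from parts. The helper name digit_ocr_config is hypothetical (it is not part of pytesseract); the flags themselves are the ones used in this answer.

```python
def digit_ocr_config(psm: int = 6, oem: int = 1) -> str:
    """Build a Tesseract config string that only allows the digits 0-9.

    Hypothetical helper for illustration; pytesseract just takes the
    resulting string through its `config` argument.
    """
    options = [
        f"--psm {psm}",  # page segmentation mode; 6 = a single uniform block of text
        f"--oem {oem}",  # OCR engine mode; 1 = LSTM neural-net engine only
        "-c tessedit_char_whitelist=0123456789",  # restrict output to digits
    ]
    return " ".join(options)


config = digit_ocr_config()
print(config)
```

Since every cell holds at most one character, passing psm=10 (treat the image as a single character) may be worth trying as well, although that is untested here. The whitelist is also reported not to work with the LSTM engine in Tesseract 4.0, so a 4.1 or newer install is probably needed for it to take effect.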

import cv2
import numpy as np
import pytesseract

im = cv2.resize(cv2.imread('./test.png'), (900, 900))

out = np.zeros((9, 9), dtype=np.uint8)

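for x in range(9):
    for y in range(9):
        # Each cell is 100x100 px after the resize; trim 10 px per side to keep grid lines out of the crop.
        cell = im[10 + x*100:(x+1)*100 - 10, 10 + y*100:(y+1)*100 - 10, :]
        num = pytesseract.image_to_string(
            cell, config='--psm 6 --oem 1 -c tessedit_char_whitelist=0123456789').strip()
        if num.isdigit():
            out[x, y] = int(num)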

This gave me this output on your image in your post, with 0s as blank spaces.

array([[5, 3, 0, 0, 7, 0, 0, 0, 0],
       [6, 0, 0, 1, 9, 5, 0, 0, 0],
       [0, 9, 8, 0, 0, 0, 0, 6, 0],
       [8, 0, 0, 0, 6, 0, 0, 0, 3],
       [4, 0, 0, 8, 0, 3, 0, 0, 1],
       [7, 0, 0, 0, 2, 0, 0, 0, 6],
       [0, 6, 0, 0, 0, 0, 2, 8, 0],
       [0, 0, 0, 4, 1, 9, 0, 0, 5],
       [0, 0, 0, 0, 8, 0, 0, 7, 9]], dtype=uint8)

It's not the cleanest, but it seems to work pretty well.
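If resizing to 900x900 is not wanted, the cell boundaries can also be computed with integer arithmetic for any image size. The loops in the question advance left and top by float steps (width / 9), so rounding error may leave left marginally below width after nine steps and trigger a tenth, nearly empty crop. The sketch below avoids that; cell_boxes and margin_frac are hypothetical names, and the 10% margin simply mirrors the 10 px trimmed from each 100 px cell above.

```python
def cell_boxes(width, height, n=9, margin_frac=0.1):
    """Yield (row, col, left, top, right, bottom) for each cell of an n x n grid.

    Boundaries use integer division, so there is no float drift, and each
    box is shrunk by margin_frac of the cell size on every side.
    """
    for row in range(n):
        top, bottom = row * height // n, (row + 1) * height // n
        for col in range(n):
            left, right = col * width // n, (col + 1) * width // n
            mx = int((right - left) * margin_frac)  # horizontal margin in pixels
            my = int((bottom - top) * margin_frac)  # vertical margin in pixels
            yield row, col, left + mx, top + my, right - mx, bottom - my


boxes = list(cell_boxes(900, 900))
print(boxes[0])   # first cell of a 900x900 image
print(len(boxes))
```

With an OpenCV/NumPy image a cell would then be im[top:bottom, left:right], and with PIL it would be image.crop((left, top, right, bottom)).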

Answered By - duckboycool

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
