Monday, November 13, 2023

[FIXED] improve the accuracy of image reading of a printscreen with pytesseract

Tags: ocr, python, python-tesseract, tesseract

Issue

My script captures a part of the screen that always contains numbers. The configuration below is the best I have found so far and is correct more than 90% of the time, but because the image is very small it sometimes reads the wrong number. Is there any setting I can change to make it more accurate?

Original image

Thresholded image, after some changes

My current code:

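import cv2
import numpy as np
import pytesseract
from PIL import ImageGrab


def text_from_region(region):
    """Capture a screen region and read the number shown in it.

    Args:
        region (tuple of int): bounding box as (left, top, right, bottom).

    Returns:
        int: the number read by Tesseract.
    """
    # Path to the Tesseract executable (default Windows install location)
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

    # Grab the region and convert the PIL image (RGB) to an OpenCV array (BGR)
    img = ImageGrab.grab(bbox=region)
    img = np.array(img)
    img = cv2.cvtColor(src=img, code=cv2.COLOR_RGB2BGR)

    # Grayscale, then Otsu threshold with inverted polarity
    gray = cv2.cvtColor(src=img, code=cv2.COLOR_BGR2GRAY)
    _, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

    # --psm 10: treat the image as a single character; whitelist restricts output to digits
    config = '--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789'
    text = pytesseract.image_to_string(thresh, config=config)

    return int(text)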

Example of a misread with the number 568; "Cap: x" is my tkinter label showing the result.
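
Separately from the misread itself, the `int(text)` call in the function above raises `ValueError` when Tesseract returns an empty or non-numeric string. A minimal sketch of a more defensive conversion follows; `parse_ocr_int` is a hypothetical helper name, not part of the original code, and returning `None` on failure is just one possible convention.

```python
import re
from typing import Optional


def parse_ocr_int(text: str) -> Optional[int]:
    """Convert raw OCR output to an int, or return None if it is not a clean number.

    Hypothetical helper for illustration only.
    """
    # Drop all whitespace (spaces, newlines, form feeds) from the OCR output.
    cleaned = "".join(text.split())
    # Accept only strings made entirely of ASCII digits.
    if re.fullmatch(r"[0-9]+", cleaned):
        return int(cleaned)
    return None
```

The caller can then decide what to do with `None`, for example keep the previous value shown in the label or capture the region again.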

Solution

My solution, for future readers, was to enlarge the captured image, which was previously very small, to 300% of its original size. With the same Tesseract settings, this gave much higher accuracy. The final script follows.

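import cv2
import numpy as np
import pytesseract
from PIL import ImageGrab


def ReadImageFromScreen(coords):
    """Take a screenshot of a screen region and return it in grayscale.

    Args:
        coords (tuple of int): bounding box, e.g. (0, 0, 100, 100).

    Returns:
        numpy.ndarray: the captured region as a grayscale image.
    """
    img = ImageGrab.grab(bbox=coords)
    img = np.array(img)
    img = cv2.cvtColor(src=img, code=cv2.COLOR_RGB2BGR)
    img = cv2.cvtColor(src=img, code=cv2.COLOR_BGR2GRAY)
    return img


def resize_img(img, scale):
    """Resize img to scale percent of its original size (300 means 3x)."""
    width = int(img.shape[1] * scale / 100)
    height = int(img.shape[0] * scale / 100)

    # cv2.resize expects the target size as (width, height)
    dsize = (width, height)

    return cv2.resize(img, dsize=dsize)


def text_from_region(region):
    """Capture a screen region, enlarge it and read the digits shown in it.

    Args:
        region (tuple of int): bounding box as (left, top, right, bottom).

    Returns:
        str: the text read by Tesseract, stripped of surrounding whitespace.
    """
    # Path to the Tesseract executable (default Windows install location)
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

    img = ReadImageFromScreen(region)

    # Enlarge to 300% before thresholding; this is the change that improved accuracy
    gray_resized = resize_img(img, 300)
    _, thresh = cv2.threshold(gray_resized, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

    config = '--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789'
    text = pytesseract.image_to_string(thresh, config=config)

    return text.strip()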
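
The size arithmetic in `resize_img` above truncates with `int(...)`, so a very small capture combined with a scale below 100 could end up with a zero width or height, which as far as I know `cv2.resize` does not accept. Below is a minimal sketch of the same calculation with that case guarded. `scaled_dsize` is a hypothetical helper name and the clamp to 1 pixel is my own assumption, not part of the answer.

```python
from typing import Tuple


def scaled_dsize(shape: Tuple[int, ...], scale_percent: float) -> Tuple[int, int]:
    """Return (width, height) for an image of the given shape scaled by scale_percent.

    Hypothetical helper for illustration. `shape` follows the NumPy/OpenCV
    convention (rows, cols, ...), i.e. (height, width, ...).
    """
    if scale_percent <= 0:
        raise ValueError("scale_percent must be positive")
    height, width = shape[0], shape[1]
    # Same truncating arithmetic as resize_img, clamped so neither side becomes 0.
    new_width = max(1, int(width * scale_percent / 100))
    new_height = max(1, int(height * scale_percent / 100))
    return new_width, new_height
```

For a 12 x 40 pixel capture (height x width), `scaled_dsize((12, 40), 300)` gives `(120, 36)`, which is the `dsize` tuple that `resize_img` would pass to `cv2.resize`.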

Answered By - Collaxd

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
