Sunday, March 20, 2022

[FIXED] Tesseract output changing, adding, and removing numbers from very clear image

Tags: opencv, python, python-tesseract, tesseract

Issue

I am working on a program that uses a webcam and pytesseract to read constantly changing digits off a screen (long story). It takes an image of the whole screen, then cuts out each number that needs to be recorded (there are 23 of them) using predetermined coordinates stored in a list called 'roi'. There are some other steps, but this is the most important part. Currently it is adding, deleting, and changing digits all the time, but not in any consistent way. Here are some examples:

It reads this incorrectly as '32.0'

It reads this correctly as '52.0'

It reads this incorrectly as '39.3'

It reads this incorrectly as '2499.1'

These images have already been processed using OpenCV, and this is what all the images in the roi set look like. Based on other answers, I have binarized each image, tried to clean up the edges, and put a white border around it (see code).

This program reads the screen every 30 seconds, sometimes getting it right and other times getting it wrong. It often changes 5s into 3s, 3s into 5s, and 5s into 9s. Sometimes it misses or adds digits altogether. Below is my code for processing the images.

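import cv2
import pytesseract

# Path to the tesseract executable (left out of the original post)
pytesseract.pytesseract.tesseract_cmd = r"<tesseract file path>"

# Load the full-screen capture, convert it to grayscale, and rotate it upright
scale = 1.4
img = cv2.imread(r"<image file path>")
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.rotate(img, cv2.ROTATE_180)

# Shrink the whole image by a factor of 1.4
width = int(img.shape[1] / scale)
height = int(img.shape[0] / scale)
dim = (width, height)
img = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
cv2.destroyAllWindows()

myData = []
# Page segmentation mode 6 (single uniform block of text), numeric characters only
cong = r'--psm 6 -c tessedit_char_whitelist=+0123456789.-'

# roi is defined elsewhere: ((x1, y1), (x2, y2)) corners for each of the 23 numbers
for x, r in enumerate(roi):
    imgCrop = img[r[0][1]:r[1][1], r[0][0]:r[1][0]]

    # Enlarge the crop 5x (dividing by 0.2)
    scalebig = 0.2
    wid = int(imgCrop.shape[1] / scalebig)
    hei = int(imgCrop.shape[0] / scalebig)
    newdims = (wid, hei)
    imgCrop = cv2.resize(imgCrop, newdims)

    # Binarize with a fixed threshold
    imgCrop = cv2.threshold(imgCrop, 155, 255, cv2.THRESH_BINARY)[1]

    # Morphological closing to clean up the edges
    kernel2 = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    imgCrop = cv2.morphologyEx(imgCrop, cv2.MORPH_CLOSE, kernel2, iterations=2)

    # Add a 10-pixel white border on every side
    value = [255, 255, 255]
    imgCrop = cv2.copyMakeBorder(imgCrop, 10, 10, 10, 10, cv2.BORDER_CONSTANT, None, value=value)

    # Run OCR on the processed crop and store the result
    datapoint = pytesseract.image_to_string(imgCrop, lang='eng', config=cong)
    myData.append(datapoint)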

The output is the pictures I linked above.

I have looked into fine tuning it, but I have a Windows machine and I can't seem to find a good tutorial. I am not a programmer by trade, I spent 2 months teaching myself Python to do this, but the machine learning aspect of Tesseract has me spinning, and I don't know how else to fix remarkably inconsistent readings. If you need any further info please ask and I'll be happy to tell you.

Edit: Added some more incorrectly read images for reference

Solution

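Make sure you use the right image format. JPEG is the wrong format for OCR because its lossy compression leaves artifacts around character edges; save the crops in a lossless format such as PNG instead.
With the Tesseract LSTM engine, make sure the letter size is not bigger than 35 points.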
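
The size advice can be turned into a small helper that works out how much to shrink a crop before OCR. This is only a minimal sketch, not the answerer's code: the function name, the 32-pixel target, and the idea of treating the "35 points" limit as a cap on digit height in pixels are all assumptions made for illustration. The 160-pixel digit height in the example is also hypothetical.

```python
def resize_factor_for_ocr(glyph_height_px, target_px=32.0, max_px=35.0):
    """Return the factor to scale an image by so its digits fit under the cap.

    glyph_height_px: measured height of one digit in the current image.
    target_px, max_px: hypothetical values; tune them against real crops.
    """
    if glyph_height_px <= 0:
        raise ValueError("glyph_height_px must be positive")
    if glyph_height_px <= max_px:
        return 1.0  # already small enough, leave the image alone
    return target_px / glyph_height_px


# Hypothetical case: a digit that ends up 160 px tall after upscaling
factor = resize_factor_for_ocr(160)
print(factor)
```

The returned factor could then be passed to cv2.resize (as fx and fy, with interpolation=cv2.INTER_AREA for shrinking) before the crop is written out as a PNG.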

With the tessdata_best models for Tesseract I got these results:

tesseract 593_small.png -
59.3

tesseract 520_small.png -
52.0

tesseract 2491_small.png -
249.1
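
To reproduce command-line runs like these from Python, one option is to call the tesseract executable directly. The following is a rough sketch under a few assumptions: tesseract is on the PATH, the function names are made up for this example, and the optional tessdata_dir argument points at a folder where the tessdata_best models have been downloaded. The --tessdata-dir option is a standard Tesseract command-line flag.

```python
import subprocess


def build_tesseract_cmd(image_path, tessdata_dir=None):
    """Build the argument list for `tesseract <image> -` (text goes to stdout)."""
    cmd = ["tesseract", image_path, "-"]
    if tessdata_dir is not None:
        # Point Tesseract at a specific model folder (hypothetical location)
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def read_number(image_path, tessdata_dir=None):
    """Run Tesseract on one image and return the recognized text, stripped."""
    result = subprocess.run(
        build_tesseract_cmd(image_path, tessdata_dir),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Only builds the command; read_number() needs Tesseract and the image on disk
print(build_tesseract_cmd("593_small.png"))
```

Calling the executable this way keeps the Python run identical to the command-line test, which makes it easier to tell whether a bad reading comes from the image or from the wrapper's configuration.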

Answered By - user898678

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
