Issue
I'm creating a bot for a video game and I have to read some information displayed on the screen. Given that the information is always at the same position, I have no issue to take a screenshot and crop the picture to the right position.
90% of the time, the recognition will be perfect, but sometimes it will return something that seems totally random (see the example below).
I've tried to turn the picture into black and white with no success, and tried to change the pytesseract config (config = ("-l fra --oem 1 --psm 6"))
def readScreenPart(x,y,w,h):
monitor = {"top": y, "left": x, "width": w, "height": h}
output = "monitor.png"
with mss.mss() as sct:
sct_img = sct.grab(monitor)
mss.tools.to_png(sct_img.rgb, sct_img.size, output=output)
img = cv2.imread("monitor.png")
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imwrite("result.png", img)
config = ("-l fra --oem 1 --psm 6")
return pytesseract.image_to_string(img,config=config)
Example : this picture generates a bug, it returns the string "IRPMV/LEIILK"
Another image
Now I don't know where the issue comes from, given that it is not just a single wrong character but a totally random result..
Thanks for your help
Solution
As the comment said, it's about your text and background color. Tesseract is basically useless with light text on dark background, here is the few lines i apply to any text image before giving it to tesseract :
# convert color image to grayscale
grayscale_image = cv2.cvtColor(your_image, cv2.COLOR_BGR2GRAY)
# Otsu Tresholding method find perfect treshold, return an image with only black and white pixels
_, binary_image = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)
# we just don't know if the text is in black and background in white or vice-versa
# so we count how many black pixels and white pixels there are
count_white = numpy.sum(binary > 0)
count_black = numpy.sum(binary == 0)
# if there are more black pixels than whites, then it's the background that is black so we invert the image's color
if count_black > count_white:
binary_image = 255 - binary_image
black_text_white_background_image = binary_image
Now you're sure to have black text on white background no matter wich colors was the original image, also Tesseract is (weirdly) the most efficient if the characters have an height of 35pixels, larger characters doesn't significantly reduce the accuracy, but just a few pixels shorter can make tesseract useless!
Answered By - Appa21
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.