Issue
I'm trying to read numbers from a screenshot I'm taking from a game, but I'm having trouble getting the numbers right.
from pyautogui import *
import pyautogui as pg
import time
import keyboard
import random
import win32api, win32con
import threading
import cv2
import numpy
from pynput.mouse import Button, Controller
from pynput.keyboard import Listener, KeyCode
from PIL import Image
from pytesseract import *
pytesseract.tesseract_cmd = r'D:\Python\Tesseract\tesseract.exe'
#configs
custom_config = r'--dpi 300 --psm 6 --oem 3 -c tessedit_char_whitelist=0123456789'
# 1. load the image as grayscale
img = cv2.imread("price.png",cv2.IMREAD_GRAYSCALE)
# Change all pixels to black, if they aren't white already (since all characters were white)
img[img <= 150] = 231
img[img == 199] = 0
cv2.imwrite('resultfirst.png', img)
# 2. Scale it 10x
scaled = cv2.resize(img, (0,0), fx=10, fy=10, interpolation = cv2.INTER_CUBIC)
# 3. Retained your bilateral filter
filtered = cv2.bilateralFilter(scaled, 11, 17, 17)
# 4. Thresholded OTSU method
thresh = cv2.threshold(filtered, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
time.sleep(1)
# 5. Erode the image to bulk it up for tesseract
kernel = numpy.ones((5,5),numpy.uint8)
eroded = cv2.erode(thresh, kernel, iterations = 2)
pre_processed = eroded
output = pytesseract.image_to_string(pre_processed, config=custom_config)
cv2.imwrite('result.png', pre_processed)
print(output)
The image is pretty clear, but the OCR returns either 13500 or 18500, and no amount of tinkering gets the 7 read correctly. Is there a better way to go about it, or am I forgetting something?
EDIT:
I managed to get better results after I converted the yellow (gray after grayscale conversion) to black, to fill the numbers. I added the conversion code to the codeblock.
Before: [image: the original result]
After: [image: the result now]
The problem is that pytesseract still reads that 7 as a 1 every time. I don't think I can make the 7 look any more like a 7 from here. What should I do?
Solution
Not sure how general this solution will be, but if all of your pictures are like this one, a threshold of 103 will work:
import cv2
import pytesseract
image = cv2.imread('price.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Pixels brighter than the threshold become white (255); the rest become black (0)
threshold = 103
_, img_binarized = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
print(pytesseract.image_to_string(img_binarized, config='--dpi 300 --psm 6 --oem 1 -c tessedit_char_whitelist=0123456789').strip())
gives 78500 on my machine.
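Since it is unclear how well 103 generalizes, one way to check is to run the same OCR over a range of thresholds and see whether the reading is stable around that value. The following is only a rough sketch, not part of the original answer: `sweep_thresholds` and `sweep_price_image` are hypothetical helper names, and the range of thresholds to try is an assumption you would tune for your own screenshots.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Tuple


def sweep_thresholds(
    binarize: Callable[[int], object],
    ocr: Callable[[object], str],
    thresholds: Iterable[int],
) -> Tuple[Dict[int, str], Counter]:
    """Run OCR once per threshold; return per-threshold text and a tally."""
    readings = {t: ocr(binarize(t)).strip() for t in thresholds}
    return readings, Counter(readings.values())


def sweep_price_image(path: str, thresholds: Iterable[int]):
    """Hypothetical wiring of the helper above to OpenCV and pytesseract."""
    # Imported lazily so sweep_thresholds stays usable without these packages.
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Same Tesseract options as in the answer's snippet.
    config = "--dpi 300 --psm 6 --oem 1 -c tessedit_char_whitelist=0123456789"

    def binarize(t):
        # Fixed global threshold: pixels above t become 255, the rest 0.
        return cv2.threshold(gray, t, 255, cv2.THRESH_BINARY)[1]

    def ocr(img):
        return pytesseract.image_to_string(img, config=config)

    return sweep_thresholds(binarize, ocr, thresholds)
```

Calling something like `sweep_price_image('price.png', range(90, 121, 5))` and inspecting the tally should show whether 103 sits inside a band of thresholds that all give the same reading, or whether it only works at that exact value. No output is shown here because it depends on the image and the Tesseract build.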
Answered By - rassar