Issue
I'm trying to read relatively clear numbers from a screenshot, but I am running into issues getting pytesseract to read the text correctly. I have the following screenshot:
I know the score (2-0) and the clock (1:42) will always be in the exact same place.
This is the code I currently have for reading the clock time and the orange score:
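import cv2
import imutils
import numpy as np
import pytesseract

# BGR bounds for the orange score digits
lower_orange = np.array([0, 90, 200], dtype = "uint8")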
upper_orange = np.array([70, 160, 255], dtype = "uint8")
#Isolate scoreboard location on a 1080p pic
clock = input[70:120, 920:1000]
scoreboard = input[70:150, 800:1120]
#greyscale
roi_gray = cv2.cvtColor(clock, cv2.COLOR_BGR2GRAY)
config = ("-l eng -c tessedit_char_whitelist=0123456789: --oem 1 --psm 8")
time = pytesseract.image_to_string(roi_gray, config=config)
print("time is " + time)
# find the colors within the specified boundaries and apply
# the mask
mask_orange = cv2.inRange(scoreboard, lower_orange, upper_orange)
# find contours in the thresholded image, then initialize the
# list of digit locations
cnts = cv2.findContours(mask_orange.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
locs = []
for (i, c) in enumerate(cnts):
    # compute the bounding box of the contour, then use the
    # bounding box coordinates to derive the aspect ratio
    (x, y, w, h) = cv2.boundingRect(c)
    ar = w / float(h)
    # since score will be a fixed size of about 25 x 35, we'll set the area at about 300 to be safe
    if w*h > 300:
        # clamp the padded crop at 0 so a box near the edge can't produce a negative (wrap-around) index
        orange_score_img = mask_orange[max(y-5, 0):y+h+5, max(x-5, 0):x+w+5]
        orange_score_img = cv2.GaussianBlur(orange_score_img, (5, 5), 0)
        config = ("-l eng -c tessedit_char_whitelist=012345 --oem 1 --psm 10")
        orange_score = pytesseract.image_to_string(orange_score_img, config=config)
        print("orange_score is " + orange_score)
Here's the output:
time is 1:42
orange_score is
Here is the orange_score_img after I masked out everything outside my lower and upper orange bounds and applied a Gaussian blur.
Yet even after configuring pytesseract to look for a single character and limiting the whitelist, I still can't get it to read the digit correctly. Is there some additional preprocessing I'm missing that would help pytesseract read this number as 2?
Solution
As per @fmw42's suggestion, I tried playing with some morphology changes. Thickening the numbers seemed to do the trick!
kernel = np.ones((5,5),np.uint8)
orange_score_img = cv2.dilate(orange_score_img,kernel,iterations=1)
EDIT: The REAL answer, I realized, is that pytesseract does MUCH better with black text on a white background than with white text on a black background! It read the digit perfectly once I inverted the colors:
orange_score_img = cv2.bitwise_not(orange_score_img)
I hope this helps people when they first start out using pytesseract! Trying to tune the image to fit all my cases was incredibly frustrating, and knowing that black text on white works much better would have saved me hours...
Answered By - JonathanW