Issue
I am teaching myself Python and am trying to make a simple program to recognize letters from an image. The letters are not in sentence or paragraph form. I am trying to do this using cv2 + pytesseract for detection, but I just can't seem to get it to work reliably. I am beginning to suspect I am using the wrong tool for the job, but I can't find anything else to help me.
This is my reference image with the letters I want to extract:
Ideally I would like the letter and also the coordinates of each letter (bounding box). I've been able to apply a mask and threshold to the image to get this:
What I am stuck on is that pytesseract cannot reliably give me the letters individually, or even correctly. Here is my console output:
$ py main.py --image test.png
D
C UL
UO
The code simply takes the black-and-white text image and runs it through pytesseract. I've tried playing around with the --psm flag, but because the text is in an odd shape, I haven't had much luck.
text = pytesseract.image_to_string(Image.open(filename), config='-l eng --psm 11')
os.remove(filename)
print(text)
Solution
You can segment the image and run OCR on each letter one by one. The details are in the code below.
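import cv2
import numpy as np
import pytesseract

# Load the image and binarize it (inverted).
# With THRESH_OTSU the threshold is chosen automatically, so the 127 is ignored.
img = cv2.imread("xO6JI.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# findContours returns 2 or 3 values depending on the OpenCV version.
items = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = items[0] if len(items) == 2 else items[1]

# Draw the contours of plausible size, for visual inspection only.
img_contour = img.copy()
for i in range(len(contours)):
    area = cv2.contourArea(contours[i])
    if 100 < area < 10000:
        cv2.drawContours(img_contour, contours, i, (0, 0, 255), 2)

# --psm 10 tells Tesseract to treat the image as a single character.
custom_config = r'-l eng --oem 3 --psm 10 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ" '

detected = ""
for c in contours:
    # (x, y, w, h) is the bounding box of this contour.
    x, y, w, h = cv2.boundingRect(c)
    ratio = h/w
    area = cv2.contourArea(c)
    base = np.ones(thresh.shape, dtype=np.uint8)
    if ratio > 0.9 and 100 < area < 10000:
        # Copy only this bounding box onto an otherwise blank canvas, then invert it.
        base[y:y+h, x:x+w] = thresh[y:y+h, x:x+w]
        segment = cv2.bitwise_not(base)
        # strip() drops the trailing whitespace that Tesseract appends to its output.
        letter = pytesseract.image_to_string(segment, config=custom_config).strip()
        print(letter)
        detected = detected + letter
        cv2.imshow("segment", segment)
        cv2.waitKey(0)

print("detected: " + detected)
cv2.imshow("img_contour", img_contour)
cv2.waitKey(0)
cv2.destroyAllWindows()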
The result
U
O
L
C
D
detected: UOLCD
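The question also asked for the coordinates of each letter. The loop above already computes them with cv2.boundingRect but does not keep them. Below is a minimal sketch of one way to return each letter together with its bounding box. It is not part of the original answer: the names LetterBox, is_letter_candidate, sort_left_to_right and detect_letters are hypothetical, the area and ratio limits are copied from the code above, and the left-to-right ordering is only an assumption that may not suit text laid out in an odd shape.

```python
from typing import List, NamedTuple


class LetterBox(NamedTuple):
    """Hypothetical record: one recognized letter and its bounding box."""
    letter: str
    x: int
    y: int
    w: int
    h: int


def is_letter_candidate(w: int, h: int, area: float,
                        min_ratio: float = 0.9,
                        min_area: float = 100,
                        max_area: float = 10000) -> bool:
    """Same filter as the loop above: height/width ratio and contour area."""
    return w > 0 and h / w > min_ratio and min_area < area < max_area


def sort_left_to_right(boxes: List[LetterBox]) -> List[LetterBox]:
    """Order boxes by left edge, then top edge (an assumed reading order)."""
    return sorted(boxes, key=lambda b: (b.x, b.y))


def detect_letters(thresh) -> List[LetterBox]:
    """OCR each candidate contour of an inverted binary image (uint8 array)."""
    # Imported here so the helpers above can be used without OpenCV or
    # Tesseract installed.
    import cv2
    import numpy as np
    import pytesseract

    config = (r'-l eng --oem 3 --psm 10 '
              r'-c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ"')
    items = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = items[0] if len(items) == 2 else items[1]

    boxes = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if not is_letter_candidate(w, h, cv2.contourArea(contour)):
            continue
        # Isolate this letter on a blank canvas and invert it for Tesseract.
        base = np.zeros(thresh.shape, dtype=np.uint8)
        base[y:y+h, x:x+w] = thresh[y:y+h, x:x+w]
        segment = cv2.bitwise_not(base)
        letter = pytesseract.image_to_string(segment, config=config).strip()
        if letter:  # skip segments where nothing was recognized
            boxes.append(LetterBox(letter, x, y, w, h))
    return boxes
```

With thresh computed as in the code above, sort_left_to_right(detect_letters(thresh)) would give a list of LetterBox tuples whose x, y, w and h fields can be used to draw or crop each letter.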
Answered By - us2018