Issue
I have an image from which I want to extract text.
I am using the following code to extract the text.
pytesseract.image_to_string(text_image, config='-l eng --psm 7')
However, the output is wrong 80% of the time; it returns strings such as "mE Smart Meter Gateway" or "RTE Smart Meter Gateway". The problem is mainly in the detection of the first two characters. I am using Python 3. Any help in improving the text detection would be appreciated.
Solution
After applying adaptive thresholding, I was able to read the text. First, blur the image.
blurred = cv2.GaussianBlur(text_image, (7, 7), 0)
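Then apply adaptive thresholding. Note that cv2.adaptiveThreshold expects a single-channel 8-bit image, so convert a color image to grayscale first.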
thresh = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 4)
Finally, extract the text.
text = pytesseract.image_to_string(thresh, config='-l eng --psm 7')
Answered By - Daud Khan