Issue
I am on Windows 10, and I am trying to extract digits from this image with the pytesseract library, using the language lets (cf. https://github.com/adrianlazaro8/Tesseract_sevenSegmentsLetsGoDigital) or LetsGoDigital (cf. https://github.com/arturaugusto/display_ocr).
I preprocessed my image (grey, threshold and erosion) to get:
But the output of pytesseract.image_to_string(img, lang='lets') is empty.
Solution
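You didn't set a page segmentation mode, so Tesseract falls back to its default (--psm 3, fully automatic page segmentation). I'd opt for --psm 6 here, which the Tesseract documentation describes as:
Assume a single uniform block of text.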
So, even without further pre-processing I get the proper result:
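import cv2
import pytesseract

# Load the original image, without any pre-processing
img = cv2.imread('RcVbM.jpg')

# --psm 6: treat the image as a single uniform block of text
text = pytesseract.image_to_string(img, lang='lets', config='--psm 6')

# Strip the newline and form feed that Tesseract appends to its output
print(text.replace('\n', '').replace('\f', ''))
# 004200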
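
If --psm 6 does not help on a different display image, one option is to compare a few page segmentation modes side by side. The following is only a minimal sketch: the helper names (clean_ocr_text, try_psm_modes, ocr_file_with_modes) are made up for illustration and are not part of pytesseract, and the chosen modes (6 = single uniform block, 7 = single text line, 8 = single word, 13 = raw line) are just a reasonable starting set for short digit strings.

```python
def clean_ocr_text(text):
    """Remove the newlines and form feed that Tesseract appends to its output."""
    return text.replace('\n', '').replace('\f', '').strip()


def try_psm_modes(ocr, modes=(6, 7, 8, 13)):
    """Call ocr(config) once per mode and return {mode: cleaned text}.

    `ocr` is any callable that takes a Tesseract config string, so the
    loop itself does not depend on pytesseract being installed.
    """
    return {psm: clean_ocr_text(ocr(f'--psm {psm}')) for psm in modes}


def ocr_file_with_modes(path, lang='lets', modes=(6, 7, 8, 13)):
    """Hypothetical convenience wrapper: run pytesseract on one image file."""
    # Imported lazily so the helpers above work without OpenCV/Tesseract.
    import cv2
    import pytesseract

    img = cv2.imread(path)
    if img is None:  # cv2.imread returns None instead of raising on a bad path
        raise FileNotFoundError(path)
    return try_psm_modes(
        lambda config: pytesseract.image_to_string(img, lang=lang, config=config),
        modes,
    )
```

Calling ocr_file_with_modes('RcVbM.jpg') would return a dictionary keyed by mode number. No output is shown here, because the results depend on the installed Tesseract build and trained data.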
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.1
OpenCV: 4.5.2
pytesseract: 5.0.0-alpha.20201127 (Tesseract engine version)
----------------------------------------
Answered By - HansHirse