Issue
I have 3 pictures of a digital clock:
Task: To somehow recognize the numbers in these photos using pytesseract
.
I have already tried in the usual way, but Tesseract was unable to determine anything.
Solution
EDIT: Improved the pre-processing to properly detect the two new input examples:
import cv2
import pytesseract
def detect_7_segments(image):
config = '--psm 6 -c tessedit_char_whitelist="0123456789 "'
return pytesseract.image_to_string(image, lang='lets', config=config)
for img_file in ['q0Y40.png', '9dgOm.png', 'XQAfP.png', 'k7BmR.png']:
img = cv2.imread(img_file, cv2.IMREAD_GRAYSCALE)
img = cv2.blur(img, (5, 5))
img = cv2.threshold(img, 64, 255, cv2.THRESH_BINARY)[1]
print(img_file, detect_7_segments(img).replace('\f', '').replace('\n', ''))
# q0Y40.png 0
# 9dgOm.png
# XQAfP.png 67
# k7BmR.png 85
Check the LetsGoDigital font, cf. this question, and my earlier answer for further details.
import cv2
import pytesseract
def detect_7_segments(image):
return pytesseract.image_to_string(image, lang='lets', config='--psm 6')
for img_file in ['XQAfP.png', 'k7BmR.png']:
img = cv2.imread(img_file, cv2.IMREAD_GRAYSCALE)
print(img_file, detect_7_segments(img).replace('\f', ''))
# XQAfP.png 67
#
# k7BmR.png 85
#
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.2
OpenCV: 4.5.2
pytesseract: 5.0.0-alpha.20201127
----------------------------------------
Answered By - HansHirse
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.