Issue
I am trying to classify images based on their content. For example, I have many images like the ones below, each containing some content, in this case a numeric value. I tried the OpenCV and Pytesseract OCR solution proposed here: https://stackoverflow.com/a/60161328/7250310
However, that solution doesn't work on my images; the content isn't detected. My sample images are below:
Do you have any other ideas for achieving this? Basically, image 1 should give the output 1, and so on.
Solution
This simple approach works at least for the four presented images:
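import cv2
import pytesseract

images = ['4sXGS.jpg', 'Nizki.jpg', 'T0EM8.jpg', 'g2fY7.jpg']

for path in images:
    # Read the image as grayscale
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Inverse binary threshold using Otsu's method
    img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)[1]
    # OCR in single-character mode (--psm 10)
    text = pytesseract.image_to_string(img, config='--psm 10')
    # Strip the newline and form feed characters from the OCR result
    text = text.replace('\n', '').replace('\f', '')
    print(text)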
Output:
1
2
3
4
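To make the thresholding call in the code above less opaque, here is a rough pure-Python sketch of what Otsu's method combined with an inverse binary threshold does on a flat list of 8-bit gray values. The function names otsu_threshold and binary_inv are made up for illustration; OpenCV's own implementation is optimized and may differ in details such as tie-breaking, so treat this as an explanation rather than a replacement.

```python
def otsu_threshold(pixels):
    """Pick the threshold that maximizes the between-class variance (Otsu)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no pixels in the dark class yet
        if w1 == 0:
            break     # no pixels left for the bright class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


def binary_inv(pixels, t):
    """Inverse binary threshold: values above t become 0, the rest 255."""
    return [0 if p > t else 255 for p in pixels]


# Dark strokes (around 10) on a bright background (around 200)
pixels = [10, 12, 11, 200, 205, 198]
t = otsu_threshold(pixels)
print(binary_inv(pixels, t))
```

After the inversion, the dark digit strokes end up white (255) on a black (0) background.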
The individual steps are:
- Read the image as grayscale.
- Apply an inverse binary threshold to the image using Otsu's method.
- Run pytesseract with the --psm 10 option (treat the image as a single character). Maybe also add the described whitelisting to restrict recognition to digits.
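As a sketch of the optional whitelisting, the steps could be wrapped up as follows. This assumes Tesseract's tessedit_char_whitelist config variable, whose effect depends on the Tesseract version and OCR engine in use; the helper names build_config, clean and read_digit are hypothetical and not part of the original answer.

```python
def build_config(psm=10, whitelist="0123456789"):
    """Build a Tesseract config string: page segmentation mode plus an
    optional character whitelist (assumed to be honored by the installed
    Tesseract version)."""
    config = f"--psm {psm}"
    if whitelist:
        config += f" -c tessedit_char_whitelist={whitelist}"
    return config


def clean(text):
    """Remove the newline and form feed characters from the OCR output."""
    return text.replace("\n", "").replace("\f", "").strip()


def read_digit(path):
    """Run the three steps on one image file and return the recognized text."""
    # Imported lazily so the helpers above work without OpenCV/Tesseract.
    import cv2
    import pytesseract

    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        # cv2.imread returns None instead of raising on unreadable files
        raise FileNotFoundError(path)
    img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)[1]
    return clean(pytesseract.image_to_string(img, config=build_config()))
```

Calling read_digit('4sXGS.jpg') would then replace the loop body above. Whether the whitelist actually improves results on these images is untested here.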
Caveat: I use a special version of Tesseract from the Mannheim University Library.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.1
OpenCV: 4.5.2
pytesseract: 5.0.0-alpha.20201127
----------------------------------------
Answered By - HansHirse