Issue
I tried to extract text from this image using Tesseract.
The code that I tried:
import cv2
import numpy as np
import pytesseract
from PIL import Image
img = Image.open('downloadedpng.jpeg').convert('L')
ret,img = cv2.threshold(np.array(img), 125, 255, cv2.THRESH_BINARY)
img = Image.fromarray(img.astype(np.uint8))
print(pytesseract.image_to_string(img))
The output that I got:
re vie
I also tried erosion and dilation with the code below:
img_erosion = cv2.erode(img, kernel, iterations=1)
img_dilation = cv2.dilate(img, kernel, iterations=1)
but I got errors. Any idea how to properly convert the image to a string?
Solution
Tesseract gives the best results on black text over a white background, so invert the image before thresholding:
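import cv2
import pytesseract
from PIL import Image

# Load the image and convert it to grayscale
img = cv2.imread('1.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.bitwise_not(img)  # invert so the text ends up dark on a light background
# Binarize: pixels above 100 become white (255), the rest black (0)
ret, thresh = cv2.threshold(img, 100, 255, cv2.THRESH_BINARY)
im = Image.fromarray(thresh.astype("uint8"))
print(pytesseract.image_to_string(im))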
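If you don't know in advance whether an image is light-on-dark or dark-on-light, you could decide whether to invert by looking at the background brightness. The following is only a rough sketch, not part of the original answer: the helper name to_black_on_white is made up for illustration, and it assumes the text covers less than half of the image, so the median pixel belongs to the background. It uses plain NumPy operations that have the same effect as cv2.bitwise_not and cv2.threshold with cv2.THRESH_BINARY on a uint8 grayscale array.

```python
import numpy as np


def to_black_on_white(gray, thresh=128):
    """Return a binary uint8 image with dark text on a light background.

    `gray` is a 2-D uint8 grayscale array. The name and the default
    threshold are illustrative choices, not taken from any library.
    """
    gray = np.asarray(gray, dtype=np.uint8)
    # Assumption: text covers less than half of the image, so the median
    # pixel value is the background. A dark background means we invert.
    if np.median(gray) < 128:
        gray = 255 - gray  # same effect as cv2.bitwise_not on uint8 data
    # Pixels strictly above the threshold become white, the rest black.
    return np.where(gray > thresh, 255, 0).astype(np.uint8)


# Synthetic example: a dark image with one bright "stroke" in the middle row.
sample = np.full((3, 5), 20, dtype=np.uint8)
sample[1, 1:4] = 230
result = to_black_on_white(sample)
```

The resulting array can be passed to Image.fromarray and then to pytesseract.image_to_string in the same way as thresh in the answer above. The threshold value probably still needs tuning per image.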
Answered By - K41F4r