Issue
I am trying to read a captcha using the pytesseract module. It gives accurate text most of the time, but not always.
This is the code that reads the image, manipulates it, and extracts the text:
import cv2
import numpy as np
import pytesseract

def read_captcha():
    # opencv loads the image in BGR, convert it to RGB
    img = cv2.cvtColor(cv2.imread('captcha.png'), cv2.COLOR_BGR2RGB)
    lower_white = np.array([200, 200, 200], dtype=np.uint8)
    upper_white = np.array([255, 255, 255], dtype=np.uint8)
    mask = cv2.inRange(img, lower_white, upper_white)  # could also use threshold
    # "erase" the small white points in the resulting mask
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))
    mask = cv2.bitwise_not(mask)  # invert mask
    # load background (could be an image too)
    bk = np.full(img.shape, 255, dtype=np.uint8)  # white bk
    # get masked foreground
    fg_masked = cv2.bitwise_and(img, img, mask=mask)
    # get masked background, mask must be inverted
    mask = cv2.bitwise_not(mask)
    bk_masked = cv2.bitwise_and(bk, bk, mask=mask)
    # combine masked foreground and masked background
    final = cv2.bitwise_or(fg_masked, bk_masked)
    mask = cv2.bitwise_not(mask)  # revert mask to original
    # resize the image
    img = cv2.resize(mask, (0, 0), fx=3, fy=3)
    cv2.imwrite('ocr.png', img)
    text = pytesseract.image_to_string(cv2.imread('ocr.png'), lang='eng')
    return text
For the image manipulation, I got help from this Stack Overflow post.
This is the original captcha image:
And this is the image generated after the manipulation:
But pytesseract returns the text AX#7rL.
Can anyone guide me on how to improve the success rate to 100% here?
Solution
Since there are tiny holes in your resulting image, a morphological transformation, specifically cv2.MORPH_CLOSE, should work here: it closes the holes and smooths the image.
Threshold to obtain a binary image (black and white).
Perform a morphological closing to fill small holes in the foreground.
Invert the image to get the result.
Result from Tesseract: 4X#7rL
Applying cv2.GaussianBlur() before passing the image to Tesseract could potentially help too.
import cv2
import pytesseract
# Path for Windows
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Read in image as grayscale
image = cv2.imread('1.png',0)
# Threshold to obtain binary image
thresh = cv2.threshold(image, 220, 255, cv2.THRESH_BINARY)[1]
# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# Invert image to use for Tesseract
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)
# Throw image into tesseract
print(pytesseract.image_to_string(result))
cv2.waitKey()
Answered By - nathancy