Issue
I'm trying to extract the numbers from a .png file using OpenCV and then the image_to_string()
method from pytesseract, but the output is not good.
I tried some preprocessing methods such as resizing and noise filters, but I still can't get accurate results. How can I handle this?
Solution
Here's a simple preprocessing approach to clean up the image before passing it to pytesseract:
- Convert image to grayscale
- Sharpen the image
- Perform morphological transformations to enhance text
Since your input image looks blurry, we can sharpen it using cv2.filter2D()
and a generic sharpening kernel. Other types of kernels can also be used.
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)
The text has small holes, so we can use cv2.dilate()
to close them and smooth the image. The image is inverted first so that the text is white when it is dilated, then inverted back afterward.
sharpen = 255 - sharpen
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
dilate = cv2.dilate(sharpen, kernel, iterations=1)
result = 255 - dilate
Here's the result. You can try using either just the sharpened image (sharpen.png) or the enhanced image (result.png) with pytesseract. Full code:
import cv2
import numpy as np

# Load the image and convert it to grayscale
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Sharpen with a generic 3x3 sharpening kernel
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)
cv2.imwrite('sharpen.png', sharpen)

# Invert, dilate to close small holes in the text, then invert back
sharpen = 255 - sharpen
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
dilate = cv2.dilate(sharpen, kernel, iterations=1)
result = 255 - dilate
cv2.imwrite('result.png', result)
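To pass either image to pytesseract, something like the following may work. This is a sketch, not part of the answer above: `ocr_digits` is a hypothetical helper name, and the `--psm 6` and `tessedit_char_whitelist` settings are assumptions about what suits this image. Whitelist support depends on the Tesseract version, so the regex acts as a fallback filter.

```python
import re


def ocr_digits(image, psm=6):
    """Run Tesseract on a preprocessed image and return the digit groups found.

    Hypothetical helper for illustration; `image` can be a NumPy array
    as returned by cv2.imread().
    """
    # Imported here so the preprocessing code still runs without Tesseract installed
    import pytesseract

    # --psm sets the page segmentation mode (6 = a single uniform block of text);
    # the whitelist asks Tesseract to consider digits only.
    config = f"--psm {psm} -c tessedit_char_whitelist=0123456789"
    text = pytesseract.image_to_string(image, config=config)

    # Keep only runs of digits, dropping whitespace and stray characters
    return re.findall(r"\d+", text)
```

For example, `ocr_digits(cv2.imread('result.png'))` and `ocr_digits(cv2.imread('sharpen.png'))` can be compared to see which preprocessing works better. Other `--psm` values, such as 7 for a single line of text, are also worth trying.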
Answered By - nathancy