Issue
I used pytesseract and OpenCV to read text from an image. I applied a median blur, normalization, and thresholding to remove the background, and was then able to read the text.
However, some parts of the text became too light during normalization, and I want to darken them so that they match the intensity of the remaining text in the image. I tried morphological transformations and Canny edge detection followed by erosion to remove noise, but neither helped.
My input looks like this:
Here, "Code", "Division Name", "15" and "Mechanical" are lighter and I am unable to read them, whereas I can easily read "Air Distribution" and "Basic materials & methods".
Any help on how to darken the lighter text would be appreciated.
Solution
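You can raise the threshold so that the lighter text is also mapped to black, then invert the image and apply erosion to the resulting white-text-on-black-background image. With a threshold of 240, every pixel that is not close to white becomes black, so the faint text ends up as dark as the rest.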
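import cv2

# Load the image and convert it to grayscale.
image = cv2.imread("1.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Smooth lightly, then use a high threshold so that even the faint text
# falls below it and becomes black.
blur = cv2.blur(gray, (3, 3))
_, thresh = cv2.threshold(blur, 240, 255, cv2.THRESH_BINARY)
cv2.imshow("thresh", thresh)

# Invert to white text on a black background, then erode to thin the strokes.
thresh = cv2.bitwise_not(thresh)
element = cv2.getStructuringElement(shape=cv2.MORPH_RECT, ksize=(5, 5))
# The third positional parameter of cv2.erode is dst, so the iteration count
# must be passed by keyword. Increase it if the strokes are still too thick.
erode = cv2.erode(thresh, element, iterations=1)
cv2.imshow("erode", erode)
cv2.imshow("img", image)
cv2.waitKey(0)
cv2.destroyAllWindows()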
Answered By - Ha Bom