Issue
I've been trying this for some days without success, so I decided to ask for some help :)
I'm quite new to cv2 and tesseract, and I'm trying to do something that I thought was easy, but for some reason it is not as easy as I was expecting.
This image is a print screen of multiple values that I have to read and convert to text/int, and this one is the original. I can isolate each value from the multiple images that I have, but when I try to convert them, I can't. Sometimes it gives me the right value, but 90% of the time it misses.
Here is what I do:
import cv2
import pytesseract
# open the image
image = cv2.imread('image.png')
# resize it to give some help (this way I was able to get some good results)
image2 = cv2.resize(image, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
image2 = cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY)
# get the parts of the image with the text that I want to convert
name = image2[0:100, 100:500]
stats1 = image2[110:160, 460:580]
stats2 = image2[160:210, 460:580]
stats3 = image2[220:270, 460:580]
# use pytesseract to convert from image to string
name_str = pytesseract.image_to_string(name, lang='eng', config='--psm 6')
stats1_str = pytesseract.image_to_string(stats1, lang='eng', config='--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789.%')
# print the values
print('name', name_str)
print('grass', stats1_str)
Meanwhile, I also tried different approaches with thresholding and inverting the image colors, as well as some dilate and erode, but without success:
# requires: import numpy as np
# the kernel has to be defined before it is used
kernel = np.ones((1, 1), np.uint8)
image2 = cv2.threshold(image2, 1, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
image2 = 255 - cv2.morphologyEx(image2, cv2.MORPH_CLOSE, kernel, iterations=1)
image2 = cv2.dilate(image2, kernel, iterations=1)
image2 = cv2.erode(image2, kernel, iterations=1)
I'm just praying, and wishing that someone can help me :) Thank you for your time
Solution
To me the percentages looked the most difficult to recognize, so this answer focuses on those. For this image, I think you probably want different solutions for different regions of the image.
The percentages are difficult because the black border around them makes the numbers too thick after thresholding. I used cv2.inRange to keep only the areas of the whole image that are green. If you have other images with different colored percentages, this method will not work. It's also worth pointing out that OpenCV images are BGR by default, so the second number in each bound is the green component.
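To make the channel-order point concrete, here is a small pure-Python sketch of the per-pixel test that cv2.inRange applies with inclusive bounds. The helper name in_range_bgr is hypothetical and only for illustration; it is not OpenCV's implementation, just the same comparison written out for a single (B, G, R) pixel.

```python
def in_range_bgr(pixel, lower, upper):
    """Return 255 if every channel of a (B, G, R) pixel lies within the
    inclusive bounds, else 0. Illustrative stand-in for one mask pixel."""
    inside = all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))
    return 255 if inside else 0

# Bounds in B, G, R order: the middle value is the green channel.
lower = (0, 219, 0)
upper = (5, 255, 5)

print(in_range_bgr((0, 240, 0), lower, upper))  # bright green pixel: kept
print(in_range_bgr((0, 0, 240), lower, upper))  # red pixel (BGR order): dropped
print(in_range_bgr((0, 0, 0), lower, upper))    # black border pixel: dropped
```

Because the black border fails the green-channel test, it never reaches the thresholding step, which is why the digits stay thin.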
import cv2
import pytesseract
# Windows install path of the Tesseract binary; adjust for your system
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
image = cv2.imread('plagued_void.png')
# upscale so the characters are large enough for Tesseract
image = cv2.resize(image, None, fx=5, fy=5, interpolation=cv2.INTER_CUBIC)
# keep only (almost) pure green pixels; bounds are in BGR order
mask = cv2.inRange(image, (0, 219, 0), (5, 255, 5))
image = cv2.bitwise_and(image, image, mask=mask)
cv2.imshow('green', image)
cv2.waitKey(0)
# convert to grayscale, then to dark text on a white background
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
image = cv2.threshold(image, 1, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cv2.imshow('threshold', image)
cv2.waitKey(0)
config = '--oem 3 --psm 6'
txt = pytesseract.image_to_string(image, config=config, lang='eng')
print(txt)
# Tesseract misreads % as a right double quote or as Y, so swap those back
txt = txt.replace('\u201D', '%').replace('Y', '%')
print(txt)
Here's what I get after the first print statement:
+1.20”.
+1.18Y
+1.28Y.
Tesseract has a tough time with the % character, and whitelisting it doesn't seem to help. So I do a straight-up replace of ” and Y with %, and the second print statement gives:
+1.20%.
+1.18%
+1.28%.
There are still trailing periods at the end, which you'll have to get rid of.
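One possible way to handle that cleanup is sketched below. The helper names clean_percentage and parse_percentage are made up for this example, and the sample strings are the three lines of OCR output shown above; other images may produce different misreads, so treat the replacement rules as assumptions rather than a general fix.

```python
import re

def clean_percentage(raw):
    """Normalize one OCR'd line such as '+1.28Y.' to '+1.28%'.
    Hypothetical helper; the replacement rules match the misreads seen above."""
    text = raw.strip()
    # Map the characters Tesseract produced instead of '%'
    text = text.replace('\u201D', '%').replace('Y', '%')
    # Drop any trailing periods left at the end of the line
    return text.rstrip('.')

def parse_percentage(raw):
    """Return the numeric value as a float, or None if the line does not
    look like a signed percentage after cleaning."""
    match = re.fullmatch(r'([+-]?\d+(?:\.\d+)?)%', clean_percentage(raw))
    return float(match.group(1)) if match else None

lines = ['+1.20\u201D.', '+1.18Y', '+1.28Y.']
cleaned = [clean_percentage(line) for line in lines]
values = [parse_percentage(line) for line in lines]
print(cleaned)
print(values)
```

Returning None for lines that do not match makes a bad read easy to spot, instead of silently passing a wrong number along.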
Answered By - bfris