Issue
I'm trying to use this tutorial to have PyTesseract OCR my desktop. The tutorial script works when I run it on its own, as you can see in this image:
The code from the tutorial:
from pytesseract import Output
import pytesseract
import argparse
import cv2

# Construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
# '--image' refers to the path of the input image that will be OCR'd
ap.add_argument("-i", "--image", required=True, help="path to input image to be OCR'd")
# '--min-conf' sets a minimum confidence to filter weak detections
ap.add_argument("-c", "--min-conf", type=int, default=0, help="min conf value to filter weak text detection")
args = vars(ap.parse_args())

# Load the input image, convert from BGR to RGB channel ordering, and
# use Tesseract to localize each area of text in the input image
image = cv2.imread(args["image"])
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# 'image_to_data' detects and localizes text
results = pytesseract.image_to_data(rgb, output_type=Output.DICT)

# Loop over each individual text localization
for i in range(0, len(results["text"])):
    # Extract the bounding box coordinates of the text region from the current result
    x = results["left"][i]
    y = results["top"][i]
    w = results["width"][i]
    h = results["height"][i]

    # Extract the OCR text itself along with the confidence of the text localization
    text = results["text"][i]
    print(results["conf"][i])
    conf = int(results["conf"][i])

    # Filter out weak-confidence text localizations
    if conf > args["min_conf"]:
        # Display the confidence and text in the terminal
        print("Confidence: {}".format(conf))
        print("Text: {}".format(text))
        print("")

        # Remove non-ASCII text so we can draw the text on the image using OpenCV,
        # then draw a bounding box around the text along with the text itself
        text = "".join([c if ord(c) < 128 else "" for c in text]).strip()
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(image, text, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 0, 255), 3)

# Show the output image
cv2.imshow("Image", image)
cv2.waitKey(0)  # wait for a key press before continuing
However, it doesn't work when I try to integrate it into another project. Here's my code:
import cv2
import pyautogui
import pytesseract
from pytesseract import Output

# Take a screenshot of the desktop and save it to disk
screenshotOfDesktop = pyautogui.screenshot('screenshotOfDesktop.png')

# Have Tesseract read it
readDesktop_SAP = cv2.imread('screenshotOfDesktop.png')
rgb = cv2.cvtColor(readDesktop_SAP, cv2.COLOR_BGR2RGB)
# "config='--psm 7'" makes PyTesseract read everything as a single line of text
results = pytesseract.image_to_data(rgb, config='--psm 7', output_type=Output.DICT)
print(results)

# Iterating through the list of results
for i in range(0, len(results["text"])):
    if "Description" not in results["text"]:
        print("Didn't find description on screen. Please check that the SAP 'find document' page is open on the screen. ")
        input('Press ENTER to exit now. ')
        exit()
    if "Description" in results["text"]:
        print("Found 'Description' on screen! ")
        # Gating by confidence
        conf = int(results["conf"][i])
        if conf < 0.2:
            print("Confidence is less than 0.2. Moving on. ")
            continue
        elif conf >= 0.2:
            # Getting the coordinates of the result
            Desc_x = results["left"][i]
            Desc_y = results["top"][i]
            Desc_w = results["width"][i]
            Desc_h = results["height"][i]
            # Printing everything
            print("The coordinates are: ")
            print(Desc_x, Desc_y, Desc_w, Desc_h)
            print(f"Confidence = {conf}")
Instead, my code only prints this for the "results" dictionary:
{'level': [1, 2, 3, 4, 5, 5], 'page_num': [1, 1, 1, 1, 1, 1], 'block_num': [0, 1, 1, 1, 1, 1], 'par_num': [0, 0, 1, 1, 1, 1], 'line_num': [0, 0, 0, 1, 1, 1], 'word_num': [0, 0, 0, 0, 1, 2], 'left': [0, 0, 0, 0, 0, 1451], 'top': [0, 4, 4, 4, 4, 145], 'width': [1920, 1727, 1912, 1727, 891, 276], 'height': [1080, 1061, 1070, 1061, 1061, 8], 'conf': ['-1', '-1', '-1', '-1', 11, 0], 'text': ['', '', '', '', 'fe', '~']}
Does anyone have any idea why that might be? I understand I'm not using argparse like the tutorial's author is, but the result should be the same, no? I also checked that it was looking at the right screenshot.
Relevant information:
- Tesseract v4.1.0.20190314
- Python 3.9.2
Solution
I failed to notice that the tutorial code converted the image to grayscale before running PyTesseract on it. Once I added a grayscale conversion to my own code, it was able to find the text.
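The answer doesn't include the changed code, so what follows is only a rough sketch of what adding a grayscale step could look like, not the exact fix that was used. It assumes the conversion is done with OpenCV's cv2.COLOR_BGR2GRAY and that Tesseract is left on its default page segmentation mode unless a config string is passed in. The function names ocr_grayscale and find_word, and the min_conf parameter, are made up for illustration. Note that Tesseract reports word confidence on a 0 to 100 scale and uses -1 for rows that aren't words.

```python
def find_word(results, target, min_conf=0):
    """Return (left, top, width, height, conf) for the first match of
    `target` whose confidence is above `min_conf`, or None if there is none.

    `results` is the dictionary returned by pytesseract.image_to_data with
    output_type=Output.DICT. `min_conf` is on Tesseract's 0-100 scale.
    """
    for i, text in enumerate(results["text"]):
        # conf can arrive as an int, a float, or a string such as '-1'
        conf = float(results["conf"][i])
        if text.strip() == target and conf > min_conf:
            return (results["left"][i], results["top"][i],
                    results["width"][i], results["height"][i], conf)
    return None


def ocr_grayscale(image_path, config=""):
    """Load an image, convert it to grayscale, and run Tesseract on it."""
    # Imported here (an illustrative choice) so that find_word above can be
    # used and tested without OpenCV or Tesseract installed.
    import cv2
    import pytesseract
    from pytesseract import Output

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # The grayscale step described in the answer (assumed to be BGR -> GRAY)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return pytesseract.image_to_data(gray, config=config, output_type=Output.DICT)


# Example usage (requires OpenCV, pytesseract, and the Tesseract binary):
# results = ocr_grayscale("screenshotOfDesktop.png")
# match = find_word(results, "Description", min_conf=20)
# print(match if match else "Didn't find 'Description' on screen.")
```

Searching once with a helper like find_word, rather than testing "Description" in results["text"] on every pass through the loop, also keeps the coordinates and the confidence tied to the entry that actually matched.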
Answered By - MontX