Sunday, January 23, 2022

[FIXED] How to recognize deformed text under a bigger object using pytesseract and opencv-python in Python?

Labels: ocr, opencv, opencv-python, python-tesseract, tesseract

Issue

I am using pytesseract to recognize text as follows:

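import pytesseract
from pytesseract import Output

# img is the page image, already loaded (for example with cv2.imread)
td = pytesseract.image_to_data(img, output_type=Output.DICT)
tn_boxes = len(td['level'])  # one entry per detected element
for o in range(tn_boxes):
    # print the text recognized for each element (may be an empty string)
    text = td['text'][o]
    print(text)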

I am building an index of examples using simple logic: detect the keyword 'Example no.', find its end-point keyword 'Sol.', put the piece of the image from 'Example no.' to 'Sol.' into the index, then find the next example, and so on.
But when I try the following image, the output is: SET THEORY ae . . 5 (6) Let A = {x: x isa negative odd integer} = {-1,-3,-5,-7,...etc
Notice that it does not recognize the first line, Sol. (a) Let A={x:x is a natural number..etc.
When I try the following image, which has no horizontal line, it works fine.
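
The indexing logic described above can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the function name find_example_spans is hypothetical, the input is assumed to have the shape of the dictionary returned by image_to_data with Output.DICT (parallel lists under 'text', 'top' and 'height'), and each span is assumed to end at the top edge of the 'Sol.' word.

```python
from typing import Dict, List, Tuple


def find_example_spans(td: Dict[str, list],
                       start_kw: str = "Example",
                       end_kw: str = "Sol.") -> List[Tuple[int, int]]:
    """Return (top, bottom) pixel rows for each start_kw ... end_kw pair.

    `td` is assumed to look like pytesseract's Output.DICT result.
    """
    spans = []
    start_top = None  # row where the currently open example begins
    for text, top, height in zip(td["text"], td["top"], td["height"]):
        word = text.strip()
        if start_top is None and word.startswith(start_kw):
            # Opening keyword found: remember where the example starts
            start_top = top
        elif start_top is not None and word.startswith(end_kw):
            # Closing keyword found: the span ends where 'Sol.' begins
            spans.append((start_top, top))
            start_top = None
    # An example with no closing keyword is dropped
    return spans
```

Each span could then be cut out of the page with a slice such as img[top:bottom, :]. If OCR misses either keyword, as in the failing image, the corresponding example is silently skipped, which is why the recognition problem matters for the index.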

Is there any way to configure pytesseract to recognize text that has a line above it?

Edited:

Sometimes, when an image or some larger text is placed above a piece of text, pytesseract fails to detect the text below that bigger object.

Is there any solution for this kind of problem? Maybe there is a way to configure the minimum detection size, or to detect text of every size, even under bigger objects?
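
One configuration knob that does exist is Tesseract's page segmentation mode, which pytesseract accepts through its config argument (for example "--psm 6" for a single uniform block of text, or "--psm 11" for sparse text). The sketch below tries several modes and reports the first one whose output contains a keyword. The helper names are hypothetical, and nothing in the source confirms that any particular mode fixes these images, so treat it as an experiment rather than a solution.

```python
from typing import Callable, Iterable, Optional


def first_psm_with_keyword(ocr: Callable[[str], str],
                           keyword: str,
                           psm_modes: Iterable[int] = (3, 6, 11)) -> Optional[int]:
    """Return the first page segmentation mode whose OCR text contains keyword.

    `ocr` is any callable that takes a Tesseract config string and returns text.
    """
    for psm in psm_modes:
        if keyword in ocr(f"--psm {psm}"):
            return psm
    return None  # no mode in psm_modes found the keyword


def make_tesseract_ocr(img) -> Callable[[str], str]:
    """Wrap pytesseract.image_to_string for one image (hypothetical helper)."""
    # Imported lazily so the search helper above works without Tesseract installed
    import pytesseract
    return lambda config: pytesseract.image_to_string(img, config=config)
```

With an image loaded, first_psm_with_keyword(make_tesseract_ocr(img), "Example") would return the first mode that recovers the keyword, or None if none of the listed modes does.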

For example, the output is: usually denoted by o(G). ors a a {= 7 Wave =e () oe that the set of ae | group usual ition of integers.
Notice that it does not detect the keyword Example 1. in the following image.

But when I try the following image, the output is: usually denoted by o(G). Example 1. (2) Prove that th . group under usual addition of integers, and now the keyword Example 1. is detected.

Solution

Read, for example, "image processing to improve tesseract OCR accuracy", and read the Tesseract docs.

Answered By - user898678

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
