Issue
I'm using pytesseract 0.3.10 with tesseract 5.3.0. I want to take a look at how tesseract processed my images. I tried setting tessedit_write_images to true via:
import pytesseract as pt
pt.image_to_string(crop_img, lang='eng+deu+fra+spa', config="--psm 6 -c tessedit_write_images=1")
But this is not working. The tessinput.tif file is nowhere to be found.
(The --psm 6 part is working.)
I also tried tessedit_write_images=True and tessedit_write_images=T.
Using pt.run_and_get_output() is also not working.
Is it possible to set the variable tessedit_write_images to true outside my Python script?
Solution
Create a "config" text file and write into it:
tessedit_write_images true
Then run it from the command line: tesseract Text.png out.txt config
This gives you a text file and a .tif image. If you rename config to config.txt, it also works from Python via subprocess:
import subprocess
process = subprocess.run(["tesseract", "Text.png", "out.txt", "config.txt"], shell=False, stdout=subprocess.PIPE)
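The two steps above (writing the config file, then calling tesseract) can also be combined in one script. The following is only a sketch: the helper names write_debug_config, build_command and run_with_debug_image are made up for illustration, and "out" as the output base name is an assumption. It relies on the tesseract executable being on the PATH and keeps the --psm 6 option from the question, placed before the config file name.

```python
import subprocess
from pathlib import Path


def write_debug_config(path="config.txt"):
    """Write a Tesseract config file that enables tessedit_write_images."""
    config_path = Path(path)
    config_path.write_text("tessedit_write_images true\n", encoding="utf-8")
    return config_path


def build_command(image, output_base, config_path, psm=6):
    """Build the argument list: options first, config file name last."""
    return ["tesseract", str(image), str(output_base),
            "--psm", str(psm), str(config_path)]


def run_with_debug_image(image, output_base="out", config_path="config.txt"):
    """Create the config file and run tesseract with it (hypothetical helper)."""
    cfg = write_debug_config(config_path)
    cmd = build_command(image, output_base, cfg)
    # check=True raises CalledProcessError if tesseract exits with an error
    return subprocess.run(cmd, capture_output=True, text=True, check=True)
```

Calling run_with_debug_image("Text.png") should then behave like the command line call above. The debug image is expected in the directory the script is run from, so check the current working directory if the file seems to be missing.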
PS: I used tesseract v5.1.0.20220510 leptonica-1.78.0
Answered By - Hermann12