Issue
I have installed Anaconda 2018.12 (Python 3.7 version). I am trying to test out the pytesseract module but I keep encountering:
TesseractNotFoundError: C:\Program Files (x86)\Tesseract-OCR\tesseract.exe is not installed or it's not in your path
I have done:
- pip install Pillow (already installed it says)
- pip install pytesseract (successful)
- Tried to set the tesseract_cmd to the location of tesseract (but I can't find it)
I have searched for tesseract.exe but cannot find it anywhere on the system, so I'm struggling to understand how to reference or import the module in a Jupyter notebook if it has already been bundled into Anaconda.
The code I'm trying to run is:
from PIL import Image
import pytesseract
# Raw strings keep the Windows backslashes literal, so they are written once, not doubled
# pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
text = pytesseract.image_to_string(Image.open(r'C:\Temp\IMG_1519.jpg'))
print(text)
I'm hoping it's simple user error but any assistance would be gratefully received. Many thanks, Ben
Solution
Quoting from the PyPI page:
Python-tesseract is a wrapper for Google’s Tesseract-OCR Engine.
and (under prerequisites):
Install Google Tesseract OCR (additional info how to install the engine on Linux, Mac OSX and Windows)
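This means that pytesseract is not a standalone module. It is a Python wrapper around Google’s Tesseract-OCR Engine, which you need to install separately. pip install pytesseract only installs the wrapper, which is why tesseract.exe cannot be found anywhere on the system. Once the engine is installed, either add its folder to your PATH or point pytesseract.pytesseract.tesseract_cmd at the installed tesseract.exe.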
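After installing the engine, a quick check like the one below can confirm that Python can actually see the executable before calling image_to_string. This is only a sketch: find_tesseract, configure_pytesseract and the CANDIDATE_PATHS list are hypothetical names made up for illustration, and the two folders are assumed install locations that may differ on your machine.

```python
import os
import shutil

# Assumed install locations (hypothetical); edit to match where the
# Tesseract installer actually placed tesseract.exe on your system.
CANDIDATE_PATHS = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
]


def find_tesseract(candidates=CANDIDATE_PATHS):
    """Return a path to the tesseract executable, or None if none is found."""
    # First choice: the executable is already reachable through PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the explicit candidate locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract():
    """Point pytesseract at the engine, or fail with a readable message."""
    import pytesseract  # imported here so the lookup above works without it

    path = find_tesseract()
    if path is None:
        raise RuntimeError(
            "Tesseract-OCR engine not found: install it separately, "
            "then add it to PATH or update CANDIDATE_PATHS."
        )
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

Calling configure_pytesseract() once at the top of the notebook, before pytesseract.image_to_string(...), should replace the TesseractNotFoundError with either a working setup or a clearer message about what is missing.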
Answered By - FlyingTeller