Issue
I am trying to run the following script in a Databricks Python notebook:
pip install presidio-image-redactor
pip install pytesseract
python -m spacy download en_core_web_lg
from PIL import Image
from presidio_image_redactor import ImageRedactorEngine
import pytesseract
image = Image.open("images/ImageData.PNG")
engine = ImageRedactorEngine()
redacted_image = engine.redact(image, (255, 192, 203))
Upon running the last line, I'm getting the error below:
TesseractNotFoundError: tesseract is not installed or it's not in your PATH.
Am I missing anything?
Solution
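You can use %sh in a separate cell to execute shell commands on the driver node. The pip packages in the question only install the Python wrappers; the tesseract binary itself has to be installed on the machine. To install it, run: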
%sh apt-get -f -y install tesseract-ocr
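To confirm that the install worked before calling the redactor again, you can check whether the binary is visible on the PATH from a Python cell. This is a minimal sketch using only the standard library; the helper name find_tesseract is hypothetical and not part of pytesseract or Presidio.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the binary if it is on PATH, otherwise None.

    find_tesseract is a hypothetical helper name used for illustration.
    """
    return shutil.which(binary_name)


path = find_tesseract()
if path is None:
    # Same condition that makes pytesseract raise TesseractNotFoundError.
    print("tesseract is not on PATH; install tesseract-ocr first")
else:
    print(f"tesseract found at {path}")
```

If the binary is installed somewhere that is not on the PATH, pytesseract also lets you point at it explicitly by setting pytesseract.pytesseract.tesseract_cmd to the full path.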
If you need to install it on all nodes of the cluster, use a cluster init script with the same command (without %sh).
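As a rough sketch of what such an init script could contain, the snippet below writes the same apt-get command into a shell script file. The function name write_init_script, the file name install-tesseract.sh, and the use of a temporary directory are all assumptions made for illustration; where the script must be stored and how it is attached to the cluster depends on your workspace setup and is not shown here.

```python
import tempfile
from pathlib import Path

# Same command as in the %sh cell, without the %sh prefix.
INIT_SCRIPT = "#!/bin/bash\napt-get -f -y install tesseract-ocr\n"


def write_init_script(target_dir, filename="install-tesseract.sh"):
    """Write the init script into target_dir and return its path.

    Both the function name and the default file name are hypothetical.
    """
    script_path = Path(target_dir) / filename
    script_path.write_text(INIT_SCRIPT)
    return script_path


# A temporary directory stands in for the real storage location.
script_path = write_init_script(tempfile.mkdtemp())
print(script_path.read_text())
```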
Answered By - Alex Ott