Issue
I want to convert the C++ ResultIterator example from the tesseract-ocr documentation to Python.
Pix *image = pixRead("/usr/src/tesseract/testing/phototest.tif");
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
api->Init(NULL, "eng");
api->SetImage(image);
api->Recognize(0);
tesseract::ResultIterator* ri = api->GetIterator();
tesseract::PageIteratorLevel level = tesseract::RIL_WORD;
if (ri != 0) {
  do {
    const char* word = ri->GetUTF8Text(level);
    float conf = ri->Confidence(level);
    int x1, y1, x2, y2;
    ri->BoundingBox(level, &x1, &y1, &x2, &y2);
    printf("word: '%s'; \tconf: %.2f; BoundingBox: %d,%d,%d,%d;\n",
           word, conf, x1, y1, x2, y2);
    delete[] word;
  } while (ri->Next(level));
}
Here is what I have so far:
import ctypes
liblept = ctypes.cdll.LoadLibrary('liblept-5.dll')
pix = liblept.pixRead('11.png'.encode())
print(pix)
tesseractLib = ctypes.cdll.LoadLibrary(r'C:\Program Files\tesseract-OCR\libtesseract-4.dll')
tesseractHandle = tesseractLib.TessBaseAPICreate()
tesseractLib.TessBaseAPIInit3(tesseractHandle, '.', 'eng')
tesseractLib.TessBaseAPISetImage2(tesseractHandle, pix)
#tesseractLib.TessBaseAPIRecognize(tesseractHandle, tesseractLib.TessMonitorCreate())
I cannot convert the C++ call api->Recognize(0) to Python. My attempt is the last (commented-out) line of the code, but it is wrong. I am not experienced with C++, so I am stuck. Can anyone help with the conversion? The C API is declared in the source code: https://github.com/tesseract-ocr/tesseract/blob/420cbac876b06beeee271d9f44ba800d943a8a83/include/tesseract/capi.h
I expect to have trouble with the rest of the conversion as well. For example, I don't know how to express tesseract::RIL_WORD in Python, so a full version of the conversion would be much appreciated. Thanks!
I know the tesserocr project would spare me the conversion, but it does not provide up-to-date Python wheels for Windows, which is the main reason I am doing the conversion myself.
Solution
I think the problem is that api->Recognize() expects a pointer as its first argument. The example mistakenly passes 0 where it should pass nullptr. 0 and nullptr have the same value, but on 64-bit systems they usually do not have the same size (I assume this may not hold on some unusual non-x86 systems either).
Their example still works with a C++ compiler because the compiler knows the function expects a pointer (64 bits) and silently converts the 0.
In your example, it seems you haven't declared the exact prototype of TessBaseAPIRecognize() to ctypes, so ctypes cannot know that the function expects a pointer (64 bits). Instead, it assumes the function expects an integer (32 bits), and the call crashes.
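To make the size difference concrete, here is a small sketch that only uses the standard ctypes module and needs no Tesseract library. The variable names are illustrative. It compares the width of a C int, which is what ctypes uses for a plain Python integer when no prototype is declared, with the width of a pointer, and builds an explicit NULL pointer.

```python
import ctypes

# Without argtypes, ctypes passes a plain Python int as a C int.
int_size = ctypes.sizeof(ctypes.c_int)

# A pointer parameter needs pointer width, which is larger on 64-bit builds.
ptr_size = ctypes.sizeof(ctypes.c_void_p)

# An explicit NULL pointer with the correct width; its .value is None.
null_ptr = ctypes.c_void_p(None)

print("int bytes:", int_size)
print("pointer bytes:", ptr_size)
print("NULL pointer value:", null_ptr.value)
```

Passing null_ptr (or None, once the function's argtypes are declared) gives the callee a full-width NULL pointer instead of a narrower integer zero.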
My suggestions:
- Use ctypes.c_void_p(None) instead of 0.
- If you intend to use this in production, specify all the function prototypes to ctypes.
- Be careful with the examples you look at: those examples use the Tesseract base API (the C++ API), whereas to use libtesseract with Python + ctypes you have to use the Tesseract C API. The two APIs are very similar but may not be identical.
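Putting these suggestions together, the conversion could look roughly like the sketch below. It is not tested against your DLL build. The function signatures are my reading of the capi.h file linked in the question, and RIL_WORD = 3 is taken from the order of the TessPageIteratorLevel enum there. The helper names declare_prototypes, iter_words and run are made up for this example, and freeing the image with pixDestroy is left out for brevity.

```python
import ctypes

# tesseract::RIL_WORD in the C API enum TessPageIteratorLevel
# (RIL_BLOCK=0, RIL_PARA=1, RIL_TEXTLINE=2, RIL_WORD=3, RIL_SYMBOL=4).
RIL_WORD = 3


def declare_prototypes(tess, lept):
    """Tell ctypes the return and argument types of the C functions used below."""
    P, I = ctypes.c_void_p, ctypes.c_int
    IP = ctypes.POINTER(ctypes.c_int)
    lept.pixRead.restype, lept.pixRead.argtypes = P, [ctypes.c_char_p]
    protos = {
        "TessBaseAPICreate": (P, []),
        "TessBaseAPIInit3": (I, [P, ctypes.c_char_p, ctypes.c_char_p]),
        "TessBaseAPISetImage2": (None, [P, P]),
        "TessBaseAPIRecognize": (I, [P, P]),
        "TessBaseAPIGetIterator": (P, [P]),
        # Returned as a raw address (not c_char_p) so it can be freed later.
        "TessResultIteratorGetUTF8Text": (P, [P, I]),
        "TessResultIteratorConfidence": (ctypes.c_float, [P, I]),
        "TessResultIteratorGetPageIterator": (P, [P]),
        "TessPageIteratorBoundingBox": (I, [P, I, IP, IP, IP, IP]),
        "TessResultIteratorNext": (I, [P, I]),
        "TessDeleteText": (None, [P]),
        "TessResultIteratorDelete": (None, [P]),
        "TessBaseAPIEnd": (None, [P]),
        "TessBaseAPIDelete": (None, [P]),
    }
    for name, (restype, argtypes) in protos.items():
        func = getattr(tess, name)
        func.restype, func.argtypes = restype, argtypes


def iter_words(tess, api):
    """Yield (text, confidence, (x1, y1, x2, y2)) for each recognized word."""
    # With argtypes declared, None is passed as a full-width NULL pointer.
    if tess.TessBaseAPIRecognize(api, None) != 0:
        raise RuntimeError("TessBaseAPIRecognize failed")
    ri = tess.TessBaseAPIGetIterator(api)
    if not ri:
        return
    try:
        # The C API exposes BoundingBox on the page iterator view.
        pi = tess.TessResultIteratorGetPageIterator(ri)
        while True:
            raw = tess.TessResultIteratorGetUTF8Text(ri, RIL_WORD)
            text = ""
            if raw:
                text = ctypes.string_at(raw).decode("utf-8")
                tess.TessDeleteText(raw)  # replaces C++ delete[] word
            conf = tess.TessResultIteratorConfidence(ri, RIL_WORD)
            box = [ctypes.c_int() for _ in range(4)]
            tess.TessPageIteratorBoundingBox(
                pi, RIL_WORD, *[ctypes.byref(b) for b in box])
            yield text, conf, tuple(b.value for b in box)
            if not tess.TessResultIteratorNext(ri, RIL_WORD):
                break
    finally:
        tess.TessResultIteratorDelete(ri)


def run(image_path, tess_path, lept_path, datapath=".", lang="eng"):
    """Load both libraries, recognize one image and print each word."""
    lept = ctypes.cdll.LoadLibrary(lept_path)
    tess = ctypes.cdll.LoadLibrary(tess_path)
    declare_prototypes(tess, lept)
    pix = lept.pixRead(image_path.encode())
    api = tess.TessBaseAPICreate()
    try:
        # The C API expects bytes (char*), not Python str.
        if tess.TessBaseAPIInit3(api, datapath.encode(), lang.encode()) != 0:
            raise RuntimeError("TessBaseAPIInit3 failed")
        tess.TessBaseAPISetImage2(api, pix)
        for text, conf, (x1, y1, x2, y2) in iter_words(tess, api):
            print("word: '%s'; \tconf: %.2f; BoundingBox: %d,%d,%d,%d;"
                  % (text, conf, x1, y1, x2, y2))
    finally:
        tess.TessBaseAPIEnd(api)
        tess.TessBaseAPIDelete(api)
```

With the paths from the question, the call would be run('11.png', r'C:\Program Files\tesseract-OCR\libtesseract-4.dll', 'liblept-5.dll'). Declaring restype as c_void_p matters as much as argtypes: otherwise ctypes treats returned handles as 32-bit integers and may truncate them on 64-bit systems.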
If you need further help, you can have a look at how things are done in PyOCR. If you decide to use PyOCR in your project, just beware that the license of PyOCR is GPLv3+, which implies some restrictions.
Answered By - Jerome Flesch