Sunday, December 3, 2023

[FIXED] how to convert C++ tesseract-ocr code to Python?


Issue

I want to convert the C++ ResultIterator example from the tesseract-ocr documentation to Python.

  Pix *image = pixRead("/usr/src/tesseract/testing/phototest.tif");
  tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
  api->Init(NULL, "eng");
  api->SetImage(image);
  api->Recognize(0);
  tesseract::ResultIterator* ri = api->GetIterator();
  tesseract::PageIteratorLevel level = tesseract::RIL_WORD;
  if (ri != 0) {
    do {
      const char* word = ri->GetUTF8Text(level);
      float conf = ri->Confidence(level);
      int x1, y1, x2, y2;
      ri->BoundingBox(level, &x1, &y1, &x2, &y2);
      printf("word: '%s';  \tconf: %.2f; BoundingBox: %d,%d,%d,%d;\n",
               word, conf, x1, y1, x2, y2);
      delete[] word;
    } while (ri->Next(level));
  }

This is what I have managed so far:

import ctypes
liblept = ctypes.cdll.LoadLibrary('liblept-5.dll')
pix = liblept.pixRead('11.png'.encode()) 
print(pix)

tesseractLib = ctypes.cdll.LoadLibrary(r'C:\Program Files\tesseract-OCR\libtesseract-4.dll')

tesseractHandle = tesseractLib.TessBaseAPICreate()

tesseractLib.TessBaseAPIInit3(tesseractHandle, '.', 'eng')

tesseractLib.TessBaseAPISetImage2(tesseractHandle, pix)
#tesseractLib.TessBaseAPIRecognize(tesseractHandle, tesseractLib.TessMonitorCreate())

I cannot convert the C++ call api->Recognize(0) to Python. What I tried is the commented-out last line of the code, but it is wrong. I am not experienced with C++, so I am stuck. Can anyone help with the conversion?

I expect to have trouble with the rest of the conversion too. For example, I don't know how to express tesseract::RIL_WORD in Python, so a complete version of the conversion would be very welcome. Thanks!

I know that the tesserocr project could save me from doing the conversion, but it does not provide up-to-date Python wheels for Windows, which is the main reason I am attempting the conversion myself.

Solution

I think the problem is that api->Recognize() expects a pointer as its first argument. The example passes a literal 0 where nullptr would be clearer. In that position both denote a null pointer, but on 64-bit systems an integer 0 and a pointer do not have the same size (usually; on some unusual non-x86 systems this may not hold either).

The example still works with a C++ compiler because the compiler knows that the function expects a pointer (64 bits) and silently converts the 0 to a null pointer.

In your code, it seems you haven't given ctypes the exact prototype of TessBaseAPIRecognize(), so ctypes cannot know that the function expects a pointer (64 bits). It assumes instead that the function expects an integer (32 bits), and the call crashes.
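The size mismatch described above can be reproduced without Tesseract at all. The following is a minimal sketch using only ctypes; the address is made up for illustration and squeeze_into_c_int is a hypothetical helper name. It shows what is left of a pointer-sized value once it is stored in a C int, which is the type ctypes falls back to for plain Python integers when no prototype is declared. The same default applies to return values, so a handle returned by a function without a declared restype may be cut down in the same way.

```python
import ctypes

# Sizes on the current platform; on typical 64-bit systems this is 32 vs 64 bits.
INT_BITS = ctypes.sizeof(ctypes.c_int) * 8
POINTER_BITS = ctypes.sizeof(ctypes.c_void_p) * 8


def squeeze_into_c_int(address):
    """Return what survives when a pointer-sized value is stored in a C int."""
    # ctypes integer types do no overflow checking: the extra high bits are dropped.
    return ctypes.c_int(address).value


# Hypothetical 64-bit address, made up for illustration only.
fake_address = 0x7FFE12345678

print(f"C int: {INT_BITS} bits, pointer: {POINTER_BITS} bits")
print(f"{fake_address:#x} stored in a C int becomes "
      f"{squeeze_into_c_int(fake_address):#x}")

# An explicit NULL pointer, as opposed to the Python integer 0.
null_pointer = ctypes.c_void_p(None)
print("NULL pointer value:", null_pointer.value)
```

A pointer that loses its upper bits this way no longer refers to the original object, which is consistent with the crash described in the answer.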

My suggestions:

Use ctypes.c_void_p(None) instead of 0.
If you intend to use this in production, declare every function prototype to ctypes.
Be careful with the examples you follow: they use the Tesseract base API (the C++ API), whereas using libtesseract from Python with ctypes requires the Tesseract C API. The two APIs are very similar but may not be identical.
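The suggestions above can be sketched as follows. This is not a tested port, only an outline under stated assumptions: the function names are those of the Tesseract C API header (capi.h) and Leptonica, and the prototypes should be checked against the header of the installed version. RIL_WORD is assumed to be 3 from its position in the PageIteratorLevel enum. declare_prototypes and iter_words are hypothetical helper names introduced here for illustration.

```python
import ctypes

# Assumed value of tesseract::RIL_WORD, from the enum order
# RIL_BLOCK=0, RIL_PARA=1, RIL_TEXTLINE=2, RIL_WORD=3, RIL_SYMBOL=4.
RIL_WORD = 3


def declare_prototypes(tess, lept):
    """Tell ctypes which arguments and return values are pointers."""
    p, i = ctypes.c_void_p, ctypes.c_int
    lept.pixRead.argtypes = [ctypes.c_char_p]
    lept.pixRead.restype = p
    tess.TessBaseAPICreate.argtypes = []
    tess.TessBaseAPICreate.restype = p
    tess.TessBaseAPIInit3.argtypes = [p, ctypes.c_char_p, ctypes.c_char_p]
    tess.TessBaseAPIInit3.restype = i
    tess.TessBaseAPISetImage2.argtypes = [p, p]
    tess.TessBaseAPISetImage2.restype = None
    tess.TessBaseAPIRecognize.argtypes = [p, p]
    tess.TessBaseAPIRecognize.restype = i
    tess.TessBaseAPIGetIterator.argtypes = [p]
    tess.TessBaseAPIGetIterator.restype = p
    tess.TessResultIteratorGetPageIterator.argtypes = [p]
    tess.TessResultIteratorGetPageIterator.restype = p
    # c_void_p rather than c_char_p, so the raw pointer can be freed later.
    tess.TessResultIteratorGetUTF8Text.argtypes = [p, i]
    tess.TessResultIteratorGetUTF8Text.restype = p
    tess.TessResultIteratorConfidence.argtypes = [p, i]
    tess.TessResultIteratorConfidence.restype = ctypes.c_float
    tess.TessResultIteratorNext.argtypes = [p, i]
    tess.TessResultIteratorNext.restype = i
    tess.TessPageIteratorBoundingBox.argtypes = [p, i] + [ctypes.POINTER(i)] * 4
    tess.TessPageIteratorBoundingBox.restype = i
    tess.TessDeleteText.argtypes = [p]
    tess.TessDeleteText.restype = None
    tess.TessResultIteratorDelete.argtypes = [p]
    tess.TessResultIteratorDelete.restype = None


def iter_words(tess, api, level=RIL_WORD):
    """Yield (text, confidence, (x1, y1, x2, y2)), mirroring the C++ do/while loop."""
    # None is passed as a NULL pointer for the monitor argument.
    if tess.TessBaseAPIRecognize(api, None) != 0:
        raise RuntimeError("TessBaseAPIRecognize failed")
    ri = tess.TessBaseAPIGetIterator(api)
    if not ri:
        return
    try:
        # The C API exposes the bounding box through the page-iterator view.
        pi = tess.TessResultIteratorGetPageIterator(ri)
        while True:
            raw = tess.TessResultIteratorGetUTF8Text(ri, level)
            text = ctypes.string_at(raw).decode("utf-8") if raw else ""
            if raw:
                tess.TessDeleteText(raw)  # counterpart of delete[] word
            conf = tess.TessResultIteratorConfidence(ri, level)
            box = [ctypes.c_int() for _ in range(4)]
            tess.TessPageIteratorBoundingBox(
                pi, level, *(ctypes.byref(b) for b in box))
            yield text, conf, tuple(b.value for b in box)
            if not tess.TessResultIteratorNext(ri, level):
                break
    finally:
        tess.TessResultIteratorDelete(ri)


# Usage sketch (DLL names and paths are the ones from the question):
#   lept = ctypes.cdll.LoadLibrary('liblept-5.dll')
#   tess = ctypes.cdll.LoadLibrary(r'C:\Program Files\tesseract-OCR\libtesseract-4.dll')
#   declare_prototypes(tess, lept)
#   api = tess.TessBaseAPICreate()
#   tess.TessBaseAPIInit3(api, b'.', b'eng')   # bytes, not str
#   tess.TessBaseAPISetImage2(api, lept.pixRead(b'11.png'))
#   for text, conf, box in iter_words(tess, api):
#       print(f"word: '{text}';  conf: {conf:.2f}; BoundingBox: {box}")
```

With c_char_p declared as the argument type, the data path and language have to be passed as bytes (b'.' and b'eng'); a Python str is rejected by ctypes rather than converted. Declaring restype as c_void_p for the handle-returning functions is what keeps 64-bit handles intact between calls.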

If you need further help, you can have a look at how things are done in PyOCR. If you decide to use PyOCR in your project, be aware that it is licensed under GPLv3+, which implies some restrictions.

Answered By - Jerome Flesch

This answer was collected from Stack Overflow and tested by PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
