Issue
I would like to decode the following image into text:
I have already tried Tesseract OCR for this, but without success so far.
Here's my code:
import argparse
from subprocess import check_output

import pytesseract

try:
    import Image
except ImportError:
    from PIL import Image


def resolve(path):
    # Resample the image in place with ImageMagick before running OCR.
    # Raw string so the backslashes in the Windows path are kept literally.
    check_output([r'C:\Program Files\ImageMagick-7.0.9-Q16\convert.exe', path, '-resample', '600', path])
    return pytesseract.image_to_string(Image.open(path))


if __name__ == "__main__":
    argparser = argparse.ArgumentParser()
    argparser.add_argument('path', help='path of the image to OCR')
    args = argparser.parse_args()
    path = args.path
    print('Resolving the image...')
    captcha_text = resolve(path)
    print('Result: ', captcha_text)
Here's the output of my program:
C:\Users\Foussy\PycharmProjects\03_Imagedecoder>python main.py C:\Users\Foussy\Pictures\4570502--437826.jpeg
Resolving the image...
Result:
It seems my program is unable to decode the picture. It handles images with more "obvious" text well, but I tried several other captchas of this type without success. What do you recommend?
Ultimately, I would like to write a program that decodes images like this automatically. Unless there is a reliable way to preprocess the images automatically so that Tesseract can read them, I don't see how to solve this problem. If someone knows a library that can do this, that would be helpful.
Solution
This Python library might help: https://pypi.org/project/captcha-solver/
Example:
from captcha_solver import CaptchaSolver

solver = CaptchaSolver('twocaptcha', api_key='2captcha.com API HERE')
# Read the captcha image as raw bytes and close the file afterwards.
with open('captcha.png', 'rb') as f:
    raw_data = f.read()
print(solver.solve_captcha(raw_data))
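If an external solving service is not an option, the preprocessing idea raised in the question can be tried locally first. The following is only a rough sketch, not a guaranteed fix: it converts the image to grayscale, upscales it, and binarizes it before handing it to Tesseract in single-line mode. The function names, the scale factor of 3, and the threshold of 128 are assumptions chosen for illustration and would need tuning per captcha style; heavily distorted captchas may still defeat Tesseract.

```python
def threshold_table(threshold=128):
    """Build a 256-entry lookup table mapping gray levels to pure black or white.

    The default threshold of 128 is an illustrative assumption.
    """
    return [255 if value >= threshold else 0 for value in range(256)]


def preprocess(path, scale=3, threshold=128):
    """Return a grayscale, upscaled, binarized copy of the image at `path`."""
    from PIL import Image  # Pillow, as already used in the question

    img = Image.open(path).convert("L")  # "L" = 8-bit grayscale
    img = img.resize((img.width * scale, img.height * scale), Image.LANCZOS)
    # Apply the lookup table to every pixel to get a black-and-white image.
    return img.point(threshold_table(threshold))


def resolve(path):
    """Run Tesseract on the preprocessed image, treating it as one text line."""
    import pytesseract

    # "--psm 7" tells Tesseract to treat the image as a single line of text.
    text = pytesseract.image_to_string(preprocess(path), config="--psm 7")
    return text.strip()
```

Calling resolve('captcha.png') would then replace the resolve function from the question. Unlike the ImageMagick call there, this version leaves the original file untouched, so different thresholds can be tried on the same input.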
Answered By - Priyanka Khairnar