Issue
I would like to have the entries from the following vehicle registration document written to a text file automatically.
However, text recognition is proving very difficult. I have tried opening the image with different configurations and have also tested different colour levels of the document, but none of my attempts yielded a usable result.
Does anyone have an idea how the text could be recognised properly?
This is the image I tried to OCR:
The code I used is shown below:
import cv2
import numpy as np
import pytesseract
import matplotlib.pyplot as plt
from PIL import Image
import regex
pytesseract.pytesseract.tesseract_cmd=r'C:\Program Files\Tesseract-OCR\tesseract.exe'
img = cv2.imread("Fahrzeugscheinsplit1.jpg")
result = pytesseract.image_to_string(img)
print(result)
My output is shown here:
|
08.05.2006)'| 8566) ADVOOOO1X
ne r pear
a BORD 7 aoe \
‘BWY i
QUBB1 Repieee ay a f
TRAC |
| = say, |
is Mondeo ath }
FO! s 1
Fz.2.Pers, +b. 8 Spl. .
Kombilimousine
vo) EURO 4
«| BURO 4 ) Re !
» Diesel ES
ll 0002. WW 0d62. l2198 |
Solution
First, it helps to know the image pre-processing techniques that improve Tesseract's results. Following the official documentation, you can start with simple thresholding.
If you apply simple thresholding, the result will be:
I think the image should also be centered for accurate recognition. We can center it by adding borders:
The image is now ready for text extraction. If we keep only the detections with a confidence above 30, we get:
Nearly all of the text in the input image is detected. We can also print the detected text values:
Detected Text: 08.05.2006
Detected Text: 8566!
Detected Text: M1
Detected Text: AC
Detected Text: 8
Detected Text: 6
Detected Text: FORD
Detected Text: BWY
Detected Text: SFHAP7
Detected Text: Mondeo
Detected Text: FORD
Detected Text: (D)
Detected Text: Pz.z.Pers.bef.b.
Detected Text: 8
Detected Text: Spl.
Detected Text: Kombilimousine
Detected Text: EURO
Detected Text: 4
Detected Text: EURO
Detected Text: 4
Detected Text: Diesel
Detected Text: 0002
Detected Text: 0462
Detected Text: 2198
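The confidence filter behind the list above can be sketched in isolation. This is a minimal sketch, not the answer's exact code: the helper name `confident_words` and the sample dictionary are hypothetical, and the sample only mimics the parallel-list layout that `image_to_data(..., output_type=Output.DICT)` returns.

```python
def confident_words(data, min_conf=30):
    """Return the words whose OCR confidence exceeds min_conf.

    `data` is assumed to hold parallel lists under the keys "text" and
    "conf", as in pytesseract's Output.DICT layout.
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf may be an int, a float, or a numeric string depending on the
        # pytesseract version, so normalise it with float().
        # Empty strings (layout entries without a word) are skipped.
        if float(conf) > min_conf and text.strip():
            words.append(text)
    return words


# Hypothetical sample in the same layout (the values are made up).
sample = {
    "text": ["", "FORD", "Mondeo", "x|"],
    "conf": ["-1", 96, "91.5", 12],
}
print(confident_words(sample))
```

Lowering `min_conf` lets more words through at the cost of more noise, which is the trade-off behind the "confidence > 30" choice.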
Using simple thresholding, we found nearly all of the values correctly. For the missing parts, you can experiment with the parameters, for example by decreasing the confidence level or increasing the threshold level, or try other thresholding methods such as adaptive thresholding or inRange thresholding.
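Complete code for the simple-thresholding approach:
from cv2 import imread, cvtColor, COLOR_BGR2GRAY as GRAY
from cv2 import imshow, waitKey, rectangle, threshold, THRESH_BINARY as BINARY
from cv2 import copyMakeBorder as addBorder, BORDER_CONSTANT as CONSTANT
from pytesseract import image_to_data, Output

# Load the image and convert it to grayscale
bgr = imread("UXvS7.jpg")
gray = cvtColor(bgr, GRAY)

# Center the content by adding a 50-pixel white border on every side
border = addBorder(gray, 50, 50, 50, 50, CONSTANT, value=255)

# Simple thresholding: pixels above 150 become white, the rest black
thresh = threshold(border, 150, 255, BINARY)[1]

# Run OCR and get per-word text, confidence and bounding boxes
data = image_to_data(thresh, output_type=Output.DICT)

for i in range(len(data["text"])):
    # conf may be a float or a numeric string in some pytesseract versions
    confidence = int(float(data["conf"][i]))
    if confidence > 30:
        x = data["left"][i]
        y = data["top"][i]
        w = data["width"][i]
        h = data["height"][i]
        text = data["text"][i]
        print(f"Detected Text: {text}")
        # thresh is single-channel, so only the first colour component (0)
        # is used and the box is drawn in black
        rectangle(thresh, (x, y), (x + w, y + h), (0, 255, 0), 2)

imshow("", thresh)
waitKey(0)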
Answered By - Ahx