Issue
Thanks in advance for your time. I tried the following code to extract the digits from the attached image, but the results were very poor. I would really appreciate suggestions on how to preprocess the image to get better results. Does the red background in the image make the digits harder to recognize?
#import needed modules
import cv2
import pytesseract
from PIL import Image
import numpy as np
def thin_font(pic):
    pic = cv2.bitwise_not(pic)
    kernel = np.ones((1,1),np.uint8)
    pic = cv2.erode(pic, kernel, iterations=1)
    pic = cv2.bitwise_not(pic)
    return (pic)
imgFile = "c:/test1.jpg"
img = cv2.imread(imgFile)
#img upscaling----------------------
width = int(img.shape[1]*1.4)
height = int(img.shape[0]*1.4)
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
thinimg = thin_font(resized)
imggray = cv2.cvtColor(thinimg, cv2.COLOR_BGR2GRAY)
imginv = cv2.bitwise_not(imggray)
thresh, inputimg = cv2.threshold(imginv, 150, 230,cv2.THRESH_BINARY)
#-----------------------------------------------
text = pytesseract.image_to_string(inputimg, config="outputbase digits")
print(text)
Solution
With some trial and error I managed to get decent, though not perfect, results.
The main idea is presenting the image to PyTesseract one table cell at a time.
The answer doesn't include automatic table separation using image processing.
The solution assumes that the width and height of the cells are fixed and known in advance (some cropping and padding were needed).
(In case you want to do it automatically, here is a nice code sample).
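Under the fixed-grid assumption, the cell positions can be generated by a small helper. This is only a sketch: the name `cell_slices` and its parameters are hypothetical, and the default sizes (80-pixel cell width, 19-pixel row step, 20-pixel cell height) are the values measured for this particular image, so they would have to be measured again for any other table.

```python
def cell_slices(rows=8, cols=7, cell_w=80, row_step=19, cell_h=20):
    """Yield (row, col, y_slice, x_slice) for each cell of a fixed grid.

    The defaults are the sizes assumed in this answer; they are specific
    to the example image and are not general-purpose values.
    """
    for row in range(rows):
        for col in range(cols):
            x0 = col * cell_w
            y0 = row * row_step
            # Skip one border pixel on the left, top and bottom of the cell.
            y_slice = slice(y0 + 1, y0 + cell_h - 1)
            x_slice = slice(x0 + 1, x0 + cell_w)
            yield row, col, y_slice, x_slice
```

Each cell could then be cropped with `img[y_slice, x_slice, :]` after the image has been cropped and padded so that the grid starts at the top-left corner.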
Preprocessing that gave the best OCR results:
- Convert the image (or each cell) to grayscale.
- Invert polarity - make the text black on white (instead of white on black).
- Resize the "cell" by a factor of x2 in each axis.
Tesseract configuration that gave the best results:
text = pytesseract.image_to_string(cell, config="-c tessedit"
                                                "_char_whitelist=' '0123456789-."
                                                " --psm 6")
Code sample:
import cv2
import pytesseract
import numpy as np

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'  # May be needed when using Windows

imgFile = "test1.png"
img = cv2.imread(imgFile)

img = img[0:-10, 1:-4, :]  # Crop the relevant part
img = np.pad(img, ((3, 0), (0, 0), (0, 0)), 'edge')  # Add some padding to the top (making constant cell height).

for row in range(8):
    print()  # New line
    for col in range(7):
        x0 = col*80  # Assume cell width is 80 pixels
        y0 = row*19  # Assume cell height is 19 pixels
        x1 = x0 + 80
        y1 = y0 + 20
        cell = img[y0+1:y1-1, x0+1:x1, :]  # Crop the cell in position [col, row]
        cell = 255 - cv2.cvtColor(cell, cv2.COLOR_BGR2GRAY)  # Convert to grayscale and invert polarity
        cell = cv2.resize(cell, (cell.shape[1]*2, cell.shape[0]*2), interpolation=cv2.INTER_CUBIC)  # Resize up by a factor of 2 in each axis.
        text = pytesseract.image_to_string(cell, config="-c tessedit"
                                                        "_char_whitelist=' '0123456789-."
                                                        " --psm 6")
        print(text.rjust(11), end='', flush=True)  # Print the text without newline (add leading spaces).
        cv2.imshow('cell', cell)  # Show the cell as image
        cv2.waitKey()  # Wait for a key press

print()  # New line
cv2.destroyAllWindows()
Output:
-2227 -410.59 11.11 -12673.94 -135.49 -106.01 -349.10
-2629 -403.90 3.81 -15635.17 -243.68 -115.72 318.26
-1791 404.17 8.60 -8068.60 44.42 -87.76 -1663.20
-2920 -674.54 5.74 -11296.37 -146.38 -143.96 486.33
-3110 -728.97 3.92 -11358.89 -173.37 -150.93 436.33
-3283 -752.10 -12.20 -9683.32 -158.25 -151.55 -753.67
-2412 498.37 10.56 -11971.43 -101.01 -119.15 -916.63
-2583 446.77 7.01 -14523.37 -176.70 -120.52 -277.24
Issues:
There is an issue with the minus sign: when the sign touches the digit next to it, the minus sign is not identified.
Suggested solution:
Check whether the background color of the cell is red or green, and if it's red, add a minus sign (if one is not already present).
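A minimal sketch of that check, assuming the cell is still available in color (before the grayscale conversion) in OpenCV's BGR channel order. The function name `add_missing_minus` and the simple "red channel greater than green channel" test are my assumptions and were not tuned against the real image; a stricter threshold may be needed.

```python
import numpy as np

def add_missing_minus(text, cell_bgr):
    """Prepend '-' when the cell background looks red and no sign was read.

    cell_bgr: color crop of the cell, channels ordered B, G, R.
    The red-versus-green comparison is an assumption, not a tuned rule.
    """
    text = text.strip()
    # The per-channel median approximates the background color,
    # because the text pixels are a minority of the cell.
    blue, green, red = np.median(cell_bgr.reshape(-1, 3), axis=0)
    if text and red > green and not text.startswith('-'):
        return '-' + text
    return text
```

In the loop above this would be called with the color crop of the cell and the string returned by `image_to_string`, before the result is printed.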
Answered By - Rotem