I spent the morning kludging together a python script to convert a grocery receipt into a spreadsheet as part of one of my New Year’s resolutions. There seem to be a few options out there for scanning and recording receipts, but it’s not clear that they apply an OCR technique to automatically convert them to spreadsheet.
Here’s the receipt I used:
This website provided some python source code to detect edges in the image then the outline of the receipt and transform out any foreshortening or other viewing distortion —
To detect edges, the code converts the color image to grayscale and applies the Canny edge detection scheme, which involves applying a Gaussian blur to suppress noise, calculating image derivatives, and looking for large values. The result is shown in the image above on the left, and more details on the algorithm here.
Next, the code finds the outline of the receipt by using the OpenCV‘s findContours, sorts the contours by area, and finds the contour with the largest area but with four vertices.
The code then applies a four-point transformation to warp the receipt to give it a rectangular shape and finally thresholds the grayscale to enhance the contrast. The rightmost panel in the above image shows the final result.
To convert the image to a table of text, I used PyTesseract, which provides OCR capabilities. I installed the package Tesseract using homebrew: “brew install tesseract”.
Then I just grabbed the code from this website to convert the final result into a text table:
st = pytesseract.image_to_string(Image.open(save_filename), config="-psm 6")
The “psm=6” option was required to return the text properly.
Unfortunately, the OCR analysis wasn’t perfect. For example,
was converted to
‘*CRESgENT R01 1 1800000401 4.82 IF
The prices on all lines came back fine, but the description was often distorted. I decided I cared more about the price anyway. Fortunately, the WinCo receipt had “TF” or “TX” at the rightmost side, so I performed a regex search to find the beginning of that string and grabbed the characters to the left of that.
Finally, I converted the strings into a list of comma-separated values to load into Excel or Google Sheets, leaving a space between the corrupted description and price so I could enter my own description, giving
CRESgENTR0111, , 4.82
On the off-chance it will be useful to someone else, I’ve posted the code here. Using my script will also require the source code for pyimagesearch, which requires submitting an e-mail address.