Tesseract Receipt Tracker
Workflow
- Acquire Image:
  Use the read tool on the image path (supports jpg, png, pdf first page).
- Set up Tesseract:
  Python wrapper: exec pip install pytesseract
  Tesseract: exec sudo apt update && sudo apt install tesseract-ocr
- Extract Text:
  # Tesseract CLI: input image first, then the output base name (writes ocr.txt)
  exec tesseract image.jpg ocr
- Parse Fields:
  exec python3 scripts/parse_receipt.py ocr.txt
- Log Data: Write to expense_log.csv or a JSON file.
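The extract and log steps could be driven from Python along the following lines. This is a minimal sketch, not the project's actual implementation: the function names `run_ocr` and `append_expense` and the column set in `FIELDS` are assumptions made for illustration. It relies only on the Tesseract CLI convention of taking an input image followed by an output base name, to which Tesseract appends `.txt`.

```python
import csv
import subprocess
from pathlib import Path

# Hypothetical column set for expense_log.csv; adjust to the fields you parse.
FIELDS = ["date", "total", "tax", "source_image"]


def run_ocr(image_path: str, output_base: str = "ocr") -> Path:
    """Run the Tesseract CLI, which writes its text output to <output_base>.txt."""
    subprocess.run(["tesseract", image_path, output_base], check=True)
    return Path(output_base + ".txt")


def append_expense(row: dict, log_path: str = "expense_log.csv") -> None:
    """Append one parsed receipt to the CSV log, writing the header on first use."""
    path = Path(log_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        # Keep only known columns and fill missing ones with an empty string.
        writer.writerow({name: row.get(name, "") for name in FIELDS})
```

Appending rather than rewriting the file keeps each receipt a single independent write, so a failed OCR run never touches earlier rows.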
Post-Processing
Use regular expressions or the scripts in scripts/ to extract receipt-specific fields: total, subtotals, taxes, odometer reading, and dates.
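As an illustration of regex-based extraction, the sketch below pulls a few of these fields out of OCR text. The patterns and the `parse_receipt` function are hypothetical and are not taken from scripts/parse_receipt.py. They assume each amount sits on the same line as its label and uses two decimal places, which will not hold for every receipt layout.

```python
import re

# Amount with optional dollar sign and thousands separators, e.g. "$1,234.56".
AMOUNT = r"\$?\s*(\d[\d,]*\.\d{2})"

# Hypothetical patterns: label at the start of a line, amount later on that line.
PATTERNS = {
    "subtotal": re.compile(r"(?im)^\s*sub\s*-?\s*total\b[^\d$\n]*" + AMOUNT),
    "tax": re.compile(r"(?im)^\s*(?:sales\s+)?tax\b[^\d$\n]*" + AMOUNT),
    "total": re.compile(r"(?im)^\s*total\b[^\d$\n]*" + AMOUNT),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4})\b"),
}


def parse_receipt(text: str) -> dict:
    """Return the first match for each field; fields with no match are omitted."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            continue
        value = match.group(1)
        # Strip thousands separators from amounts; leave dates untouched.
        fields[name] = value if name == "date" else value.replace(",", "")
    return fields
```

Anchoring the `total` pattern to the start of a line keeps it from matching inside "Subtotal". OCR noise such as a misread decimal point would still need handling beyond this sketch.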
scripts/
Custom parsers for structured extraction.
references/
Field mappings and examples.
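A field mapping of this kind might be expressed as a table of label aliases. The sketch below is an assumption about what such a mapping could look like, not the contents of references/; the alias lists and the `canonical_field` helper are invented for illustration.

```python
from typing import Optional

# Hypothetical alias table: canonical field name -> labels seen on receipts.
FIELD_ALIASES = {
    "total": ["total", "amount due", "balance due", "grand total"],
    "subtotal": ["subtotal", "sub total", "sub-total"],
    "tax": ["tax", "sales tax", "vat", "gst"],
    "odometer": ["odometer", "odo", "mileage"],
}


def canonical_field(label: str) -> Optional[str]:
    """Map a printed receipt label to a canonical field name, or None if unknown."""
    # Lowercase, drop colons, and collapse runs of whitespace before lookup.
    key = " ".join(label.lower().replace(":", " ").split())
    for field, aliases in FIELD_ALIASES.items():
        if key in aliases:
            return field
    return None
```

For example, `canonical_field("Amount Due:")` resolves to `"total"`, while an unrecognized label such as `"Cashier"` yields `None`.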