PDF Invoice OCR Demo
This demo shows the PDF OCR text processing capabilities of SimpleIndex by extracting the Document Number, Date, Document Type, Customer, and Total from a set of Estimates and Invoices.
All of this information is read automatically from the existing text layer of a computer-generated PDF, such as one created with a PDF printer driver. Template and dictionary matching algorithms locate and extract the correct data values from that text.
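To make the idea of template and dictionary matching concrete, here is a minimal Python sketch that works on text already pulled from a PDF text layer. It is not SimpleIndex's actual algorithm or configuration: the label patterns, the keyword dictionary, the function name extract_fields, and the sample invoice text are all assumptions made up for illustration.

```python
import re

# Hypothetical dictionary of document-type keywords (illustrative only,
# not SimpleIndex's own dictionary).
DOCUMENT_TYPES = {"invoice": "Invoice", "estimate": "Estimate"}

# Hypothetical templates: each field is found by a label followed by a value.
FIELD_PATTERNS = {
    "number": re.compile(
        r"(?:Invoice|Estimate)\s*(?:No\.?|Number|#)\s*:?\s*(\S+)", re.I
    ),
    "date": re.compile(r"Date\s*:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.I),
    "customer": re.compile(r"(?:Bill To|Customer)\s*:?\s*(.+)", re.I),
    # Anchored to the start of a line so that "Subtotal" is not matched.
    "total": re.compile(r"^\s*Total\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.I | re.M),
}


def extract_fields(text):
    """Return a dict of field values found in text; missing fields are None."""
    lowered = text.lower()
    fields = {
        # Dictionary match: the first known keyword present decides the type.
        "type": next(
            (label for key, label in DOCUMENT_TYPES.items() if key in lowered),
            None,
        )
    }
    # Template match: take the value captured after each field's label.
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    return fields


# Made-up sample standing in for the text layer of one PDF page.
sample = """ACME Supply Co.
Invoice No: INV-1042
Date: 03/15/2024
Bill To: Example Customer LLC
Subtotal: $1,150.00
Total: $1,234.50
"""

if __name__ == "__main__":
    for field, value in extract_fields(sample).items():
        print(f"{field}: {value}")
```

A real layout would need patterns tuned to its own labels and date formats, and a document that mentions both keywords would need a stricter rule than "first keyword wins".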
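Because the existing text layer is used, OCR is not performed. This makes processing much faster and free of recognition errors, because the characters are read exactly as stored in the file rather than recognized from an image. For scanned PDF files with no existing text, OCR can be used to obtain the text instead.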
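The choice between reading the text layer and falling back to OCR can be sketched as a simple per-page check. This is an illustrative assumption, not a description of how SimpleIndex decides: the function names and the 20-character threshold are hypothetical, and the page text is assumed to have been extracted already by whatever PDF library is in use.

```python
def has_usable_text_layer(page_text, min_chars=20):
    """Return True if the page's extracted text has enough visible characters.

    min_chars is an arbitrary illustrative threshold, not a SimpleIndex setting.
    """
    visible = "".join(page_text.split())  # drop all whitespace
    return len(visible) >= min_chars


def choose_text_source(page_texts, min_chars=20):
    """Label each page 'text layer' or 'ocr' based on its extracted text."""
    return [
        "text layer" if has_usable_text_layer(text, min_chars) else "ocr"
        for text in page_texts
    ]


if __name__ == "__main__":
    # First page has a text layer; second page is a scan with no text.
    pages = ["Invoice No: INV-1042 Date: 03/15/2024", "  \n"]
    print(choose_text_source(pages))
```

A length check like this is only a rough heuristic. A scanned page with a stray watermark or header in its text layer could still pass it, so a production check would likely need something stricter.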