Office and PDF Document Indexing
SimpleIndex extracts data from the existing text of Microsoft Office documents (Word, Excel, PowerPoint, etc.) and PDF files using RegEx patterns and database keyword matching. Scanned PDF files are first converted to text with OCR. The extracted values can then be assigned as metadata automatically, and the documents uploaded to any document management system.
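To illustrate the general idea of RegEx extraction combined with keyword matching, here is a minimal Python sketch. It is not SimpleIndex's own configuration or code: the patterns, the field names, and the `KNOWN_CUSTOMERS` list (a stand-in for a database lookup) are all assumptions made for this example.

```python
import re

# Hypothetical patterns for illustration; real documents need their own.
PATTERNS = {
    "document_number": re.compile(
        r"\b(?:Invoice|Document)\s*(?:Number|No\.?|#)\s*:?\s*([A-Z0-9-]+)",
        re.IGNORECASE,
    ),
    "date": re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b"),
    "total": re.compile(r"\bTotal\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}

# Assumed stand-in for a database table of known keywords.
KNOWN_CUSTOMERS = ["Acme Corp", "Globex"]


def extract_fields(text):
    """Return a dict of field values found in text, or None where nothing matched."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None

    # Keyword matching: take the first known customer name that appears in the text.
    lowered = text.lower()
    fields["customer"] = next(
        (c for c in KNOWN_CUSTOMERS if c.lower() in lowered), None
    )
    return fields
```

Calling `extract_fields` on the text of a document returns one value per field, which could then be used as metadata for that document.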
Organize Office Documents with Text Parsing
Tuesday, 23 January 2018
This video shows the Sort My Documents sample job included with the SimpleIndex trial download. It shows how you can organize office documents automatically by parsing the file’s text for relevant metadata and keywords. You can then use those keywords to tag documents with metadata and create standardized folders and filenames. First we sort Word
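The step of turning extracted keywords into standardized folders and filenames can be sketched in Python as follows. This is an illustrative assumption, not the Sort My Documents job itself: the folder layout (customer, then document type), the metadata keys, and the helper names `safe_name` and `target_path` are all hypothetical.

```python
import re
from pathlib import PurePosixPath


def safe_name(value):
    """Replace runs of characters that are awkward in file names with underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", value.strip())
    return cleaned.strip("_") or "Unknown"


def target_path(root, metadata, extension):
    """Build root/<customer>/<document_type>/<number>_<date><extension>.

    The layout is an assumption chosen for this example.
    """
    folder = (
        PurePosixPath(root)
        / safe_name(metadata["customer"])
        / safe_name(metadata["document_type"])
    )
    filename = (
        f'{safe_name(metadata["document_number"])}_'
        f'{safe_name(metadata["date"])}{extension}'
    )
    return folder / filename
```

Sanitizing each value before it becomes part of a path keeps the resulting names consistent even when the source text contains spaces or punctuation.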
PDF Text Processing Demo
Friday, 12 January 2018
This sample job demonstrates the PDF text processing capabilities of SimpleIndex. It extracts the Document Number, Date, Document Type, Customer and Total from a number of documents without OCR by reading the text layer of the PDF files. Computer-generated PDF files, such as those created using PDF printer drivers, already contain digitized text. SimpleIndex reads the

