Classification is an essential first step in processing mixed batches that contain many types of documents. Document classification methods quickly sort documents by type, using key content and layout attributes to identify them.
The most popular document classification systems are advanced AI-based machine learning algorithms that automatically learn how to classify documents based on samples and user feedback. These systems are very powerful but also very expensive. Only large organizations processing millions of pages each year can afford these enterprise solutions.
SimpleIndex takes a simpler approach: classification based on keyword patterns in the document text. Create a list of document types and assign to each one or more unique keywords or phrases that appear only in that document type. The logical operators AND, OR and NOT prevent false matches by requiring multiple keywords for a match or by excluding documents that contain certain phrases.
Keyword-based classification works for the vast majority of applications at a fraction of the cost of AI classification.
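The keyword approach can be illustrated with a short Python sketch. This is not SimpleIndex's own configuration format or API; the `Rule` class, the `classify` function, and the sample keywords are all hypothetical, chosen only to show how AND, OR and NOT conditions might combine to pick a document type from OCR text.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical keyword rule for one document type."""
    doc_type: str
    all_of: list = field(default_factory=list)   # AND: every phrase must appear
    any_of: list = field(default_factory=list)   # OR: at least one phrase must appear
    none_of: list = field(default_factory=list)  # NOT: no phrase may appear

    def matches(self, text: str) -> bool:
        t = text.lower()
        # NOT: reject documents containing an excluded phrase.
        if any(k.lower() in t for k in self.none_of):
            return False
        # AND: require every listed phrase.
        if not all(k.lower() in t for k in self.all_of):
            return False
        # OR: if alternatives are listed, require at least one.
        if self.any_of and not any(k.lower() in t for k in self.any_of):
            return False
        # A rule with no positive keywords never matches.
        return bool(self.all_of or self.any_of)


def classify(text: str, rules: list, default: str = "Unclassified") -> str:
    """Return the type of the first rule that matches, else the default."""
    for rule in rules:
        if rule.matches(text):
            return rule.doc_type
    return default


# Illustrative rules only; real keyword lists depend on the documents.
RULES = [
    Rule("Invoice", all_of=["invoice"],
         any_of=["amount due", "balance due"],
         none_of=["purchase order"]),
    Rule("Purchase Order", all_of=["purchase order"]),
]

print(classify("INVOICE #123  Amount Due: $50.00", RULES))
print(classify("Purchase Order 77 (ref. invoice 12, amount due)", RULES))
```

In this sketch the NOT condition on the Invoice rule keeps a purchase order that merely mentions an invoice from being misfiled, which is the kind of false match the logical operators are meant to prevent.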
After classification, SimpleIndex can automatically launch separate document indexing workflows for each document type found in the classified batch. This is especially useful when documents have different metadata requirements or business workflows associated with them.
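As a rough illustration of per-type routing, the following Python sketch groups an already classified batch by document type and hands each group to its own indexing routine. The function names, the workflow table, and the returned dictionaries are hypothetical and do not reflect how SimpleIndex is actually configured; they only show the general pattern.

```python
from collections import defaultdict


def index_invoices(docs):
    # Hypothetical workflow: would capture invoice number, vendor, total.
    return {"workflow": "invoice", "documents": len(docs)}


def index_purchase_orders(docs):
    # Hypothetical workflow: would capture PO number and requester.
    return {"workflow": "purchase_order", "documents": len(docs)}


def manual_review(docs):
    # Fallback for types that have no dedicated workflow.
    return {"workflow": "manual_review", "documents": len(docs)}


# Assumed mapping from document type to indexing workflow.
WORKFLOWS = {
    "Invoice": index_invoices,
    "Purchase Order": index_purchase_orders,
}


def route_batch(classified_docs, workflows, fallback):
    """Group (doc_type, document) pairs by type and run one workflow per type."""
    groups = defaultdict(list)
    for doc_type, doc in classified_docs:
        groups[doc_type].append(doc)
    return {
        doc_type: workflows.get(doc_type, fallback)(docs)
        for doc_type, docs in groups.items()
    }


batch = [
    ("Invoice", "scan_001.pdf"),
    ("Purchase Order", "scan_002.pdf"),
    ("Invoice", "scan_003.pdf"),
    ("Unclassified", "scan_004.pdf"),
]
results = route_batch(batch, WORKFLOWS, manual_review)
for doc_type, summary in results.items():
    print(doc_type, summary)
```

Keeping the type-to-workflow mapping in a single table means a new document type can be supported by adding one entry, without changing the routing logic.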

