Automating Document Capture
The Benefits of Automation  
Once you have decided the best way to organize, store and retrieve your documents, the next part of the planning stage is to find the most efficient way to scan these documents and associate them with the correct index field values. Creating an efficient scanning and indexing process will save you countless hours of labor over the life of your project.

The two main methods for automating indexing are barcode recognition and Optical Character Recognition (OCR). Barcode recognition is faster and more accurate, but it only works if a barcode is printed on the document itself or on a cover page. OCR reads printed data directly from the page, which means most documents can be processed as-is. However, several conditions affect how practical OCR is for a given project; these are discussed in the Using OCR section.

If your index data already exists in another database, SimpleIndex has two features that can make use of this data to automate processing. The Index Autofill feature lets you enter one key field that is used in a database lookup to retrieve matching values and fill in the remaining index fields automatically. SimpleIndex can also pre-set index values through the Command Line Interface so that a scanned document receives these values automatically.
 
Using Barcode Recognition  
Barcode recognition is the most efficient way to capture index data printed on documents. Some documents already have key information in barcode format on them. If your project is to scan new documents on an ongoing basis, it may be possible to redesign those documents to include barcodes. Having a barcode with index data on the document is the best-case scenario, because all the index data is on the document from the moment it is created, in a format that can be read with near 100% accuracy.

If it is not possible to print barcodes on the document itself, an alternative is to have the person who creates the document print out a barcode cover page and place it on the file before it is scanned. The SimpleCoversheet application was designed to make this easy by providing a simple interface for selecting index values and printing a standard coversheet that contains these values in barcode format.

Barcode recognition can also be useful when you have documents with a variable number of pages that will all receive the same index values. If it is not possible to generate an indexed coversheet for these at the time they are created, a generic barcode coversheet can be used to separate the scanned images into multi-page files, one for each document. A second process can then be used to index these images one file at a time instead of one page at a time, greatly increasing throughput.
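The separation step described above can be sketched in a few lines of Python. This is a minimal illustration and not SimpleIndex's own implementation: the page tuples, the "SEP" barcode value and the `split_on_separators` helper are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical page record: (image name, barcode value read from the page or None).
Page = Tuple[str, object]


def split_on_separators(
    pages: Sequence[Page], is_separator: Callable[[Page], bool]
) -> List[List[Page]]:
    """Group scanned pages into documents, starting a new one at each coversheet."""
    documents: List[List[Page]] = []
    current: List[Page] = []
    for page in pages:
        if is_separator(page):
            # A coversheet closes the previous document; the sheet itself is dropped.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


# Assumed scan order: a generic coversheet in front of each document.
scanned = [
    ("cover1", "SEP"), ("p1", None), ("p2", None),
    ("cover2", "SEP"), ("p3", None),
]
docs = split_on_separators(scanned, lambda page: page[1] == "SEP")
print([[name for name, _ in doc] for doc in docs])
```

Each inner list would become one multi-page file, which can then be indexed once per file rather than once per page.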
 
Using OCR  

Traditionally, zone OCR solutions require you to specify a region on the page where index information will be found. This region is recognized and the result is inserted into an index field. The problem with traditional zone OCR is that if the region is moved slightly due to variations in scanning, the result could contain extra neighboring characters or cut off desired characters. This limits the usefulness of traditional zone OCR to documents where the index value is in the exact same place every time and has plenty of white space around it.

SimpleIndex’s OCR contains many advanced features to overcome the inherent limitations of zone OCR. This is done by providing template and dictionary matching for OCR fields. These features search the OCR results for a certain pattern or list of possible values and return only the matching data. This allows you to draw your OCR zones much larger than normal, ensuring that no matter how much the data shifts around it will always be contained within that region.
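To illustrate the idea of template matching, the sketch below uses a Python regular expression as a stand-in for a template. SimpleIndex's own template syntax is not shown here; the invoice number format, the sample zone text and the `match_template` helper are assumptions for the example.

```python
import re
from typing import Optional


def match_template(ocr_text: str, pattern: str) -> Optional[str]:
    """Return the first substring of the OCR text that fits the pattern, or None."""
    match = re.search(pattern, ocr_text)
    return match.group(0) if match else None


# Assumed OCR output from a generously sized zone: the wanted value is
# surrounded by neighboring text that a tight zone would have cut off or picked up.
zone_text = "ACME Corp   Invoice No: INV-004217   Date 03/14"

# Assumed template: the literal prefix "INV-" followed by exactly six digits.
print(match_template(zone_text, r"\bINV-\d{6}\b"))
```

Because only the text that fits the pattern is returned, the extra characters captured by the larger zone do not end up in the index field.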

It is even possible to draw your zone around the entire page and find key information that is not printed in any fixed location. For example, a doctor’s office may receive lab reports from many different labs. Each report is formatted differently, but each contains the patient’s name somewhere on it. Using the dictionary matching feature with a patient name list, SimpleIndex can identify the correct patient for each lab report automatically.
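Dictionary matching against full-page OCR text can be sketched as follows. This is a simplified illustration under stated assumptions, not the matching logic SimpleIndex uses: the patient list, the sample report and the `match_dictionary` helper are hypothetical, and the sketch only tolerates differences in case and spacing, not OCR misreads.

```python
import re
from typing import Iterable, List


def normalize(text: str) -> str:
    """Collapse runs of whitespace and lowercase, so layout differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def match_dictionary(ocr_text: str, known_values: Iterable[str]) -> List[str]:
    """Return every known value that appears as whole words in the OCR text."""
    haystack = normalize(ocr_text)
    found = []
    for value in known_values:
        # Word boundaries keep "Ann Lee" from matching inside "Joann Lee".
        if re.search(r"\b" + re.escape(normalize(value)) + r"\b", haystack):
            found.append(value)
    return found


# Assumed patient list and full-page OCR text from one lab report.
patients = ["Maria Gomez", "John Smith", "Ann Lee"]
report = "NORTHSIDE LAB RESULTS\nPatient:   JOHN   SMITH\nCollected 03/14"
print(match_dictionary(report, patients))
```

If the function returns more than one name, or none, the document would typically be routed to a person for review rather than indexed automatically.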

When implementing OCR for document automation, carefully consider the data you are trying to recognize. Is the text legible? Does it appear in a fixed location? Does it conform to a unique pattern that won’t be found anywhere else on the page? Is there a list available with all the possible values for this field? Answer these questions and you will know which OCR approach is best for your application.

 
Using Index Autofill  
The Autofill feature of SimpleIndex is an easy way to associate many index fields with one document without retyping data that already exists in another application. Autofill uses a database lookup to retrieve records that match a key value entered by the user. Blank index fields are then filled in automatically with the data from this lookup. The result is a document database with many different possible search fields, of which only one needed to be entered during scanning.

The key field may be typed by the user, or it may be read from the document automatically using barcode recognition or OCR. The lookup is performed either when the user changes this field or when the index values are saved. If the lookup finds multiple matching records, the user will be notified and the first set of values will be used by default.
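The lookup behavior described above can be sketched with Python's built-in sqlite3 module. The employees table, its column names and the `autofill` helper are assumptions for illustration; an actual SimpleIndex configuration would point at your own database and fields.

```python
import sqlite3
from typing import Dict, Optional

# Assumed lookup source: an in-memory table standing in for an existing database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id TEXT, name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("E100", "Dana Ruiz", "Finance"),
        ("E100", "Dana Ruiz-Park", "Finance"),  # duplicate key on purpose
        ("E200", "Sam Okafor", "Legal"),
    ],
)


def autofill(
    db: sqlite3.Connection, key_value: str, fields: Dict[str, Optional[str]]
) -> Dict[str, Optional[str]]:
    """Fill blank index fields from the first record matching the key value."""
    rows = db.execute(
        "SELECT name, department FROM employees WHERE employee_id = ? ORDER BY rowid",
        (key_value,),
    ).fetchall()
    if not rows:
        return dict(fields)  # nothing found: leave the fields as entered
    if len(rows) > 1:
        print(f"{len(rows)} records match {key_value!r}; using the first")
    looked_up = dict(zip(("name", "department"), rows[0]))
    # Only blank fields are filled; values the user already entered are kept.
    return {
        field: (value if value else looked_up.get(field, value))
        for field, value in fields.items()
    }


entered = {"employee_id": "E100", "name": "", "department": "Legal"}
print(autofill(conn, "E100", entered))
```

In this sketch the department typed by the user is preserved and only the blank name is filled, which mirrors the "blank index fields are filled in" behavior described above.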
 
Using Pre-Indexed Batches  
Pre-indexed batches are a unique feature of SimpleIndex that greatly improves throughput when scanning a single document at a time. Pre-indexed batches can be configured to let the user enter index values prior to scanning, or they can be executed from the command line to bypass user interaction altogether.

Some typical scenarios for pre-indexed batches are:
1. User scans one document at a time by entering field values first, scanning and having the images saved with these values automatically.
2. User has several pre-defined documents that they scan. All field values are saved with the configuration file. User loads the scanner and double-clicks the appropriate configuration to scan and save that file automatically.
3. SimpleIndex is integrated with an existing application. A “Scan Current Record” button is implemented that launches SimpleIndex and passes the index values for the current document through the command line. The user loads the scanner and clicks this button; images are scanned and saved automatically.
 