
Dynamic Zone OCR


Traditional Zone OCR vs. Dynamic OCR


Traditional Zone OCR

Zone OCR is used to read document indexes or tags from text on the page.  Zone OCR is a great way to automate the data entry associated with scanning documents.

However, there are several limitations to TRADITIONAL zone OCR that must be overcome: 
  • Index information must be in the exact same place on every page  
  • Documents shift and skew during scanning, causing the zones to not line up  
  • If surrounding lines or text on the document are too close, they can encroach on the zone

SimpleIndex® Dynamic OCR

SimpleIndex overcomes these limitations by using Dynamic OCR technology to locate the desired text even when it moves around on the page.  Our simplified version of Dynamic OCR works great for many types of documents at a fraction of the cost of other solutions. 
  • Index information can appear anywhere on any page 
  • Unwanted characters are automatically ignored 
  • Find unique patterns of letters and numbers using Template Matching (Social Security #, Date, etc.) 
  • Use Dictionary Matching to find a value from a list of possible values (Vendor Name, Document Type, etc.)

Dynamic OCR Examples

In the video we see how SimpleIndex approaches a typical Zone OCR example.  With SimpleIndex you can use large zones that give a wide margin for error.  Template and Dictionary matching are then used to extract the 7-digit Account Number, 6-digit Order Number and Company Name.  SimpleIndex discards the surrounding text and keeps the correct value.
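As a rough illustration of why a large zone can still yield the right value, the sketch below picks out a run of exactly 7 digits and a run of exactly 6 digits from noisy zone text. The function name, the sample text and the use of Python regular expressions are assumptions made for this example; it is not SimpleIndex's internal implementation.

```python
import re

def digits_of_length(zone_text, length):
    """Return the first run of exactly `length` digits in zone_text, or None."""
    # The lookarounds reject matches that sit inside a longer run of digits.
    match = re.search(rf"(?<!\d)\d{{{length}}}(?!\d)", zone_text)
    return match.group(0) if match else None

# Hypothetical OCR output from an oversized zone, including unwanted text.
zone_text = "Acct No. 4481920   Order: 305517\nPhone 555-0142  Terms: Net 30"

account_number = digits_of_length(zone_text, 7)
order_number = digits_of_length(zone_text, 6)
print(account_number, order_number)
```

Because each field has a distinct length, the surrounding labels and the phone number are skipped without the zone needing to be tight.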

Another common example is finding a unique identifier, for example a social security number, that could appear anywhere on the page.  Simply enter the template ###-##-#### and SimpleIndex will search the full OCR text until it finds a match.  Since only one social security number is likely to appear on the page, a match on this pattern is almost certainly the required value.
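The template search described above can be sketched in a few lines. This example assumes only what the ###-##-#### template suggests, namely that # stands for a single digit and every other character is matched literally; the helper names are hypothetical and any further template symbols SimpleIndex may support are not modeled here.

```python
import re

def template_to_regex(template):
    """Translate a simple template into a regex pattern.

    Assumed convention: '#' is one digit, anything else is literal.
    """
    parts = [r"\d" if ch == "#" else re.escape(ch) for ch in template]
    # Do not allow the match to begin or end inside a longer digit run.
    return r"(?<!\d)" + "".join(parts) + r"(?!\d)"

def find_by_template(ocr_text, template):
    """Return the first substring of ocr_text that fits the template, or None."""
    match = re.search(template_to_regex(template), ocr_text)
    return match.group(0) if match else None

# Hypothetical full-page OCR text.
page_text = "Employee: J. Smith  Ref 20417-55\nSSN: 123-45-6789  Dept 0042"
print(find_by_template(page_text, "###-##-####"))
```

The reference number and department code contain digits and hyphens too, but neither fits the full pattern, so only the social security number is returned.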

With dictionary matching, you can give SimpleIndex a list of possible values and it will automatically search the zone or page for each possible value until it finds a match.
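A minimal sketch of dictionary matching follows. It assumes a plain case-insensitive substring search with whitespace collapsed, which is one simple way to do it; the function name and vendor list are invented for the example, and SimpleIndex's own matching rules may differ.

```python
import re

def find_by_dictionary(ocr_text, candidates):
    """Return the first candidate that appears in ocr_text, or None.

    Matching ignores case and treats any run of whitespace as one space.
    """
    normalized = re.sub(r"\s+", " ", ocr_text).lower()
    for value in candidates:
        needle = re.sub(r"\s+", " ", value).strip().lower()
        if needle and needle in normalized:
            return value  # return the clean dictionary spelling
    return None

# Hypothetical list of possible index values and a noisy zone.
vendors = ["Acme Supply Co", "Northwind Traders", "Globex Corporation"]
zone_text = "INVOICE\nBill from: NORTHWIND\n  Traders  Page 1 of 2"
print(find_by_dictionary(zone_text, vendors))
```

Returning the dictionary entry rather than the raw OCR text means the index value is stored in a consistent spelling even when the scan breaks the name across lines or changes its case.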

Many dynamic forms processing applications can be implemented using these simple algorithms. This makes SimpleIndex far more versatile than other zone OCR solutions that require the index value to be in the exact same location on every page. Yet SimpleIndex costs only a fraction of the price!

SimpleIndex's dynamic forms processing can greatly speed up data entry by eliminating a large share of the indexing work. For many organizations, this brings the labor cost of scanning within reach.

Dynamic OCR can also be applied to MS Office and PDF files, creating a fully automated process for intelligently indexing and reorganizing electronic documents.

Support for Regular Expressions

SimpleIndex OCR has a simple built-in template format, as well as support for Regular Expressions.  Regular Expressions (RegEx for short) let you define complex search patterns to extract matching values from the text.  This greatly enhances the functionality of the dynamic OCR in SimpleIndex, making it capable of finding variable-length fields with no distinct pattern.

Regular Expressions are commonly used in text parsing applications.  The Perl programming language makes extensive use of RegEx, as do UNIX utilities like "grep".  Many programmers and IT personnel are already familiar with RegEx and can create complex expressions without specific training.
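To show the kind of variable-length field a regular expression can capture, here is a small sketch written with Python's re module. The invoice label, the pattern and the sample strings are all assumptions for illustration, and the regex dialect used inside SimpleIndex may differ in its details.

```python
import re

# Hypothetical pattern: a label such as "Invoice No:", "Invoice Number" or
# "Invoice #", followed by an identifier of any length made of letters,
# digits and hyphens. The identifier is captured in group 1.
INVOICE_PATTERN = re.compile(
    r"Invoice\s*(?:No\.?|Number|#)(?![A-Za-z])\s*:?\s*([A-Z0-9][A-Z0-9-]*)",
    re.IGNORECASE,
)

def extract_invoice_number(ocr_text):
    """Return the identifier that follows an invoice label, or None."""
    match = INVOICE_PATTERN.search(ocr_text)
    return match.group(1) if match else None

samples = [
    "Invoice No: A-10482  Date 03/14/2011",
    "INVOICE NUMBER 77",
    "Invoice # INV-2009-000315 Terms Net 30",
]
for text in samples:
    print(extract_invoice_number(text))
```

A fixed template cannot describe all three identifiers because their lengths differ, whereas the expression anchors on the label and accepts whatever follows it.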


Version 7 Builds on SimpleIndex's Powerful Dynamic OCR

Version 7 includes the industry-leading ABBYY FineReader® OCR engine for dramatically improved OCR accuracy and speed. Other OCR enhancements in version 7 include:
  • Native support for PDF files without conversion 
  • Searchable PDF Image + Hidden Text output 
  • Interactive template builder and tester 
  • Improved OCR language support 
  • OCR Zones can be created on any page of a multi-page file 
  • Pre-defined templates for data types such as dates, dollar amounts, etc. 

 
 
