SimpleIndex

T (865) 637-8986
Email: info@simpleindex.com

SimpleIndex by SimpleSoftware
500 W Summit Hill Dr SW # 302, Knoxville, TN 37902

Process text with RegEx (Regular Expressions) to perform complex pattern matching and extract data from the document text or OCR results.

Zone OCR and Dynamic OCR

Friday, 12 January 2018 by Simple Software

Many document scanning solutions use Zone OCR to obtain index data from the page.

SimpleIndex improves upon this time-tested but ultimately limited model with its Dynamic OCR feature.

Let’s look at the difference between the two methods:

Zone OCR

Zone OCR is used to read document indexes or tags from text on the page. It is a great way to automate the data entry associated with scanning documents.

However, there are several limitations to zone OCR that must be overcome:

  • Index information must be in the exact same place on every page
  • Documents shift and skew during scanning, causing the zones to not line up
  • If surrounding lines or text on the document are too close, they can encroach on the zone

Dynamic OCR

SimpleIndex overcomes these limitations by using Dynamic OCR technology to locate the desired text even when it moves around on the page. Our simplified version of Dynamic OCR works great for many types of documents at a fraction of the cost of other solutions.

  • Index information can appear anywhere on any page
  • Unwanted characters are automatically ignored
  • Find unique patterns of letters and numbers using Template Matching
    (Social Security #, Date, etc.)
  • Use Dictionary Matching to find a value from a list of possible values
    (Vendor Name, Document Type, etc.)

Download document scanning and OCR software.

Dynamic OCR Examples

In the video we see how SimpleIndex approaches a typical Zone OCR example. With SimpleIndex you can use large zones that give a wide margin for error. Template and Dictionary matching are then used to extract the 7-digit Account Number, 6-digit Order Number and Company Name. SimpleIndex discards the surrounding text and keeps the correct value.

Another common example is finding a unique identifier, for example a social security number, that could appear anywhere on the page. Simply enter the template ###-##-#### and SimpleIndex will search the full OCR text until it finds a match. Since only one social security number is likely to appear on the page, a match on this pattern is almost certainly the required value.
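
The template search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SimpleIndex's actual implementation: the function names `template_to_regex` and `find_first` are hypothetical, and the sketch assumes that `#` stands for a single digit while every other template character must match literally.

```python
import re

def template_to_regex(template: str) -> re.Pattern:
    """Translate a simple template into a compiled regular expression.

    Assumed convention for this sketch: '#' stands for one digit and
    every other character must match literally.
    """
    parts = []
    for ch in template:
        parts.append(r"\d" if ch == "#" else re.escape(ch))
    # Lookarounds stop the match from starting or ending inside a longer run of digits.
    return re.compile(r"(?<!\d)" + "".join(parts) + r"(?!\d)")

def find_first(template: str, ocr_text: str):
    """Return the first substring of ocr_text that fits the template, or None."""
    match = template_to_regex(template).search(ocr_text)
    return match.group(0) if match else None

# Made-up OCR text: the value can sit anywhere among unrelated words and numbers.
page_text = "Employee: J. Smith  Ref 4471  SSN: 123-45-6789  Dept 12"
print(find_first("###-##-####", page_text))  # prints 123-45-6789
```

Because the whole text is searched, the position of the value on the page does not matter; only its shape does.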

With dictionary matching, you can give SimpleIndex a list of possible values and it will automatically search the zone or page for each possible value until it finds a match.
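
A minimal sketch of dictionary matching, again in Python and again only an assumption about how such a search could work. The function name `dictionary_match`, the sample vendor list and the choice to try longer values first are all illustrative and not taken from SimpleIndex.

```python
def dictionary_match(text: str, candidates: list[str]):
    """Return the first candidate found in text, ignoring case and extra whitespace."""
    # OCR output often contains line breaks and repeated spaces, so collapse them first.
    normalized = " ".join(text.split()).casefold()
    # Try longer values first so "Acme Supply Co" wins over the shorter "Acme".
    for candidate in sorted(candidates, key=len, reverse=True):
        if " ".join(candidate.split()).casefold() in normalized:
            return candidate
    return None

# Hypothetical list of possible values and a made-up block of zone text.
vendors = ["Acme", "Acme Supply Co", "Globex Corporation"]
zone_text = "INVOICE\nFrom:  ACME   Supply Co\n42 Main St"
print(dictionary_match(zone_text, vendors))  # prints Acme Supply Co
```

Normalizing case and whitespace before comparing makes the lookup tolerant of the layout differences that are typical of OCR output.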

Many dynamic forms processing applications can be implemented using these simple algorithms. This makes SimpleIndex far more versatile than other zone OCR solutions that require the index value to be in the exact same location on every page. Yet SimpleIndex costs only a fraction of the price!

SimpleIndex's dynamic forms processing can greatly speed up data entry by eliminating a good percentage of the indexing work. For many organizations, this brings the labor cost of scanning within reach.

MS Office Document OCR Text Parsing Video

Dynamic OCR can also be applied to MS Office and PDF files, creating a fully automated process for intelligently indexing and reorganizing electronic documents.


Amazon AWS Textract Cloud OCR

With Textract you can capture data from almost any type of form, including handwritten ones! Textract identifies labeled text anywhere on the document and returns the label text along with the corresponding value. Map the labels to index fields in SimpleIndex and you are ready to capture that data no matter where it appears on the page.
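
The label-to-field mapping can be pictured with a short Python sketch. It makes no AWS call: the `extracted` dictionary merely stands in for label/value pairs that a forms-extraction service has already returned, and it is not Textract's actual response format. The function `map_labels_to_fields` and the field names are hypothetical.

```python
def normalize(label: str) -> str:
    """Lower-case a label and drop colons and extra spaces so variants compare equal."""
    return " ".join(label.replace(":", " ").split()).casefold()

def map_labels_to_fields(extracted: dict[str, str], label_map: dict[str, str]) -> dict[str, str]:
    """Copy each mapped label's value into its index field, skipping labels not found."""
    lookup = {normalize(label): value for label, value in extracted.items()}
    return {
        field: lookup[normalize(label)]
        for label, field in label_map.items()
        if normalize(label) in lookup
    }

# Stand-in for label/value pairs found on a form (made-up data).
extracted = {"Invoice No:": "INV-1001", "Total Due": "$250.00", "Ship To": "Knoxville"}
# Hypothetical mapping from form labels to index field names.
label_map = {"Invoice No": "InvoiceNumber", "Total due": "Amount", "PO Number": "PONumber"}

print(map_labels_to_fields(extracted, label_map))
# prints {'InvoiceNumber': 'INV-1001', 'Amount': '$250.00'}
```

Because the values are looked up by label rather than by position, the same mapping applies no matter where the label appears on the page.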

Textract uses machine learning with a huge model based on the billions of pages processed using Textract to provide the most accurate OCR and form field extraction solution available.

By default, Textract is available only as an API and requires custom coding to integrate it into your document workflows. SimpleIndex turns it into a fully featured batch document and data processing app that is ready to use out of the box.

Since there are no templates to configure or train, setup can be done in hours instead of the days, weeks or months required by other enterprise data capture solutions.

Pay-as-you-go pricing makes SimpleIndex with Textract the most affordable way to batch process forms for projects of fewer than 50,000 pages per year, especially if you need to read handwriting or have forms with many layout variations.

Wiki: How to configure AWS Textract OCR in SimpleIndex

Support for Regular Expressions

Use Regular Expressions to extract index data from OCR text, PDF and Office documents.

SimpleIndex OCR has a simple built-in template format, as well as support for Regular Expressions. Regular Expressions (RegEx for short) let you define complex search patterns to extract matching values from the text.  This greatly enhances the functionality of the dynamic OCR in SimpleIndex, making it capable of finding variable-length fields with no distinct pattern.
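
As an illustration of a variable-length field with no fixed pattern of its own, the Python sketch below finds an invoice number by the label that precedes it. The expression, the name `INVOICE_NUMBER` and the sample text are assumptions made for this example. It uses Python's `re` module, whose syntax may differ in detail from the RegEx dialect SimpleIndex supports.

```python
import re

# Label ("Invoice No", "Invoice Number" or "Invoice #"), an optional separator,
# then a value of any length made of letters, digits and hyphens.
INVOICE_NUMBER = re.compile(
    r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9\-]*)",
    re.IGNORECASE,
)

def extract_invoice_number(ocr_text: str):
    """Return the value that follows an invoice-number label, or None if absent."""
    match = INVOICE_NUMBER.search(ocr_text)
    return match.group(1) if match else None

print(extract_invoice_number("Invoice No: A-20391-X  Date 01/12/2018"))  # prints A-20391-X
print(extract_invoice_number("INVOICE # 77 Net 30"))                     # prints 77
```

The capture group returns only the value, so the label and separator are discarded from the index field.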

Regular Expressions are commonly used in text parsing applications. The Perl programming language makes extensive use of RegEx, as do UNIX utilities like “grep”. Many programmers and IT personnel are already familiar with RegEx and can create complex expressions without specific training.

Click here for a reference guide to Regular Expressions

New OCR Features in Latest Version

  • OCR language pack now includes all available Tesseract languages including Hindi, Tamil, Arabic, Chinese, Thai, Vietnamese, Japanese, Korean, Indonesian, Hebrew and many more.
  • FineReader engine upgraded from version 9 to version 11, providing improved accuracy, MRC compression and multi-threaded processing for large documents.
  • Amazon AWS Textract Cloud OCR option gives you advanced forms extraction, accounts payable invoice and receipt extraction, handprint recognition, and the most accurate OCR available.

How to Configure SimpleIndex OCR

Our Wiki help has extensive information on how to configure OCR for various document and data capture scenarios.

  • Zone OCR: read data in a specific location
  • Template matching: match unique patterns
  • Dictionary matching: match a list of possible values
  • OCR Options: OCR job settings that apply to all fields
  • File Formats: formats that can be output by OCR
  • Languages: languages supported by OCR
  • FineReader versus Tesseract OCR engines
  • Searchable PDF with MRC compression
  • OCR to Field: point-and-click OCR during verification
  • Cloud OCR using Textract

Watch this Simple Software University training video to see how to configure and run an OCR job with SimpleIndex.

KB Articles for Optical Character Recognition (OCR)

  • Language Pack for Standard/Tesseract OCR
  • Languages Supported in SimpleSoftware OCR Engines
  • What is Document Imaging?
  • Change the Dictionary Separator Value
  • Change the OCR Font or Type
  • Regular Expression (RegEx) - Syntax or Type
  • Autonumber Increment Value
  • I'm using full page OCR. The information is all appearing in the txt file but it is losing format about half way through. Data to the right is ending up at the end of the txt doc. Can this be fixed?
  • Is there a way to just use part of a bar code or OCR value? For example, extract "50" from the value "124450"
  • If I have a form which is filled manually by hand, can SimpleIndex read the data from it?

Tags: Automatic Data Capture, Batch Scanning, Document Classification, Document Imaging, File Indexing, Invoice OCR, OCR, Office PDF Text Processing, Optical Character Recognition, RegEx, Screenshot OCR, Search, Text Processing, Watermark PDF Files, Workflow Software, Zone OCR
© 2022 Meta Enterprises, LLC | Knoxville, Tennessee | A Family Owned Company
© 2022 SimpleSoftware | Consulting Services in the Field of Software as a Service
