Batch Scanning Pages - SimpleIndex

Batch OCR software is a form of Optical Character Recognition software that allows for the conversion of multiple files at once, usually through a hot folder or watched folder method that converts any files added to a particular folder on your computer on a preset schedule. It also includes OCR servers, which can split the workload among more than one machine.
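The watched folder method described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how SimpleIndex implements it: the function names and the `convert` callback are hypothetical, and a real scheduler would call `poll_once` at each preset interval.

```python
from pathlib import Path


def find_new_files(folder: Path, seen: set[str]) -> list[Path]:
    """Return files in the watched folder that have not been picked up yet."""
    new_files = []
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.name not in seen:
            new_files.append(path)
            seen.add(path.name)  # remember the file so it is converted only once
    return new_files


def poll_once(folder: Path, seen: set[str], convert) -> int:
    """Run one scheduled pass over the hot folder and return the number of files handled."""
    batch = find_new_files(folder, seen)
    for path in batch:
        convert(path)  # hypothetical OCR/conversion step supplied by the caller
    return len(batch)
```

Tracking file names in a set keeps the example short; a production watcher would also need to wait until a file has finished copying before converting it.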

I have a duplex scanner. How do I set up SimpleIndex to scan two-sided documents automatically?

Friday, 11 August 2023 by Simple Software

Please refer to the Wiki Documentation for the complete Scanning reference.

Simplex versus Duplex scanning is a function of your scanner driver. SimpleIndex uses both TWAIN and ISIS drivers. ISIS drivers are faster for high-speed scanners and are preferred.

To configure duplex on an ISIS scanner:

1 Select “Use ISIS Driver” from the Scan menu if it is not selected

2 If this is your first time using ISIS, click “Select a Scanner” to select the driver for your scanner

3 Click Display Scanner Settings to display the ISIS driver settings

4 Find the setting for Simplex/Duplex and set appropriately. Each scanner model has a different driver interface. Refer to your scanner’s documentation if you cannot find the duplex setting.

To set the scanner settings using TWAIN:

1 Select “Use TWAIN Driver” from the Scan menu

2 If this is your first time scanning with TWAIN, click “Select a Scanner” and select your scanner

3 If “Display Scanner Settings” is not checked in the Scan menu, click it to select it

4 When you run the job, the scanner settings will be displayed prior to scanning. Find and select the duplex setting there. If you cannot find it, refer to your scanner’s documentation.

When scanning in duplex, you will want to enable automatic blank page deletion. Many scanners have this feature built-in, so check your driver settings to see if there is a blank page deletion option.

If your scanner doesn’t have blank page deletion, use the “Min File Size” setting on the Batch tab of the Options screen to enable it in SimpleIndex. This setting refers to the minimum file size (in bytes) for a page to be considered non-blank. Any file under that size will be deleted. This value varies depending on scan resolution, and does not work with uncompressed files.


Features

Monday, 14 November 2022 by Simple Software

SimpleIndex is the perfect solution for small businesses and departments looking to manage their files from a single interface, developers who don’t want to reinvent the wheel, and large companies with many locations looking to decentralize their scanning.

SimpleIndex organizes scanned images and electronic documents into a single document management database your employees can access on their desktops.

SimpleIndex takes the labor out of document imaging by providing powerful barcode recognition and OCR search algorithms that can find index values no matter where they are on the page. By providing these essential automations for a reasonable price, we make document management affordable for anyone.

Our most distinctive automation is the OCR template and dictionary matching search algorithms. These let you find values such as date, customer name and invoice number on documents with different layouts. They can even be applied to the text in Office documents and PDF files to organize these files automatically and attach them to a database.

The SimpleCoversheet barcode printing application lets anyone in your company print bar-coded coversheets with all the information needed to identify a document. This is perfect for scanning with a centralized scanner or networked digital copier. SimpleIndex can then be configured to process these documents automatically without any user intervention whatsoever!

This level of automation is provided by SimpleIndex’s command line interface. All the settings related to scanning or searching documents can be saved to “Job Files”, which can be saved to an icon and launched with a click of the mouse or a single line of code. These jobs can be configured to scan documents, read barcodes and OCR, generate folders and filenames and upload the index information to a database in a single step.

It is simply not possible to find an easier, faster way to process your documents!

Key Features and Recent Enhancements

All Editions offer OCR and Barcode

- SimpleIndex Standard has Tesseract OCR and DTK Barcode engines. These provide good recognition on clean originals.
- SimpleIndex Professional adds ABBYY FineReader OCR Engine, ICR handprint recognition, ISIS Scanning, and Cloud OCR. These are able to recognize hard-to-read text and bar codes, and improve overall speed and accuracy.
- Process existing text in PDF files and MS Office documents in all versions.

OCR
- Pro version includes ABBYY® FineReader Engine for faster, more accurate OCR
- Searchable PDF output
- Clipboard / Screen Shot OCR
- Point-Click OCR – Click on text to send it to an index field
- Enhanced OCR Options – match against other indexes, skip OCR on files with existing text
- Support for international character sets using Unicode

Bar Code Recognition
- Multi-engine Barcode voting to boost accuracy (Professional version)
- Support for most 2D barcodes in all versions
- User-defined Barcode delimiter – no longer restricted to “|” for parsing multi-index barcodes
- Find/Replace characters in barcodes for matching or autofill

TWAIN and ISIS Scanning
- TWAIN support in all versions. ISIS support available as an add-on or in Pro version.
- Improved real-time image processing
- Multiple scan windows when using ISIS
- Scan directly to a network folder (processing occurs during scan)

Desktop Processing
- Selectively reprocess files
- Run multiple copies of SimpleIndex simultaneously
- Save any image region to a separate file for signature capture, etc.

Server Processing
- Run multiple jobs on different schedules
- Run multiple copies of the same job for parallel processing and increased throughput
- Server licenses can be purchased in 1 Million Page per Year increments as an add-on to any workstation license
- Unlimited page barcode processing license available with Advanced Barcode Server
- Server processing compatible with Windows 7 or above and all Windows Server versions

PDF Handling
- Support for reading and writing password protected PDF files
- Convert MS Office, HTML, XML and images to PDF before processing
- Searchable PDF output
- Native PDF Viewer – no dependence on 3rd party software
- PDF Auto-repair – attempts correction of bad PDF files

Indexing
- Return MD5 hash values
- Configure default values for empty fields
- Export to XML
- Edit Fixed Fields – make changes to auto-generated indexes

SharePoint Integration
- Compatible with all versions of SharePoint and SharePoint Online
- Append to or replace existing files in SharePoint
- Automatically match index fields to SharePoint columns to populate data

Website Resources

SimpleIndex.com is full of information and interactive content to help you learn what you need to know to implement SimpleIndex in your organization.

Design Philosophy & Getting Started Guide
An introduction to the way SimpleIndex approaches document processing
Bringing Document Imaging to Everyone
How Simple Software solutions reduce the cost of entry into the Paperless Office
Sample Applications
Examples of different ways you can put SimpleIndex to use
Compare SimpleIndex to the Competition
Videos showing the same batch of documents being scanned and indexed with SimpleIndex as well as 4 top desktop document capture applications.
Top Features of SimpleIndex
Detailed information on the automated scanning and indexing features of SimpleIndex
Demonstration Videos
See how SimpleIndex automates indexing with OCR and barcode recognition
Simple Software University
Online training videos teaching all aspects of Simple Software configuration
Simple Software FAQ
Answers to common questions about Simple Software products
How Many Clicks does it Take to Scan My Documents?
Printable brochure in PDF format

SimpleIndex Applications in Your Industry

PDF brochures outlining some of the applications for SimpleIndex in various industries.

SimpleIndex Feature Highlights

Links to more information on the major features of SimpleIndex.

Streamlined Document Capture
Ways that SimpleIndex helps reduce labor by streamlining the workflow and automating common indexing tasks
TWAIN and ISIS Scanner Driver Support
Use any scanner with SimpleIndex
Zone, Full Page and Dynamic OCR
Extract index data no matter where it appears on the page
Barcode Recognition
Read barcodes from scanned images to automate indexing
Optical Mark Recognition (OMR)
Read check boxes to find True/False or Yes/No values
Index Autofill
Populate multiple search fields with existing data from your database
Electronic Imprinting
Apply bates stamps and other image stamps/endorsements electronically
Database Integration
A full range of interactive database features allow for creative integration with custom database applications
Document Presence Auditing
Make sure that all required documents are present in the batch before it is released
Document Retrieval Options
How to find and view files once you have indexed them with SimpleIndex
Command Line Processing and Custom Application Integration
The SimpleIndex command-line interface makes it the easiest document capture application to integrate with your custom business software
Enable Distributed Document Capture
Companies with many remote locations can now afford to implement Distributed Capture with SimpleIndex

The Simple Software Imaging Suite

These applications enhance and expand the functionality of SimpleIndex by providing barcode printing, automatic uploading, quality control and direct integration with popular applications like QuickBooks.

Software Catalog

Top Features

The following is a list of the Top 25 major document capture features of SimpleIndex, in no particular order.

Indexing support for all file types
Viewing support for any installed, OLE-enabled application
No monthly page processing limitations
Text processing support for OCR’d images, PDF files and MS Office documents
Barcode Recognition
Dynamic & Zone OCR
Manual Zone OCR by indexing operator
Full-Page OCR to text, MS Word or HTML
Optical Mark Recognition (OMR)
Unstructured data capture using Template and Dictionary Matching
ODBC & OLEDB database connectivity
Use any database to store index data for document management and retrieval
Automatic population of index fields using database lookup (Autofill)
Document Presence Auditing – ensures all required pages are present in each batch
Clipboard / Screen Shot OCR
Command line execution
Input from any TWAIN or ISIS scanner or network folder
Media Wizard to create royalty-free, searchable document CDs or DVDs
Output images to TIFF, JPEG, PDF or PDF/A
Page Order Validation: reads the page number from each page with OCR and compares it to the scanned page order.
Double-Index Validation: compare the value of two fields during unattended processing and automatically route documents to exceptions when the values don’t match.
Automatic forwarding of a copy of the first page: from each exported file a first page is forwarded to a separate folder for data processing.
Integrated document separation: combines pages into multi-page documents without the need for a 2-step configuration.
Output index information to database or comma-delimited text file
Blank page detection and deletion
SharePoint 2010 Integration
Auto-Rotate
Easy-to-use cropping and redaction tools to remove confidential parts of images
Electronic imprinting and bates stamping

New versions History

Simple Software is always working on updates and upgrades. Here you can find the change log for SimpleIndex.

Learn More:

KB Articles for OCR Features

1-Click Processing, Batch Scanning, Document Capture Solution, Document Numbering System, Document Scanning, Full Text Indexing, OCR, Office PDF Document Indexing, on-prem OCR, on-site OCR, Paperless Office, Personal Document Management, Scanned Document Indexing, SharePoint Migration, Sunshine Software OCR, TWAIN & ISIS Scanning, XSLT Data Conversion Software



Zone OCR and Dynamic OCR

Monday, 07 November 2022 by Simple Software

Other document scanning applications in this price range use Zone OCR to obtain index data from the page.

SimpleIndex improves upon this time-tested but limited model with its Dynamic OCR feature.

Let’s look at the difference between the two methods:

Zone OCR

Zone OCR is used to read document indexes or tags from text on the page. It is a great way to automate the data entry associated with scanning documents.

However, there are several limitations to zone OCR that must be overcome:

Index information must be in the exact same place on every page
Documents shift and skew during scanning, causing the zones to not line up
If surrounding lines or text on the document are too close, they can encroach on the zone

Dynamic OCR

SimpleIndex overcomes these limitations by using Dynamic OCR technology to extract values from anywhere on the page. Our simplified version of Dynamic OCR works great for many types of documents at a fraction of the cost of other solutions.

Index information can appear anywhere
Unwanted characters are ignored
Find unique patterns of letters and numbers using Template Matching
Use Dictionary Matching to find a value from a list of possible values
Use Cloud OCR or ChatGPT to perform AI analysis and intelligent data extraction

Download document scanning and OCR software.

Dynamic OCR and AI Assisted OCR

AI assisted OCR is the popular solution to the problem of unstructured and semi-structured documents. But there are many scenarios where simple Template and Dictionary matching provide much better results. And all of these solutions are much more expensive than SimpleIndex!

Often there are only a few key values that need to be extracted, and a wide variety of possible layouts. AI-based document training requires manual processing of several samples of each possible format before it learns how to read them reliably, whereas a Template could read them all with a single setting. Dictionary matching can perform advanced classification without analyzing thousands of samples.

When data extraction requires natural language processing, field label extraction, handwriting, AI document analysis, or other advanced features, SimpleIndex offers Cloud OCR and ChatGPT integrations.

Dynamic OCR Examples

In the video we see how SimpleIndex approaches a typical Zone OCR example. With SimpleIndex you can use large zones that give a wide margin for error. Template and Dictionary matching are then used to extract the 7-digit Account Number, 6-digit Order Number and Company Name. SimpleIndex discards the surrounding text and keeps the correct value.

Another common example is finding a unique identifier, for example a social security number, that could appear anywhere on the page. Simply enter the template ###-##-#### and SimpleIndex will search the full OCR text until it finds a match. Since only one social security number is likely to appear on the page, a match on this pattern is almost certainly the required value.
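To make the template idea concrete, here is a minimal Python sketch that turns a template such as ###-##-#### into a regular expression and searches the full OCR text. It assumes that `#` stands for a single digit and that every other character is literal; the function names are hypothetical and this is not SimpleIndex's actual matching code.

```python
import re


def template_to_regex(template: str) -> re.Pattern:
    """Translate a '#'-style template into a compiled regular expression.

    Assumption: '#' means one digit; all other characters match literally.
    """
    parts = [r"\d" if ch == "#" else re.escape(ch) for ch in template]
    # The lookarounds stop a match from starting or ending inside a longer run of digits.
    return re.compile(r"(?<!\d)" + "".join(parts) + r"(?!\d)")


def find_first(template: str, ocr_text: str):
    """Return the first value in the OCR text that fits the template, or None."""
    match = template_to_regex(template).search(ocr_text)
    return match.group(0) if match else None
```

For example, `find_first("###-##-####", "Name: J Doe SSN 123-45-6789")` picks out the placeholder number and ignores the surrounding text.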

With dictionary matching, you can give SimpleIndex a list of possible values and it will automatically search the zone or page for each possible value until it finds a match.
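A rough Python sketch of dictionary matching follows. It assumes matching is case-insensitive and tolerant of extra whitespace, which is a choice made for this example rather than documented SimpleIndex behavior; the function name is hypothetical.

```python
def dictionary_match(ocr_text: str, candidates: list[str]):
    """Return the first candidate found in the text, ignoring case and repeated whitespace."""
    normalized = " ".join(ocr_text.split()).lower()
    for value in candidates:
        if " ".join(value.split()).lower() in normalized:
            return value  # return the dictionary spelling, not the OCR spelling
    return None
```

Returning the dictionary entry rather than the raw OCR text means the index value always uses the spelling from the list.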

Many dynamic forms processing applications can be implemented using these simple algorithms. This makes SimpleIndex far more versatile than other zone OCR solutions that require the index value to be in the exact same location on every page. Yet SimpleIndex costs only a fraction of the price!

SimpleIndex’s dynamic forms processing can greatly speed up data entry by eliminating a good percentage of indexing work. For many this can put the labor cost of scanning within their reach.

Dynamic OCR can also be applied to MS Office and PDF files, creating a fully automated process for intelligently indexing and reorganizing electronic documents.

MS Office Document OCR Text Parsing Video

Amazon AWS Textract Cloud OCR

With Textract you can capture data from almost any type of form, including handwritten ones! Textract identifies labeled text anywhere on the document and returns the label text along with the corresponding value. Map the labels to index fields in SimpleIndex and you are ready to capture that data no matter where it appears on the page.
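The label-to-field mapping can be pictured with a small Python sketch. It assumes the label/value pairs have already been extracted into a plain dictionary, so it does not call Textract itself; the field names and the `map_labels_to_fields` helper are hypothetical.

```python
def map_labels_to_fields(form_pairs: dict[str, str], label_map: dict[str, str]) -> dict[str, str]:
    """Copy extracted label/value pairs into index fields.

    form_pairs: label text -> value, as a forms-extraction service might return it.
    label_map:  label text -> index field name (hypothetical configuration).
    """
    def norm(label: str) -> str:
        # Ignore case, surrounding spaces and a trailing colon when comparing labels.
        return label.strip().rstrip(":").strip().lower()

    lookup = {norm(label): field for label, field in label_map.items()}
    index = {}
    for label, value in form_pairs.items():
        field = lookup.get(norm(label))
        if field is not None:
            index[field] = value.strip()
    return index
```

Labels that are not listed in the mapping are simply skipped, so unrelated form fields do not end up in the index.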

SimpleIndex Cloud OCR with Amazon Textract

Textract uses machine learning with a huge model based on the billions of pages processed using Textract to provide the most accurate OCR and form field extraction solution available.

By default, Textract is only available as an API and requires custom coding to integrate it into your document workflows. SimpleIndex turns it into a fully-featured batch document and data processing app that is ready to use out-of-the-box.

Since there are no templates to configure or train, setup can be done in hours instead of the days, weeks or months required by other enterprise data capture solutions.

Pay-as-you-go pricing makes SimpleIndex with Textract the most affordable way to batch process forms for projects with less than 50,000 pages per year to process, especially if you need to read handwriting or have forms with many layout variations.

Got a preference for ABBYY Cloud OCR, Microsoft Azure AI Vision, or Google Cloud Vision OCR? These can be quickly added for a small customization fee. Contact Us for a quote!

Wiki: How to configure AWS Textract OCR in SimpleIndex

Handprint and Handwriting Recognition

SimpleIndex 11 adds handprint recognition capabilities to the FineReader OCR engine to allow recognition of simple form fields and printed text. It works best with constrained form fields, with letter boxes for each character like you see on tax forms and credit applications. And no additional licensing or per-page costs are required!

For unconstrained handprint and cursive handwriting, use the Cloud OCR option to achieve the best recognition accuracy available. This option requires additional AWS processing fees for each page.

Support for Regular Expressions

SimpleIndex OCR has a simple built-in template format, as well as support for Regular Expressions. Regular Expressions (RegEx for short) let you define complex search patterns to extract matching values from the text. This greatly enhances the functionality of the dynamic OCR in SimpleIndex, making it capable of finding variable-length fields with no distinct pattern.

Regular Expressions are commonly used in text parsing applications. The Perl programming language makes extensive use of RegEx, as do UNIX utilities like “grep”. Many programmers and IT personnel are already familiar with RegEx and can create complex expressions without specific training.
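As an illustration of a variable-length field, the Python sketch below captures whatever follows a "Customer" label up to the end of the line. The label and the pattern are hypothetical examples written for Python's `re` module; the exact RegEx dialect SimpleIndex accepts may differ.

```python
import re

# Hypothetical pattern: a "Customer" or "Customer Name" label, a colon, then the rest of the line.
CUSTOMER_PATTERN = re.compile(
    r"^\s*Customer(?: Name)?\s*:\s*(.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_customer(ocr_text: str):
    """Return the text after the customer label, or None when the label is absent."""
    match = CUSTOMER_PATTERN.search(ocr_text)
    return match.group(1) if match else None
```

Because the captured group has no fixed length, the same expression handles short and long names, which a fixed template of # characters could not do.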

Click here for a reference guide to Regular Expressions

How to Configure SimpleIndex OCR

Our Wiki help has extensive information on how to configure OCR for various document and data capture scenarios.

Zone OCR – read data in a specific location
Template matching to match unique patterns
Dictionary matching to match a list of possible values
OCR Options – OCR job settings that apply to all fields
File Formats that can be output by OCR
Languages supported by OCR
FineReader versus Tesseract OCR engines
Searchable PDF with MRC compression
OCR to Field for point and click OCR during verification
Cloud OCR using Textract

Watch this Simple Software University training video to see how to configure and run an OCR job with SimpleIndex.

Learn More:

Scan, file, and process document data quickly and efficiently with Simple Software's tailored OCR automation and one-click processing that fits your unique business needs

Use SimpleIndex OCR to convert scanned and digital images to searchable PDF files for automated sorting, filing, and export to applications such as Word, Excel, PowerPoint, etc.

KB Articles for Optical Character Recognition (OCR)

Automatic Data Capture, Batch Scanning, Document Classification, Document Imaging, File Indexing, Invoice OCR, OCR, Office PDF Text Processing, on-prem OCR, on-site OCR, Optical Character Recognition, RegEx, Screenshot OCR, Search, Sunshine Software OCR, Text Processing, Watermark PDF Files, Workflow Software, Zone OCR


Error in Scanning Batch 743

Thursday, 26 September 2019 by Alex Stewart

Please refer to the Wiki Documentation for the complete Batch Processing Stages reference.

If you get an error that says “Error in Scanning Batch. 743 – Unable to create or activate a new instance of ‘Simple.FileEditor.Client.ImageViewer’.” after running previous batches successfully, then the SimpleIndex Windows Layout is corrupt.

To resolve this, locate the following folder in Windows Explorer/My Computer:

C:\Users\<Windows User Name>\AppData\Local\Simple_Folder
(<Windows User Name> = Windows user currently logged into the computer that is receiving this error)

Then delete this folder and reopen SimpleIndex.



Create Unique Batch Name

Monday, 29 July 2019 by Simple Software

Please refer to the Wiki Documentation for the complete Batch Processing Stages reference.

SimpleIndex creates a Batch ID each time you run a SimpleIndex Job Configuration, which creates a new batch.

The Batch ID is the Date and Time that the batch was started.
EX. 2020-01-23@145419

In this example 2020 is the Year, 01 is the Month, 23 is the Day, 14 is the Hour, 54 is the Minute and 19 is the Second that the batch was started.

When running SimpleIndex as a Windows Service using the Server Add-on or using the Windows Task Scheduler you can set up multiple Job Configurations to run on different time frames and have them all running at once. There is a small possibility that two different Job Configurations will start in the same Second, especially when the Job Configurations are set to run on the same time frames. This can lead to errors or missing files during the process.

With the following option you can make every batch name unique in case multiple batches are created in the exact same Second, which can occur with multi-thread processing on the same Input folder.
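The collision risk is easy to see in a short Python sketch. It assumes the six digits after the @ sign in the example are hours, minutes and seconds (HHMMSS); the format string and function names are illustrative and not taken from SimpleIndex.

```python
from datetime import datetime

# Assumed layout of the example ID 2020-01-23@145419: date, "@", then HHMMSS.
BATCH_ID_FORMAT = "%Y-%m-%d@%H%M%S"


def parse_batch_id(batch_id: str) -> datetime:
    """Turn a Batch ID string back into the start time it encodes."""
    return datetime.strptime(batch_id, BATCH_ID_FORMAT)


def make_batch_id(started: datetime) -> str:
    """Build a Batch ID from a start time; anything finer than the last field is dropped."""
    return started.strftime(BATCH_ID_FORMAT)
```

Two jobs whose start times differ only by a fraction of the last field produce the same string from `make_batch_id`, which is the situation the unique batch name option is meant to prevent.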

Instructions for Unique Batch Name:

Open the Windows Registry Editor by searching for “regedit”
Find this location in the Registry Editor:
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\SimpleIndex\Misc
Right-click in the right pane and select New > String Value
Name the new value the following: GUIDBatchNames
Open the value and set its data to the following: 1
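For rolling the same setting out to several machines, the steps above could be captured in a .reg file. The Python sketch below only builds the file text using the key, name and data given in the instructions; the helper name is hypothetical, and you should review the output before importing it with the Registry Editor.

```python
REG_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\SimpleIndex\Misc"


def build_reg_file(key: str = REG_KEY, name: str = "GUIDBatchNames", value: str = "1") -> str:
    """Return the text of a .reg file that sets one string value under the given key."""
    lines = [
        "Windows Registry Editor Version 5.00",  # standard header for .reg files
        "",
        f"[{key}]",
        f'"{name}"="{value}"',
        "",
    ]
    return "\r\n".join(lines)
```

Writing the returned text to a file, for example with `open("guid_batch_names.reg", "w", encoding="utf-16", newline="")`, gives a file in the encoding the Registry Editor itself uses when exporting.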



In the middle of a batch that was scanned our computer is receiving an error, from that point to the end of the batch it’s not saved. Is this a SimpleIndex problem or windows?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Batch Processing reference.

This usually indicates that the computer has run out of memory, which causes the error. The usual solution is to scan batches with fewer pages.


Published in Indexing & UI


Can I scan with a Kofax Adrenaline based scanner interface?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Scanning reference.

Yes, it is possible to scan with Kofax Adrenaline based scanners using the Adrenaline TWAIN or ISIS data source.

To configure ISIS, you may need to select one of the general Kofax drivers such as “Any Fujitsu Scanner with Adrenaline” or “Kofax Arenaline Scanner”.

To configure the TWAIN data source:

1. Download the Kofax Adrenaline TWAIN Data Source from Kofax Adrenaline Patches

2. Configure a scan source in KSM (in your computer’s Control Panel) called TWAIN SOURCE

3. Select the Kofax Adrenaline TWAIN driver from the Select Scanner dialog in SimpleIndex

There should be little difference in performance due to using the TWAIN interface, and all the Kofax image processing controls will still be available.


Published in TWAIN & ISIS Scanning


I have a scanner with Virtual ReScan (VRS) that is not scanning properly. How do I solve this issue?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Scanning reference.
Any time that you have a scanner with Virtual ReScan (VRS) you must pick Kofax VRS as the scanner instead of the scanner model itself. With VRS you assign the scanner to VRS and then anytime you pick Kofax VRS as the scanner in your scanning software the assigned scanner will be the scanner that is used.


Published in TWAIN & ISIS Scanning


How can I change from letter to legal size documents?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Scanning reference.

This is done through the TWAIN or ISIS settings for your scanner. To access these settings, select “Display Scanner Settings” from the “Scan” menu. Next, click the “Run Job” button; before the scanning process starts, your scanner’s TWAIN or ISIS settings dialog box will appear.

Every TWAIN or ISIS dialog is different, but most of them have a clear option for changing the page size or auto-detecting it.


Published in TWAIN & ISIS Scanning


What scanners are compatible with SimpleIndex? How do I find a list?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Scanning reference.

SimpleIndex is compatible with any device that has a TWAIN or ISIS driver. This includes virtually all makes and models of scanner, as well as many specialty scanners, digital cameras and other devices.

In the few instances when a scanner has a proprietary driver, it can still be used with SimpleIndex by first scanning to a folder and setting SimpleIndex’s input to “From Folder” on the Input tab.

For many high-speed scanners (over 50 pages/minute), ISIS drivers provide improved throughput versus TWAIN. It is recommended you purchase the ISIS driver option with these scanners.

ISIS drivers also let you save your scanner settings to a file that can be distributed with your SimpleIndex configuration.

You can find more information on selecting the best scanner for your specific requirements on the ScanStore Scanners Guide, as well as a wide assortment of scanners available for purchase.


Published in TWAIN & ISIS Scanning


How can I configure SimpleIndex to perform bates stamping or page numbering for my images?

Wednesday, 28 February 2018 by dwilder

This is all done through the electronic imprinting features, which put the desired information electronically on the output images that are saved in your output folder. To configure this in SimpleIndex, go to the File menu, select Job Settings Wizard and then go to the Imprinting step.

To implement bates stamping or page numbering click the ‘Enable Imprinting’ check box and also the ‘Imprint page numbers’ check box. This is the most basic method, but there are also features which allow you to manipulate what this information is, what it looks like and where on the page it will go.

The ‘Font Size’ field allows you to choose what point font you would like the imprinted value to be.

The ‘Page # Length’ lets you determine how many digits you would like the page number or bates number to be. It will add leading zeros to the page number based on the number you enter into this field. This is used to keep images with page numbers in the proper order when saving them. EX. If you put 4 into the ‘Page # Length’ field the number will read starting at 0001 and will count up from there always keeping the page number 4 digits long.

If you would like leading characters on the front of the page number you can add these to the ‘Imprint Text or Image’ field and they will appear in front of the page number. These pages will appear as (leading characters) – (page number). If you would like them to appear directly next to each other you would remove everything from the ‘Separator’ field, because this field is what is inserted between the imprint values. EX. You want the page numbers to read PMB#####. You would put PMB in the ‘Imprint Text’ field, remove everything from the ‘Imprint Separator’ field and put 5 in the ‘Page # Length’ field.
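The way the imprint text, separator and page number length combine can be sketched in Python as follows. The function and parameter names are hypothetical and simply mirror the fields described above; the default separator used by SimpleIndex is not assumed here.

```python
def format_imprint(page_number: int, length: int, text: str = "", separator: str = "") -> str:
    """Build an imprint value: optional leading text, a separator, and a zero-padded page number."""
    padded = str(page_number).zfill(length)  # add leading zeros up to the requested length
    if not text:
        return padded
    return f"{text}{separator}{padded}"
```

With a length of 4 and no leading text, page 1 becomes 0001; with the text PMB, an empty separator and a length of 5, page 12 becomes PMB00012. Zero padding also keeps names in the proper order when they are sorted as text.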

You will then decide where you want this information to appear on the image. If you want the page number on the top of the image, do not check the box marked ‘Measure X,Y from bottom-right’; if you want the page number on the bottom of the image, check this box.

Next, set the X and Y coordinates to place the imprinted information in the section of the image where you would like it. The X coordinate measures from top to bottom (bottom to top when ‘Measure X,Y from bottom-right’ is checked) and the Y coordinate measures from left to right (right to left when ‘Measure X,Y from bottom-right’ is checked).

The unit of measurement for the X and Y coordinates is pixels. The number of pixels per page changes based on the resolution or dpi (dots per inch) that the image was scanned at. So if you are scanning at 200 dpi 1 inch = 200 pixels, at 300 dpi 1 inch = 300 pixels, and so on.

EX. You have a 300 dpi, 8.5×11″ image and want to imprint page numbers on the bottom left of the image, an inch and a half from the left and bottom of the page. You would want to have ‘Measure X,Y from bottom-right’ checked, 1950 in the ‘X coordinate’ field (6.5″ from right x 300 dpi) and 450 in the ‘Y coordinate’ field (1.5″ from bottom x 300 dpi).
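The conversion from inches to the pixel values entered in the coordinate fields is a single multiplication, shown here as a small Python helper. The function name is hypothetical; the figures in the note below are the ones used in the example above.

```python
def inches_to_pixels(inches: float, dpi: int) -> int:
    """Convert a distance on the page to pixels at the given scan resolution."""
    return round(inches * dpi)
```

At 300 dpi, `inches_to_pixels(6.5, 300)` gives 1950 and `inches_to_pixels(1.5, 300)` gives 450, while at 200 dpi one inch is 200 pixels. The same offsets must be recalculated whenever the scan resolution changes.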

How do I automatically delete blank pages from duplex documents?

Wednesday, 28 February 2018 by dwilder

This is done using the Minimum File Size drop down in the Blank Page Deletion section of the Image Enhancement step of the Job Settings Wizard. This setting uses the image file size in bytes to determine which files are blank pages.

In this box, enter the threshold in bytes that separates blank pages from non-blank pages. If the scanned or imported image file is smaller than this number of bytes it will be deleted, and if it is larger it will be kept.

For 200dpi, compressed, black & white TIFF images (the default format used by SimpleIndex) this is usually around 2500 bytes, and for 300dpi, compressed, black & white TIFF images (the default for OCR) it is usually around 7500 bytes.

Depending on your scanner settings and how much black is in your images, blank pages could be significantly larger. If you are scanning small pages the margin for error will be less. You will want to do some trial and error testing to ensure the setting is right for your images.
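One way to do this trial and error testing is to list which files in a sample folder would fall under a given threshold before enabling deletion. The Python sketch below only reports candidates and deletes nothing; the function name and the `*.tif` pattern are assumptions made for the example.

```python
from pathlib import Path


def find_blank_pages(folder: Path, min_file_size: int, pattern: str = "*.tif") -> list[Path]:
    """List image files smaller than the threshold, i.e. pages that would be treated as blank."""
    return sorted(
        path
        for path in folder.glob(pattern)
        if path.is_file() and path.stat().st_size < min_file_size
    )
```

Running it with a few candidate thresholds, such as 2500 for 200dpi scans or 7500 for 300dpi scans, and checking the listed pages by eye shows whether any real pages would be lost.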

How do I configure SimpleIndex to scan documents?

Wednesday, 28 February 2018 by dwilder

First make sure that your scanner’s TWAIN and/or ISIS driver (found on the included CD or manufacturer website) is installed and that the scanner shows up in the ‘Printers and Scanners’ section of the ‘Control Panel’ in Windows.

On the SimpleIndex Settings Wizard step of the Job Settings Wizard, set the Input Type to Scanner or Both. Scanner uses only the scanner; Both scans first and then pulls images from the Input folder.

Settings for the scanner can be found in the Scanner Settings step of the Job Settings Wizard. These allow you to select TWAIN or ISIS scanning, whether images are displayed during scanning and more.

From the Scan menu you have the option to select your scanner, display or hide the scanner’s TWAIN/ISIS settings interface or have SimpleIndex prompt the user when the feeder is empty.

The Scan to File option in the Scan menu lets you separate the scanning from processing. Use this to scan a sample image and save directly to a file folder.


OMR Optical Mark Recognition

Tuesday, 23 January 2018 by Simple Software

Simple Checkbox Recognition

Some forms require scanning software to recognize the presence or absence of a mark in a particular location, such as a checkbox, without worrying about the specific shape or symbol drawn therein. The ability to do this is called Optical Mark Recognition, or OMR. Let’s take a look at how this feature can help you index your documents and how SimpleIndex improves upon the standard OMR process:

Optical Mark Recognition

Optical Mark Recognition lets you define check box regions on scanned images. OMR is very fast and can be used for a variety of applications:

Business reply mail
Simple surveys
Separate multi-page documents
Document routing control
Verify presence of signatures

To configure OMR, use an unfilled form to obtain baseline counts of how many black “pixels” are in the box. When processing, SimpleIndex compares the amount of black in each image to the baseline value to determine if the box is checked or not.
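The baseline comparison described above can be sketched as follows. This is an illustration of the general idea under simplifying assumptions, not SimpleIndex's implementation: the image is a plain list of rows with 1 for black and 0 for white, and the `margin` tolerance is a hypothetical tuning parameter.

```python
def count_black(image, zone):
    """Count black pixels inside zone = (left, top, right, bottom).

    `image` is a list of rows where 1 is a black pixel and 0 is white.
    """
    left, top, right, bottom = zone
    return sum(sum(row[left:right]) for row in image[top:bottom])


def is_checked(image, zone, baseline, margin=0.25):
    """Treat the box as checked when its black count exceeds the baseline by `margin`."""
    return count_black(image, zone) > baseline * (1 + margin)


# An unfilled 5x5 box outline, used to obtain the baseline count.
empty_box = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
# The same box with an X drawn inside it.
marked_box = [
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
]

zone = (0, 0, 5, 5)
baseline = count_black(empty_box, zone)
print(is_checked(empty_box, zone, baseline), is_checked(marked_box, zone, baseline))
```

The tolerance exists because scans of the same blank form never produce exactly the same pixel count, so a small amount of extra black should not register as a mark.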

With OMR, it is very important that the check boxes appear in the same place on every scan, and that other text on the document does not move into your check box zone. For best results, use large boxes with plenty of white space around them.

We wouldn’t recommend using the SimpleIndex OMR feature to grade the SATs, but if your documents include a few check box values that you want to capture the SimpleIndex OMR feature is what you need. For more advanced OMR and forms processing solutions, please visit ScanStore.com.

OMR Document Separation

SimpleIndex includes a unique use for mark recognition that can save you thousands on document separator pages. One of the most labor-intensive parts of scanning multi-page files is detecting where one document ends and the next one starts. Traditionally this has been done with blank pages (which doesn’t work with 2-sided documents) or barcodes and patch codes (which must be printed). All of these solutions require someone to insert a piece of paper between each document before scanning, wasting time, money, and paper.

Using the OMR feature in SimpleIndex, create a checkmark field in the upper-left corner of the page.
Create an Autonumber field that increments a document number each time the checkmark is found.
Use this job file to scan and separate documents into multi-page files.
Create a 2nd job file to index the multi-page files.
Use the Post-Process feature to run the two jobs consecutively.

When prepping files, simply take a felt tip pen and put a small mark in the upper-left corner of the first page of each new document. This can be done very quickly, creates no additional paper and has a negligible effect on scan quality.
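The logic behind an Autonumber field that increments whenever the mark is found can be illustrated with a short sketch. The function name and the list-of-booleans input are assumptions made for the example; they do not reflect how SimpleIndex stores its results.

```python
def assign_document_numbers(mark_found):
    """Given one boolean per page (True when the corner mark is detected),
    return the document number assigned to each page."""
    numbers = []
    current = 0
    for found in mark_found:
        if found:
            current += 1  # a mark starts a new document
        numbers.append(current)
    return numbers


pages = [True, False, False, True, True, False]
print(assign_document_numbers(pages))  # [1, 1, 1, 2, 3, 3]
```

Pages that share a number belong to the same multi-page file, so a missed mark merges two documents and a stray mark splits one.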

Learn More:

KB Articles for Optical Mark Recognition

Automatic Data Capture, Batch Scanning, OCR, offline OCR, OMR, on-prem OCR, on-site OCR, One-time payment OCR, Optical Mark Recognition, Self-hosted OCR, Subscription free OCR, Sunshine OCR, Watermark PDF Files


Streamlined Interface

Tuesday, 23 January 2018 by Simple Software

Maximum Data, Minimum Clicks

As with any repetitive task, a few seconds saved scanning and filing a single document quickly adds up to dozens or hundreds of hours over the course of a long project or daily routine. The most important part of planning your document capture project is finding the most efficient way to file your documents correctly. Creating an efficient workflow will save you countless hours of labor over the life of your project.

SimpleIndex is faster and easier because it is designed to perform all of the steps necessary to scan or import documents, process, verify and export them in one continuous workflow rather than requiring the user to click extra buttons each time to initiate the next step. When taken to the extreme, SimpleIndex is capable of performing all of these tasks automatically with just a single mouse click.

SimpleIndex does this by saving all of the settings for a document capture workflow to a file that can be opened just like an Office document. This file is configured by the administrator so the user doesn’t have to see any of the technical details. Very rarely does the operator need to be able to change, for instance, the export file format and file naming scheme. So why do some applications show you a complicated export settings screen every time you try to save a batch? It is this attention to detail that allows SimpleIndex to process the same batch 35-75% faster than its competitors.

SimpleIndex also has the ability to pre-set index values and run jobs using the Command Line Interface. More on this design feature can be found on our Getting Started page.

Index Automation Features

The two main methods for automating indexing are Barcode Recognition and Optical Character Recognition (OCR).

Barcode recognition is faster and more accurate, but your documents must contain a barcode on the document or a cover page for this to work.

OCR is able to read printed data directly from the page, which means most documents can be processed as-is. However, it is not 100% accurate and usually requires some human review. Handwriting can be recognized as well, using the Cloud OCR option.

If your index data already exists in another database, SimpleIndex has features that can make use of this data to automate processing. The Index Autofill feature matches data read from barcodes or OCR to data in your database, verifying the correct value is read and populating additional search fields automatically.
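Conceptually, this kind of lookup both validates the recognized value and fills in the remaining fields. The sketch below uses a plain dictionary as a stand-in for an existing database; the table contents, field names and function name are all hypothetical.

```python
# Hypothetical stand-in for an existing database table keyed by account number.
CUSTOMERS = {
    "1234567": {"name": "Acme Corp", "region": "East"},
    "7654321": {"name": "Globex", "region": "West"},
}


def autofill(recognized_value, table):
    """Return the full set of index fields for a recognized key,
    or None when the key is not in the table (a likely misread)."""
    record = table.get(recognized_value)
    if record is None:
        return None
    return {"account": recognized_value, **record}


print(autofill("1234567", CUSTOMERS))
print(autofill("12345G7", CUSTOMERS))  # misread value, no match
```

A value that fails the lookup is a natural candidate for human review, since a correctly read key should already exist in the source data.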

Paper and Electronic Documents

Traditional document capture is focused on digitizing paper documents with a document scanner. However, more and more documents are living their best lives as native PDF and Word files, never once having to enter our physical realm.

SimpleIndex is designed to handle both scanned physical documents and electronic files in their native format seamlessly. The OCR function will use existing text from any PDF file or Office document when it is available, or automatically OCR scanned images when it isn’t.

Use the built-in SimpleView viewer to view most common file types, or use the PDF editor and word processor of your choice to provide full editing capabilities embedded right within the SimpleIndex application.

It can also simultaneously scan and import documents from a hotfolder into a single batch. So if, for example, you receive both paper and email invoices, you can process your day’s work all at once with just one click!

Using Pre-Indexed Batches

The Pre-Index Batch feature of SimpleIndex is what enables 1-click scanning and indexing, as well as command line and unattended processing.

Pre-indexing lets you set fixed values for index fields and apply them to a whole batch. These can be combined with automatic values from barcode recognition, OCR and Autofill to create fully automated batch processes that can be launched from your custom application, a desktop shortcut, scheduled server task or even linked to the scan button on your scanner.

Learn More:

KB Articles for Streamlined Interface

Automatic Data Capture, Barcode Recognition Software, Batch Scanning, Command Line Interface, Database, Document Automation, Document Classification, Document Imaging, Fast Scanning, OCR, Office PDF Text Processing, on-prem OCR, on-site OCR, RPA, Scanning Software, Solution, Sunshine Software OCR, TWAIN & ISIS Scanning, Unattended, Workflow, Workflow Software


TWAIN and ISIS Scanning

Tuesday, 23 January 2018 by Simple Software

SimpleIndex works with all TWAIN and ISIS document scanners.

Virtually all document scanners support both the TWAIN and ISIS driver standards. TWAIN is more common and is usually the only driver provided with consumer scanner models.

ISIS is the unfortunately named driver standard developed by Pixel Translations years before the acronym (which stands for “Image and Scanner Interface Specification”) had any jihadist connotations. ISIS provides a more standardized interface for high-speed scanners, and is often required to scan at the scanner’s rated speed.

TWAIN scanning is supported in all versions of SimpleIndex.

Support for ISIS is included with SimpleIndex Professional and available as an Add-on for other versions. ISIS is recommended for high-speed document scanners, or applications that require central management of scanner settings. ISIS driver settings are saved in your SimpleIndex job file so every user gets the exact scanner settings they need for each type of document.

What is a WIA Driver?

WIA stands for Windows Image Acquisition and is the driver standard promoted by Microsoft for basic scanning functions that don’t require additional software other than the Windows OS.

WIA is much better suited for flatbed scanners than for document scanning applications. While you can use the WIA driver in SimpleIndex, you will get much better results with the full TWAIN or ISIS drivers, which may require a separate download from your scanner manufacturer.

What Happens After You Scan?

The difference between SimpleIndex and other desktop scanning applications is that it fully automates the process of indexing and filing your documents once they are scanned. Use bar codes, OCR, database lookups and other features to identify keywords and metadata, organize files into folders, or populate your database and document management system.

Indexing is the key to making your scanned files organized and searchable so you can find them when you need them. Without automation, the indexing process can be very labor intensive, taking many times longer than it takes just to scan documents.

TWAIN & ISIS Scanning Video

Since scanning happens in the meatspace there’s really not a lot to show in a screen recording. But Google really likes videos so here’s a quick preview of what it looks like when you’re scanning with SimpleIndex.

Unique Book & Magazine Scanning Features

SimpleIndex streamlines the process of scanning books and magazines. Anyone who commonly scans bound material needs SimpleIndex just for this feature!

Book scanning lets you scan bound books quickly on a flatbed scanner. The user is prompted after each scan so they only need to turn the page and press “enter” to perform the next scan. 2-page scans can be automatically split vertically or horizontally to create single page images.

When scanning a typical tabloid-sized magazine that is folded down the middle, SimpleIndex’s magazine scanning function will automatically split the images in half, reorder and rotate the pages so they appear in proper reading order when finished. This can save hundreds of hours of manual image QC, or provide your clients with a level of image quality they can’t get from anyone else.
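To see why reordering is needed, consider how pages are laid out on the unfolded sheets of a folded magazine. The sketch below computes the reading order for the split halves under stated assumptions: sheets are scanned outermost first, the outside face of each sheet before its inside face, and every face is split into a left and a right half. It illustrates standard booklet page layout and is not a description of SimpleIndex's internal method.

```python
def booklet_page_numbers(sheet_count):
    """Return (left_page, right_page) for each scanned face of an unfolded booklet.

    Assumes faces are scanned outermost sheet first, outside face then inside face.
    """
    total_pages = 4 * sheet_count
    faces = []
    for k in range(sheet_count):
        faces.append((total_pages - 2 * k, 2 * k + 1))      # outside face
        faces.append((2 * k + 2, total_pages - 2 * k - 1))  # inside face
    return faces


def reading_order(sheet_count):
    """List (face_index, half) pairs sorted into page order 1..N."""
    halves = []
    for face_index, (left, right) in enumerate(booklet_page_numbers(sheet_count)):
        halves.append((left, face_index, "left"))
        halves.append((right, face_index, "right"))
    return [(face, half) for _, face, half in sorted(halves)]


print(booklet_page_numbers(2))  # [(8, 1), (2, 7), (6, 3), (4, 5)]
print(reading_order(2))
```

For a two-sheet, eight-page booklet, page 1 is the right half of the first scan and page 8 is its left half, which is why a simple left-to-right split leaves the pages out of order.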

Using Scanning in SimpleIndex

Wiki manual pages detailing the scanning functions of SimpleIndex.

Learn More:

KB Articles for TWAIN and ISIS Scanning

Batch Scanning, Distributed Scanning, Document Capture Solution, Document Scanning, Fast Scanning, Front End Scanning, Image Scanning, ISIS Driver, on-prem OCR, on-site OCR, PDF Archive Scanning Software, Remote Capture, Scanning Software, SharePoint Scanning, Sunshine Software OCR, TWAIN, TWAIN & ISIS Scanning


Barcode Scanning Guide

Saturday, 13 January 2018 by Simple Software

So you want to organize your documents using barcodes? That is an excellent idea! Not only will it improve the speed and accuracy of your document management workflow, but it is easier to set up than it sounds.

The guide is written to give you real information instead of marketing, but you can follow the links to read about the relevant features of SimpleIndex and other document management solutions on ScanStore.

Other Useful Guides

Scanning Guides? Ain’t nobody got time for that!

Read no further! Contact our experts and we’ll configure the whole scanning process for you remotely using the demo version of SimpleIndex!

Why use barcodes in document scanning?

There are many benefits to using barcodes while scanning your documents.

Traditional scanning methods require you to scan your documents in pre-separated batches and then manually name and organize the resulting files. Barcode scanning, on the other hand, allows you to scan multiple batches in a single stack and let the software automatically name and organize the files based on the embedded barcode information. This allows you to take full advantage of your automatic document feeding scanner. After all, what’s the point of paying for a high-speed scanner if your scanning speed bottlenecks at the processing stage?

Like OCR-based document scanning software, barcode scanning software uses information on the scanned images themselves to name files and place them in the appropriate folders. However, OCR can sometimes incorrectly recognize certain words and phrases, especially when you are trying to scan at lower resolutions to reduce file size and scanning time. This results in having to perform validation and correction, which slows down the process. Barcodes condense all the necessary information in a format that is much easier for computers to decipher, even at lower resolutions, with almost 100% accuracy. This allows you to reduce file size and scanning time without the cost of additional validation and correction.

What is the difference between the various types of barcodes?

Barcodes come in many standards, but they can all be grouped into two general flavors:

1D (linear) barcodes and 2D (matrix) barcodes.

Linear barcodes are composed of parallel lines of varying widths and distances from each other, such as the UPCs that are scanned from your purchases with a laser barcode reader at most stores.

Matrix barcodes are usually square (though sometimes circular) arrangements of smaller squares, circles, or triangles, such as the QR codes that you can scan with your phone from many modern advertisements. Matrix barcodes can pack more information per unit of area than their linear counterparts, but not all software is designed to read them.

There are both advantages and disadvantages to using one standard over another. In addition to the amount of information that can be stored and the capability of your particular scanning software in deciphering it, some standards have additional functions such as checksums, which automatically validate whether or not the barcode was read correctly. A few common standards used in the document scanning business are listed below:

1D Barcode standards include: Codabar
2D Barcode standards include:

Where do you get barcodes for document scanning?

Now that you know a little bit about barcodes and why you should use them, you might be wondering how to apply them to your documents. There are a few different methods, depending on your situation.

If you can still edit the document, your best bet would be to use a special barcode font, which will allow you to type a string of characters or digits directly into a barcode format on your document using a regular keyboard. You can find many barcode fonts online for both download and purchase, with some that are purely decorative and others that match to a particular industry standard. We particularly recommend BarcodesInc.com as a source of a good free barcode font.

Often, files cannot be edited due to access or permission restrictions, or simply because they are already printed out. For instance, if you are scanning a backlog of files, it becomes difficult to add barcodes to the existing pages. Have no fear! You can still use barcode scanning by either applying barcode-marked stickers, such as Avery labels, to a blank area on the first page of each document or by printing out and inserting coversheets to separate the files.

With applications such as SimpleCoversheet, you can create one-off coversheets or link it up to a data source and automate creating a separator for every customer folder in your filing cabinet.

What scanner settings are best for barcode recognition?

While barcodes are naturally more accurate than text-based document management, there are still ways to ensure a higher degree of recognition. Resolution plays a part, and although barcodes are more forgiving than OCR, 300dpi is still the recommended resolution for the highest degree of accuracy. Likewise, since most barcodes encode information in the contrast between black and white areas of the page, bitonal (black-and-white) scanning is preferred over the anti-aliasing effects of greyscale and color scanning. You can also adjust the brightness and contrast options in your scanner settings to improve recognition on documents where the barcode is not being read.

What are the guidelines to ensure barcodes will be read?

The way barcodes are printed also has an effect on how easily they will be recognized. Coversheet software usually has formatting standards built in to ensure that barcodes are printed in a way that can be recognized by scanning software.

There are both minimum and maximum sizes that barcodes can be scaled to while remaining useful – usually ranging between 80% – 200% of the suggested size. The suggested size varies by standard, but the most common width is around 1.5″. Keep in mind that there should also be at least a 0.3″ clean margin between the barcode edge and any other markings to avoid confusing the software.

The different barcode standards have varying amounts of error correction built in, allowing for more or less compact barcodes with the same level of accuracy at equivalent scan settings. There are also error detection methods, such as checksums, which act as redundancy checks by running the data through an algorithm and confirming that the result is equal to a small, easily recognized part of the barcode.
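As a concrete example of such a redundancy check, the UPC-A standard reserves its twelfth digit as a check digit calculated from the first eleven. The sketch below shows that calculation; other barcode standards use different algorithms, and the function names here are chosen only for illustration.

```python
def upc_a_check_digit(first_eleven: str) -> int:
    """Compute the UPC-A check digit from the first 11 digits."""
    digits = [int(c) for c in first_eleven]
    if len(digits) != 11:
        raise ValueError("expected exactly 11 digits")
    odd_sum = sum(digits[0::2])   # 1st, 3rd, 5th ... digits
    even_sum = sum(digits[1::2])  # 2nd, 4th, 6th ... digits
    return (10 - (3 * odd_sum + even_sum) % 10) % 10


def is_valid_upc_a(code: str) -> bool:
    """Return True when the 12th digit matches the computed check digit."""
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        return False
    return upc_a_check_digit(code[:11]) == int(code[11])


print(is_valid_upc_a("036000291452"))  # True
print(is_valid_upc_a("036000291453"))  # False: last digit misread
```

A reader that computes a different check digit than the one printed knows the scan was misread and can reject it rather than return wrong data.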

What are the limitations of reading barcodes?

As with any automated process, barcode recognition has some limitations. Different scanning applications use different barcode recognition engines, which are the prebuilt blocks of code that perform the actual recognition. As with any software, some of these engines are better and/or faster than others at the same job, and they vary in the types of barcodes that they can recognize. You must make sure that your scanning software has the capability of accurately recognizing the barcode standards you use.

Regardless of the software that you use, image quality will play a factor. While barcodes simplify the format that data is encoded in, reducing the margin for error, degraded barcode images can still cause the engine to incorrectly read the data encoded therein. Degradation can happen digitally or physically. Digital degradation can occur when an image is shrunk to a smaller resolution and then sized back up or when an image is copied too many times and thus accrues artifacts and other transcription errors. Physical degradation can occur when a printed barcode is smeared, worn, torn, or marked upon prior to scanning. Both of these alter the contrast and precise distances between parts of the barcode that determine the data that is encoded.

Introduction to OCR and Barcode Recognition Video

Creating Barcode Configurations Video

Learn More:

Bar Code Scanning, Barcode OCR, Barcode Printing, Barcode Reading Software, Barcode Recognition Software, Batch Scanning, Document Capture Solution, Document Scanning, Fast Scanning, Front End Scanning, Image Scanning, on-prem OCR, on-site OCR, PDF Archive Scanning Software, PDF Barcode Recognition, Scanned Document Indexing, Scanning Software, Sunshine Software OCR, TWAIN & ISIS Scanning


Compare Leading Solutions

Tuesday, 07 November 2017 by dwilder

The best way to see how the SimpleIndex processing workflow compares to other leading desktop scanning solutions is to see the same process performed side-by-side in each program. Below are videos we recorded of the same batch of documents being scanned and indexed in Kofax Express™, Kodak Capture Pro™, PaperVision™ Capture Desktop and Office Gemini DiamondVision™. In each one we configured the software to perform the same tasks:

Scan a batch of 10 pages
Capture a 7-digit account number using Zone OCR
Correct any fields that fail to recognize
Use a database lookup to populate additional index fields
Export the batch to PDF files

Using our standard benchmark batch* we recorded the following processing times:

SimpleIndex: 0:45
Kodak Capture Pro: 1:50
Kofax Express: 2:20
PaperVision Capture Desktop: 3:00
DiamondVision: 3:20
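To put the times listed above on a common footing, the short sketch below converts each one to seconds and computes how much less time SimpleIndex took relative to each of the other applications. The figures are taken directly from the list; the helper names are illustrative only.

```python
def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def time_saved_percent(other: str, baseline: str) -> float:
    """Percent less time the baseline took compared to the other application."""
    other_s, base_s = to_seconds(other), to_seconds(baseline)
    return round(100 * (other_s - base_s) / other_s, 1)


times = {
    "Kodak Capture Pro": "1:50",
    "Kofax Express": "2:20",
    "PaperVision Capture Desktop": "3:00",
    "DiamondVision": "3:20",
}

for name, recorded in times.items():
    print(f"{name}: {time_saved_percent(recorded, '0:45')}% less time with SimpleIndex")
```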

As you will see in the videos below, SimpleIndex provides the most efficient scanning and indexing workflow of any major document capture application.

SimpleIndex™

Kodak Capture Pro™

Kofax Express™

PaperVision™ Capture Desktop

Note: This video depicts PaperVision Capture Desktop, a now discontinued software that has since been replaced by the similarly functioning updated version of PaperFlow.

Office Gemini DiamondVision™

Testing Methods

The benchmark times were recorded using all available software shortcuts, and by performing data entry and user interactions as fast as possible. The same scanner and computer hardware was used for each test. Much care was taken to ensure that each application yielded the most accurate OCR results possible given the sample documents.

Unfortunately none of our competitors could accurately capture the account number on all 10 pages. The extra time to correct these errors accounts for 15-30% of the difference in processing times. The difference in accuracy is due in large part to SimpleIndex’s pattern matching OCR feature, which the other programs lack.
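As a rough illustration of how pattern matching can help with a field like the 7-digit account number used in this test, the sketch below applies a regular expression to OCR output so that only a run of exactly seven digits is accepted. It is a generic example, not SimpleIndex's pattern matching feature, and the sample text is invented.

```python
import re

# Exactly seven digits, not embedded in a longer run of digits.
ACCOUNT_PATTERN = re.compile(r"(?<!\d)\d{7}(?!\d)")


def find_account_number(ocr_text: str):
    """Return the first 7-digit number found in the OCR text, or None."""
    match = ACCOUNT_PATTERN.search(ocr_text)
    return match.group(0) if match else None


print(find_account_number("Invoice 2023 Acct: 4815162 Total 10.00"))  # 4815162
print(find_account_number("Phone 5551234567"))  # None
```

Restricting the result to the expected pattern means stray digits elsewhere in the zone, such as dates or totals, are ignored rather than captured as the account number.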

Keep in mind these videos were recorded using the latest version available at the time this test was performed. Results may vary with later versions.

Learn More:

Batch Scanning, Database, Database Autofill, Document Imaging, Document Scanning, Fast Scanning, Front End Scanning, Image Scanning, Keyword Indexing, offline OCR, on-prem OCR, on-site OCR, One-time payment OCR, PDF, Scanned Document Indexing, Self-hosted OCR, Subscription free OCR, Sunshine OCR, Workflow, Zone OCR
