Full Text Indexing Pages

Features

Monday, 14 November 2022 by Simple Software

SimpleIndex is the perfect solution for small businesses and departments looking to manage their files from a single interface, developers who don’t want to reinvent the wheel, and large companies with many locations looking to decentralize their scanning.

SimpleIndex organizes scanned images and electronic documents into a single document management database your employees can access on their desktops.

SimpleIndex takes the labor out of document imaging by providing powerful barcode recognition and OCR search algorithms that can find index values no matter where they are on the page. By providing these essential automations for a reasonable price, we make document management affordable for anyone.

Our most distinctive automation is the OCR template and dictionary matching search algorithms. These let you find information such as date, customer name, invoice number and other values on documents with different layouts. They can even be applied to the text in Office documents and PDF files to organize those files automatically and attach them to a database.

The SimpleCoversheet barcode printing application lets anyone in your company print bar-coded coversheets with all the information needed to identify a document. This is perfect for scanning with a centralized scanner or networked digital copier. SimpleIndex can then be configured to process these documents automatically without any user intervention whatsoever!

This level of automation is provided by SimpleIndex’s command line interface. All the settings related to scanning or searching documents can be saved to “Job Files”, which can be assigned to an icon and launched with a click of the mouse or a single line of code. These jobs can be configured to scan documents, read barcodes and OCR text, generate folders and filenames and upload the index information to a database in a single step.

It is simply not possible to find an easier, faster way to process your documents!

Key Features and Recent Enhancements

All Editions offer OCR and Barcode

- SimpleIndex Standard has Tesseract OCR and DTK Barcode engines. These provide good recognition on clean originals.
- SimpleIndex Professional adds ABBYY FineReader OCR Engine, ICR handprint recognition, ISIS Scanning, and Cloud OCR. These are able to recognize hard-to-read text and bar codes, and improve overall speed and accuracy.
- Process existing text in PDF files and MS Office documents in all versions.

OCR
- Pro version includes ABBYY® FineReader Engine for faster, more accurate OCR
- Searchable PDF output
- Clipboard / Screen Shot OCR
- Point-Click OCR – Click on text to send it to an index field
- Enhanced OCR Options – match against other indexes, skip OCR on files with existing text
- Support for international character sets using Unicode

Bar Code Recognition
- Multi-engine Barcode voting to boost accuracy (Professional version)
- Support for most 2D barcodes in all versions
- User-defined Barcode delimiter – no longer restricted to “|” for parsing multi-index barcodes
- Find/Replace characters in barcodes for matching or autofill
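
As a rough illustration of what the delimiter and find/replace options accomplish, the sketch below splits one barcode value into several index values. The function name, the sample barcode and the replacement map are hypothetical; SimpleIndex performs this step internally through its job settings, and its exact rules are not documented here.

```python
def parse_multi_index_barcode(value, delimiter="|", replacements=None):
    """Split one barcode value into index values (illustrative only)."""
    # Apply find/replace pairs first, e.g. to normalise characters before matching.
    for find, replace in (replacements or {}).items():
        value = value.replace(find, replace)
    # Split on the user-defined delimiter and trim stray whitespace.
    return [part.strip() for part in value.split(delimiter)]


# Hypothetical barcode that uses ";" instead of the default "|".
fields = parse_multi_index_barcode("INV-1001;ACME Corp;2022-11-14", delimiter=";")
print(fields)
```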

TWAIN and ISIS Scanning
- TWAIN support in all versions. ISIS support available as an add-on or in Pro version.
- Improved real-time image processing
- Multiple scan windows when using ISIS
- Scan directly to a network folder (processing occurs during scan)

Desktop Processing
- Selectively reprocess files
- Run multiple copies of SimpleIndex simultaneously
- Save any image region to a separate file for signature capture, etc.

Server Processing
- Run multiple jobs on different schedules
- Run multiple copies of the same job for parallel processing and increased throughput
- Server licenses can be purchased in 1 Million Page per Year increments as an add-on to any workstation license
- Unlimited page barcode processing license available with Advanced Barcode Server
- Server processing compatible with Windows 7 or above and all Windows Server versions

PDF Handling
- Support for reading and writing password protected PDF files
- Convert MS Office, HTML, XML and images to PDF before processing
- Searchable PDF output
- Native PDF Viewer – no dependence on 3rd party software
- PDF Auto-repair – attempts correction of bad PDF files

Indexing
- Return MD5 hash values
- Configure default values for empty fields
- Export to XML
- Edit Fixed Fields – make changes to auto-generated indexes

SharePoint Integration
- Compatible with all versions of SharePoint and SharePoint Online
- Append to or replace existing files in SharePoint
- Automatically match index fields to SharePoint columns to populate data

Website Resources

SimpleIndex.com is full of information and interactive content to help you learn what you need to know to implement SimpleIndex in your organization.

Design Philosophy & Getting Started Guide
An introduction to the way SimpleIndex approaches document processing
Bringing Document Imaging to Everyone
How Simple Software solutions reduce the cost of entry into the Paperless Office
Sample Applications
Examples of different ways you can put SimpleIndex to use
Compare SimpleIndex to the Competition
Videos showing the same batch of documents being scanned and indexed with SimpleIndex as well as 4 top desktop document capture applications.
Top Features of SimpleIndex
Detailed information on the automated scanning and indexing features of SimpleIndex
Demonstration Videos
See how SimpleIndex automates indexing with OCR and barcode recognition
Simple Software University
Online training videos teaching all aspects of Simple Software configuration
Simple Software FAQ
Answers to common questions about Simple Software products
How Many Clicks does it Take to Scan My Documents?
Printable brochure in PDF format

SimpleIndex Applications in Your Industry

PDF brochures outlining some of the applications for SimpleIndex in various industries.

SimpleIndex Feature Highlights

Links to more information on the major features of SimpleIndex.

Streamlined Document Capture
Ways that SimpleIndex helps reduce labor by streamlining the workflow and automating common indexing tasks
TWAIN and ISIS Scanner Driver Support
Use any scanner with SimpleIndex
Zone, Full Page and Dynamic OCR
Extract index data no matter where it appears on the page
Barcode Recognition
Read barcodes from scanned images to automate indexing
Optical Mark Recognition (OMR)
Read check boxes to find True/False or Yes/No values
Index Autofill
Populate multiple search fields with existing data from your database
Electronic Imprinting
Apply bates stamps and other image stamps/endorsements electronically
Database Integration
A full range of interactive database features allow for creative integration with custom database applications
Document Presence Auditing
Make sure that all required documents are present in the batch before it is released
Document Retrieval Options
How to find and view files once you have indexed them with SimpleIndex
Command Line Processing and Custom Application Integration
The SimpleIndex command-line interface makes it the easiest document capture application to integrate with your custom business software
Enable Distributed Document Capture
Companies with many remote locations can now afford to implement Distributed Capture with SimpleIndex

The Simple Software Imaging Suite

These applications enhance and expand the functionality of SimpleIndex by providing barcode printing, automatic uploading, quality control and direct integration with popular applications like QuickBooks.

Software Catalog

Top Features

The following is a list of the Top 25 major document capture features of SimpleIndex, in no particular order.

Indexing support for all file types
Viewing support for any installed, OLE-enabled application
No monthly page processing limitations
Text processing support for OCR’d images, PDF files and MS Office documents
Barcode Recognition
Dynamic & Zone OCR
Manual Zone OCR by indexing operator
Full-Page OCR to text, MS Word or HTML
Optical Mark Recognition (OMR)
Unstructured data capture using Template and Dictionary Matching
ODBC & OLEDB database connectivity
Use any database to store index data for document management and retrieval
Automatic population of index fields using database lookup (Autofill)
Document Presence Auditing – ensures all required pages are present in each batch
Clipboard / Screen Shot OCR
Command line execution
Input from any TWAIN or ISIS scanner or network folder
Media Wizard to create royalty-free, searchable document CDs or DVDs
Output images to TIFF, JPEG, PDF or PDF/A
Page Order Validation: reads the page number from each page with OCR and compares it to the scanned page order.
Double-Index Validation: compares the value of two fields during unattended processing and automatically routes documents to exceptions when the values don’t match.
Automatic forwarding of a copy of the first page: from each exported file a first page is forwarded to a separate folder for data processing.
Integrated document separation: combines pages into multi-page documents without the need for a 2-step configuration.
Output index information to database or comma-delimited text file
Blank page detection and deletion
SharePoint 2010 Integration
Auto-Rotate
Easy-to-use cropping and redaction tools to remove confidential parts of images
Electronic imprinting and bates stamping
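
The two validation features in the list above can be pictured with a small sketch. It assumes page numbering starts at 1 and that a blank value counts as a mismatch; the function names, field names and the "export"/"exceptions" labels are hypothetical and only illustrate the idea, not how SimpleIndex implements it.

```python
def route_document(fields, first, second):
    """Double-index check: 'export' when both fields agree, else 'exceptions'."""
    a = fields.get(first, "").strip()
    b = fields.get(second, "").strip()
    # Assumption: an empty value is treated as a failed match.
    return "export" if a and a == b else "exceptions"


def pages_in_order(ocr_page_numbers):
    """Page order check: numbers read by OCR must match scan order 1, 2, 3, ..."""
    numbers = list(ocr_page_numbers)
    return numbers == list(range(1, len(numbers) + 1))


print(route_document({"Barcode ID": "123", "OCR ID": "123"}, "Barcode ID", "OCR ID"))
print(pages_in_order([1, 3, 2]))
```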

New versions History

Simple Software is always working on updates and upgrades. Here you can find the change log for SimpleIndex.

Learn More:

KB Articles for OCR Features

1-Click Processing, Batch Scanning, Document Capture Solution, Document Numbering System, Document Scanning, Full Text Indexing, OCR, Office PDF Document Indexing, on-prem OCR, on-site OCR, Paperless Office, Personal Document Management, Scanned Document Indexing, SharePoint Migration, Sunshine Software OCR, TWAIN & ISIS Scanning, XSLT Data Conversion Software



Exclude Index Field from Index Log

Tuesday, 29 December 2020 by Alex Stewart

Please refer to the Wiki Documentation for the complete Index & Batch Logging reference.

When you output an Index Log (CSV, XML, TXT, etc.), some index fields are required in the Job Configuration but are not wanted in the log. To exclude such a field from the Index Log, add a “~” character to the end of its Index Field Name.

To do this, open the Job Options/Job Settings Wizard, go to the Index tab/step, find the index field that you want to exclude from the Index Log and add ~ to the end of the field name.

Example: the original Index Name is “OCR Text” and the field should not appear in the Index Log, so rename it to “OCR Text~”.
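
To make the effect of the trailing “~” concrete, here is a minimal sketch of a log writer that skips such fields. It is not SimpleIndex code: the function name, the sample fields and the CSV layout are assumptions chosen for illustration.

```python
import csv
import io


def write_index_log(records, field_names):
    """Write records as CSV, skipping fields whose name ends with '~'."""
    logged = [name for name in field_names if not name.endswith("~")]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=logged, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical job with three index fields; "OCR Text~" is kept out of the log.
fields = ["Invoice Number", "Customer", "OCR Text~"]
rows = [{"Invoice Number": "1001", "Customer": "ACME", "OCR Text~": "full page text"}]
print(write_index_log(rows, fields))
```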

Automatic Indexing Software File Indexing Full Text Indexing Office PDF Document Indexing Scanned Document Indexing


Change the Font Size of Index Fields

Thursday, 03 December 2020 by Alex Stewart

Please refer to the Wiki Documentation for the complete Index Fields reference.

Index fields have a default 8 point font size, but this can be changed if required for visibility. To change it, edit a registry value in the Registry Editor and set the font size you would like.

To make this change, follow these instructions:

Close out of SimpleIndex entirely.
Open the Windows Registry Editor by going to Windows Search and searching for “RegEdit”.
Go to this location in the registry folder tree: Computer\HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\SimpleIndex\Misc
In this window you will see a value called “IndexFontSize”.
Open this value, change it to the font size you would like and then click OK.
Close and reopen SimpleIndex to see the change in effect.

Automatic Indexing Software File Indexing Full Text Indexing Office PDF Document Indexing


Index With Non-Latin Character Sets

Monday, 29 July 2019 by Simple Software

Please refer to the Wiki Documentation for the complete Languages reference.

By default, SimpleIndex uses the ANSI character set to display and edit captured OCR data, index field values and full-text OCR. This works for all languages based on the Latin alphabet (English, French, Spanish, German, etc.).

To index documents in other languages like Chinese, Japanese, Russian, Arabic and other non-Latin alphabets, set the default character set using this registry key. If the key is not set correctly then Unicode text will show up as ??????????.

Use Notepad to edit the “Charset” value from the sample setting below and save it to a .reg file. Then double-click the .reg file to install (Administrator privileges required).

You can download the .reg file here but you still need to edit in Notepad to set the Charset value before installing.

If you are on a 32-bit operating system be sure to remove the extra “\WOW6432Node” from the registry path.

[HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\SimpleIndex\Misc]
"Charset"="1"

Charset Name	Charset Value
ANSI_CHARSET (Latin)	0
DEFAULT_CHARSET	1
SYMBOL_CHARSET	2
SHIFTJIS_CHARSET (Japanese)	128
HANGUL_CHARSET (Korean)	129
GB2312_CHARSET (Simplified Chinese)	134
CHINESEBIG5_CHARSET (Traditional Chinese)	136
GREEK_CHARSET (Greek)	161
TURKISH_CHARSET (Turkish)	162
HEBREW_CHARSET (Hebrew)	177
ARABIC_CHARSET (Arabic)	178
BALTIC_CHARSET (Baltic)	186
RUSSIAN_CHARSET (Russian)	204
THAI_CHARSET (Thai)	222
EE_CHARSET	238
OEM_CHARSET	255

The full list of values is at https://msdn.microsoft.com/en-us/library/cc194829.aspx.
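
If you prefer to generate the .reg file rather than edit it by hand, the following sketch builds its text from the table above. The function name and the `is_64_bit` switch are hypothetical helpers; the first line of the output is the standard header used by Windows .reg files, which the sample setting above does not show. Treat it as a starting point and review the result before installing it.

```python
CHARSETS = {
    "ANSI_CHARSET": 0, "DEFAULT_CHARSET": 1, "SYMBOL_CHARSET": 2,
    "SHIFTJIS_CHARSET": 128, "HANGUL_CHARSET": 129, "GB2312_CHARSET": 134,
    "CHINESEBIG5_CHARSET": 136, "GREEK_CHARSET": 161, "TURKISH_CHARSET": 162,
    "HEBREW_CHARSET": 177, "ARABIC_CHARSET": 178, "BALTIC_CHARSET": 186,
    "RUSSIAN_CHARSET": 204, "THAI_CHARSET": 222, "EE_CHARSET": 238,
    "OEM_CHARSET": 255,
}


def build_charset_reg(charset_name, is_64_bit=True):
    """Return .reg file text that sets the SimpleIndex Charset value."""
    value = CHARSETS[charset_name]
    # On a 32-bit operating system the extra WOW6432Node level is left out.
    node = "WOW6432Node\\" if is_64_bit else ""
    key = f"HKEY_LOCAL_MACHINE\\SOFTWARE\\{node}SimpleIndex\\Misc"
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{key}]\n"
        f'"Charset"="{value}"\n'
    )


# Example: text for a Russian setup on a 64-bit system.
print(build_charset_reg("RUSSIAN_CHARSET"))
```

Save the returned text to a file with a .reg extension and double-click it as described above (Administrator privileges required).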

Automatic Data Capture Automatic Indexing Software File Indexing Full Text Indexing Keyword Indexing Metadata Microsoft Word Data Extraction Office PDF Document Indexing PDF Data Extraction Software Scanned Document Indexing


I’m using full page OCR. The information is all appearing in the txt file but it is losing format about half way through. Data to the right is ending up at the end of the txt doc. Can this be fixed?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Full-Page OCR reference.

SimpleIndex version 7 solves this problem with the incorporation of the FineReader OCR engine. Full text in PDFs will now flow with the formatting of the PDF.

Legacy Versions: SimpleIndex can also be used with other OCR applications and servers to improve accuracy, formatting and performance. Use the OCR applications to convert the scanned images to text or searchable PDF, and SimpleIndex can extract index values from the text and automatically sort and organize the files.

Full Text Indexing OCR OCR Form Processing OCR Scanning Office PDF Text Processing PDF Data Extraction Software Text Processing Unattended Processing Zone OCR

Published in OCR


Can the values of 2 or more fields be combined in a single field?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Field types reference.

Set the Type for the field that you want to store the combined value to “Fixed”.

In the template setting for that field, you can enter the keyword %FIELD#% (where # is the field number) and the keyword will be replaced with the value of the designated field when it is saved.

For example, to combine your first 2 fields into the third, inserting a comma between them, set the template for field 3 to:
%FIELD1%,%FIELD2%
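
The substitution can be pictured with a short sketch. This is not SimpleIndex code: the function name is hypothetical, and leaving an unknown field number untouched is an assumption made for the example.

```python
import re


def expand_template(template, field_values):
    """Replace %FIELD#% keywords with field values; field numbers start at 1."""
    def substitute(match):
        index = int(match.group(1))
        if 1 <= index <= len(field_values):
            return field_values[index - 1]
        return match.group(0)  # assumption: unknown fields are left as-is

    return re.sub(r"%FIELD(\d+)%", substitute, template)


# Field 1 = "Smith", Field 2 = "John"; field 3 receives the combined value.
print(expand_template("%FIELD1%,%FIELD2%", ["Smith", "John"]))
```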

Automatic Indexing Software File Indexing Full Text Indexing Keyword Indexing Metadata Microsoft Word Data Extraction Office PDF Document Indexing PDF Data Extraction Software Scanned Document Indexing

Published in Indexing & UI


How do you configure full text searching in Retrieval mode?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Database Settings reference.

On the Database tab there is a dropdown in the lower portion of the panel for Full Text OCR Field. Put the name of the field that will store the full-text data there. This must be configured for both the Insert and Retrieval mode configurations. The database field needs to be long enough to store the entire text of your document.

Of course, the Insert Mode configuration must have “Enable Full Page OCR” checked to generate full text data from images. Text from MS Office documents, PDF files and existing OCR text files can be used without setting this option.

When designing your Retrieval Mode configuration, create a Text field to use for full text search queries. On the Database tab, set the corresponding “Database Field Name” to the full text database field.

When searching on your full text field, SimpleIndex finds the text you enter no matter where it appears in the document. It is able to match partial words. It does not perform boolean or natural language searches. The text entered must match the document text exactly.
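
In other words, the search behaves like a plain substring match. The sketch below shows that behavior in isolation; the function name and sample text are hypothetical, and the comparison here is case-sensitive, whereas the case handling of a real search depends on your database and is not stated in this article.

```python
def full_text_matches(query, document_text):
    """True when the query appears verbatim anywhere in the text."""
    return query in document_text


text = "Invoice 1001 for ACME Corporation, due 30 days after receipt."

# A partial word is found because it is a substring of "Corporation".
print(full_text_matches("Corp", text))

# Boolean operators are not interpreted; the whole phrase must appear as typed.
print(full_text_matches("ACME AND Invoice", text))
```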

Published in Database & Retrieval, OCR


Can OCR text be saved to Office, Text, HTML or other formats?

Wednesday, 28 February 2018 by dwilder

Yes. On the OCR step of the Job Settings Wizard you can select the text output format you need in the “Full-page OCR file type” drop down. By default it is set to PDF, but it can be changed to Text (txt), Word (docx), Rich Text (rtf), Open Office (odt), Excel (xlsx), PowerPoint (pptx), ePub Zip (epub), FictionBook (fb2), HTML (htm), XML (xml) or Alto XML (alto.xml).

If the output file type is set to PDF, OCR text will be embedded as hidden text in the PDF file.

Can SimpleIndex create searchable PDF Image+Text files with hidden text?

Wednesday, 28 February 2018 by dwilder

Yes, it can. You can configure this setting in the Job Settings Wizard by going to the OCR step and checking “Enable full-page OCR”. There are many settings in the OCR step that you can use to customize the output and recognition of images.

SimpleIndex has two different OCR engines (Standard and Professional) that can be used to produce PDF Image + Text files or Searchable PDFs.

How do I configure the output folder and file naming scheme?

Wednesday, 28 February 2018 by dwilder

Use the Folder and Filename check boxes on the Indexing & File Naming step in the Job Settings Wizard to indicate whether field values will be used to generate subfolders or filenames. Any field with the Folder option checked will create nested subfolders for each value in the order the fields are listed. Any field with the Filename option checked will have its value concatenated to form the filename.

For example, if Field 1 and Field 3 have the Folder option checked, and Field 2 and Field 3 have the Filename option checked, image filenames will be created in the format:

%OUTPUTFOLDER%\Field 1\Field 3\Field 2 – Field 3.tif

The Filename Separator option on the Advanced tab lets you change the ” – ” between the fields in the filename to anything you want.
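
The naming rule can be sketched as follows. The function, the sample values and the `C:\Scans` output folder are hypothetical, and the separator is written with a plain hyphen here; the sketch only mirrors the rule described above and is not how SimpleIndex is implemented.

```python
from pathlib import PureWindowsPath


def build_output_path(output_folder, fields, separator=" - ", extension=".tif"):
    """fields: list of (value, use_as_folder, use_in_filename) in job order."""
    # Fields with Folder checked become nested subfolders, in listed order.
    folders = [value for value, as_folder, _ in fields if as_folder]
    # Fields with Filename checked are joined with the separator.
    name_parts = [value for value, _, in_name in fields if in_name]
    filename = separator.join(name_parts) + extension
    return PureWindowsPath(output_folder, *folders, filename)


# Field 1: Folder only, Field 2: Filename only, Field 3: both.
fields = [("ACME", True, False), ("Invoice", False, True), ("2018", True, True)]
print(build_output_path("C:\\Scans", fields))
```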

Published in Export


Automatic Indexing Using Existing Data

Wednesday, 24 January 2018 by Simple Software


The Autofill feature of SimpleIndex is an easy way to associate many index fields with one document without retyping data that already exists in another database. Autofill uses a database lookup to retrieve records that match a key value entered by the user. Blank index fields are then filled in automatically with the data from this lookup. The result is a document database with many different possible search fields, of which only one needed to be entered during scanning.

The key field may be typed by the user, or it may be read from the document automatically using barcode recognition or OCR. The lookup is performed either when the user changes this field or when the index values are saved. If the lookup finds multiple matching records, the user will be notified and the first set of values will be used by default.

When used with pre-index batches, key information can be read automatically from barcodes or OCR and matched to database records with a single click. Search on up to 99 index fields without a single keystroke!
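
The lookup logic described above can be sketched with an in-memory SQLite database. The table, column names and function are hypothetical, and SimpleIndex itself connects to your database through ODBC or OLEDB rather than this code; the sketch only shows the idea of filling blank fields from the first matching record.

```python
import sqlite3


def autofill(conn, table, key_column, index_values):
    """Fill blank index fields from the first row matching the key value."""
    conn.row_factory = sqlite3.Row
    # Table and column names come from trusted job configuration, not user input.
    rows = conn.execute(
        f'SELECT * FROM "{table}" WHERE "{key_column}" = ?',
        (index_values[key_column],),
    ).fetchall()
    if not rows:
        return index_values, 0
    filled = dict(index_values)
    for column in rows[0].keys():
        if not filled.get(column):  # only blank fields are filled in
            filled[column] = rows[0][column]
    # The match count lets the caller warn the user about multiple matches.
    return filled, len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id TEXT, name TEXT, city TEXT)")
conn.execute("INSERT INTO customers VALUES ('C100', 'ACME Corp', 'Knoxville')")
values, matches = autofill(
    conn, "customers", "customer_id", {"customer_id": "C100", "name": "", "city": ""}
)
print(values, matches)
```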

Learn More:

Scan, file, and process document data quickly and efficiently with Simple Software's tailored OCR automation and one-click processing that fits your unique business needs

Use SimpleIndex OCR to convert scanned and digital images to searchable PDF files for automated sorting, filing, and export to applications such as Word, Excel, PowerPoint, etc.

KB Articles for Automatic Indexing

1-Click Processing, Automatic Data Capture, Automatic Indexing Software, Barcode Recognition Software, Database, Database Autofill, Document Automation, File Indexing, Full Text Indexing, Keyword Indexing, Metadata, Microsoft Word Data Extraction, OCR, Office PDF Document Indexing, offline OCR, on-prem OCR, on-site OCR, One-time payment OCR, PDF Data Extraction Software, Scanned Document Indexing, Scanning Software, Self-hosted OCR, Subscription free OCR, Sunshine OCR


Organize Office Documents with Text Parsing

Tuesday, 23 January 2018 by Simple Software

This video shows the Sort My Documents sample job included with the SimpleIndex trial download. It shows how you can organize office documents automatically by parsing the file’s text for relevant metadata and keywords. You can then use those keywords to tag documents with metadata and create standardized folders and filenames.

First we sort Word documents, Excel spreadsheets and PowerPoint presentations automatically using the SimpleIndex template and dictionary matching algorithms that match patterns and keywords in the parsed text.

Then the files are organized into folders and filenames using the Sales Rep, Customer, Document Type and Date values extracted from the text.
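
As a loose illustration of the idea, a template can be thought of as a pattern and a dictionary as a list of known keywords. The sketch below uses a regular expression and a keyword list to pull a Document Type and a Date from parsed text; the pattern, the keyword list and the function name are assumptions for this example and do not reflect SimpleIndex's actual template syntax.

```python
import re

# Hypothetical dictionary of known document types.
DOCUMENT_TYPES = ["Invoice", "Purchase Order", "Quote"]
# Hypothetical template: dates written as M/D/YYYY.
DATE_TEMPLATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def extract_metadata(text):
    """Find the first dictionary keyword and the first date pattern in the text."""
    lowered = text.lower()
    doc_type = next((t for t in DOCUMENT_TYPES if t.lower() in lowered), None)
    date = DATE_TEMPLATE.search(text)
    return {"Document Type": doc_type, "Date": date.group(0) if date else None}


sample = "Quote prepared for ACME Corp on 1/23/2018 by J. Smith"
print(extract_metadata(sample))
```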

Organize Office Documents for Cloud Storage

You can also upload organized files to SharePoint or Cloud Storage platforms without the chaos and disorganization you inevitably get when users create their own folders and filenames.

Organize Office Documents for Document Management

In the video, we use SimpleSearch to search and view the sorted files. But you can just as easily use any third party document management system or custom database to perform keyword or full-text searching.

You can use the SimpleView embedded viewer to view Office documents, PDF files and images in a common interface. In the video we use the full version of Word, Excel, and PowerPoint to edit Office documents right from the search screen.

Find Out More

Learn More:

FAQ Related to Organizing Office Documents

Document Classification, Full Text Indexing, MS Office, Office PDF Document Indexing, Office PDF Text Processing, Office to PDF, Paperless Office, Search, SharePoint Migration, SharePoint Scanning, Text Processing



Searching and Viewing Documents

Wednesday, 04 October 2017 by dwilder

If you have not yet decided on a plan for how to organize your electronic documents for later retrieval, you should take some time to consider the possible options.

There are several ways to search and view documents processed with SimpleIndex®:

Use SimpleSearch to run keyword searches to find and view indexed documents
Use SimpleView to browse folders, search files, view, edit and annotate scanned documents without a database
Use a Document Management System for integrated searching, viewing, workflow, security, compliance auditing and other records management functions
Use Cloud Storage platforms like Google Drive, Box and OneDrive
Use SharePoint to share documents online with custom metadata, create custom document workflows and employ records management standards
Link files to a custom application using the Command Line Interface or RPA bot
Integrate with your Database to associate documents with records via link or binary field
Work with our Professional Services Team or an Authorized Dealer to create a customized solution or direct integration with virtually any application

Choosing a Document Management Solution

Given the variety of free or very low cost file storage solutions available, why would you invest thousands of dollars in a document management system?

When high security or access tracking logs are required
When you must comply with regulations like HIPAA, Sarbanes-Oxley, FINRA, FOIA, SEC, etc.
When there are document-based workflows that can benefit from automation
When users need to view files (like DWG, VSD or PSD) without installing the licensed software for those formats

If you already have a database or business app that you use to search for records, and that application has the ability to store or link external documents to those records, this is usually the best choice.

If your business application doesn’t have document management capabilities, there are integrations that can overlay a button or hotkey that lets users open associated files from any screen.

If your business has many different types of documents spread across multiple departments that use different applications, and they sometimes need to be able to access each others’ documents, then a central repository is the better solution.

Cloud Storage platforms like Google Drive, Box and OneDrive provide low-cost, secure online storage that makes it very easy to share documents worldwide on any device. However they don’t have the ability to do field-level indexing to allow for more granular searches, lack the compliance tracking features of a more robust DMS, and don’t have integrated viewers that can display some of the less common file formats without having the application installed.

Find Out More

Read more about Affordable Document Management solutions with SimpleIndex.

Check out our How to Scan Documents for a detailed guide to creating scanning and retrieval systems.

Download SimpleIndex Now

Our Professional Services department can help you with every step of your project, and can often have you up and running in just a couple of hours.

Please Contact Us to schedule a demo or ask us any questions you have!

Learn More:

KB Articles for Document Management

Contentverse, Document Imaging, Document Management Software, Document Retrieval, Full Text Indexing, on-prem OCR, on-site OCR, Paperless Office, PaperVision, PDF Archive Scanning Software, Personal Document Management, QuickBooks Document Management, Search, Sunshine Software OCR, Unattended Processing


