SimpleIndex


Processing and text extraction of Microsoft Office, Adobe PDF files, HTML files and other electronic documents.

Change the Dictionary Separator Value

Monday, 29 July 2019 by Simple Software

This setting changes the dictionary separator used for thesaurus matching from the default pipe character (|) to any character(s) that you want. This is useful when the values in your list or dictionary might themselves include the pipe character (“|”, typed with Shift+Backslash).

This setting is also used as the delimiter when parsing multiple index field values from bar codes (e.g. field1|field2|field3).
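As a rough illustration of what delimiter-based parsing means here, the following Python sketch splits one barcode value into separate field values. It is not SimpleIndex code; the function name `split_barcode_fields` is hypothetical, and treating %TAB% as a tab character is an assumption based on the instructions in this article.

```python
def split_barcode_fields(barcode_value: str, separator: str = "|") -> list[str]:
    """Split one barcode value into index field values on the separator."""
    # Assumption: %TAB% is a placeholder that stands for a tab character.
    if separator == "%TAB%":
        separator = "\t"
    return barcode_value.split(separator)


# Default separator: three field values from one barcode.
print(split_barcode_fields("field1|field2|field3"))

# A custom separator lets the values themselves contain the pipe character.
print(split_barcode_fields("A|1;B|2;C|3", separator=";"))
```

With the default separator, a value that contains a pipe would be cut into extra fields, which is the case the custom separator avoids.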

Instructions for changing the dictionary separator value:

  1. Right-click the Job Configuration file that you would like to change and select Open With > Notepad.
  2. Search the XML settings text open in Notepad for this term:
    <OCR_DICT_SEPARATOR>
  3. Change the value between the tags from “|” to any other single character that you want.
  4. For TAB separation, use %TAB%.
  5. Save and close the file.
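If you need to make this change in many configuration files, the manual edit can be scripted. The Python sketch below is one possible approach, not an official tool: it assumes the setting is stored as a <OCR_DICT_SEPARATOR>…</OCR_DICT_SEPARATOR> element, and the function name and the <SETTINGS> wrapper in the sample text are hypothetical.

```python
import re

# Assumption: the setting appears once, as an element with a matching closing tag.
SEPARATOR_TAG = re.compile(
    r"(<OCR_DICT_SEPARATOR>).*?(</OCR_DICT_SEPARATOR>)", re.DOTALL
)


def set_dict_separator(config_text: str, new_separator: str) -> str:
    """Return config_text with the OCR_DICT_SEPARATOR value replaced."""
    if not SEPARATOR_TAG.search(config_text):
        raise ValueError("OCR_DICT_SEPARATOR tag not found")
    # A function replacement keeps backslashes in new_separator literal.
    return SEPARATOR_TAG.sub(
        lambda m: m.group(1) + new_separator + m.group(2), config_text, count=1
    )


# Hypothetical sample text; a real Job Configuration file has many more settings.
sample = "<SETTINGS>\n  <OCR_DICT_SEPARATOR>|</OCR_DICT_SEPARATOR>\n</SETTINGS>"
print(set_dict_separator(sample, ";"))
```

Work on a copy of the configuration file. Characters that are special in XML, such as & or <, would need escaping and are not handled in this sketch.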

Regular Expression (RegEx) – Syntax or Type

Monday, 29 July 2019 by Simple Software

Please refer to the Regular Expressions Wiki Page in the Wiki Documentation for the complete Regular Expressions reference.

SimpleIndex uses the .NET regular expressions library.

.NET regular expression syntax is largely compatible with Perl 5 and is similar, but not identical, to the JavaScript/ECMAScript regular expression syntax.


Check and Repair All PDF Files

Monday, 29 July 2019 by Simple Software

Please refer to the Wiki Documentation for the complete PDF reference.

You can configure SimpleIndex to check and repair every PDF file that it processes.

Go to this location in the Windows Registry:

Computer\HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\SimpleIndex\Misc

Create a new String Value called “FixAllPDF” and set its value to 1.
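To apply the same setting on several workstations, one option is a .reg file that can be imported with the Registry Editor. The Python sketch below only generates the text of such a file from the key and value named above. The function name is hypothetical, and you should review the output before importing it.

```python
def build_fix_all_pdf_reg() -> str:
    """Build .reg file text that sets the FixAllPDF string value to 1."""
    key = r"HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\SimpleIndex\Misc"
    lines = [
        "Windows Registry Editor Version 5.00",  # standard .reg file header
        "",
        f"[{key}]",
        '"FixAllPDF"="1"',  # a quoted value is stored as a String Value (REG_SZ)
        "",
    ]
    return "\n".join(lines)


print(build_fix_all_pdf_reg())
```

Save the printed text with a .reg extension and import it with administrator rights, since the key is under HKEY_LOCAL_MACHINE.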


Keep Pages in Original Order when Bookmarking

Monday, 29 July 2019 by Simple Software

To keep all pages in the same order in which they were imported, even though they belong to different bookmarks, do the following:

1.  Open the configuration in Notepad.
2.  Search for <BOOKMARK_PAGE_ORDER>
3.  Change this line from “false” to “true”:  <BOOKMARK_PAGE_ORDER>true</BOOKMARK_PAGE_ORDER>
4.  Save and close.


Do Not Combine Pages to 1 Bookmark

Monday, 29 July 2019 by Simple Software

Please refer to the Wiki Documentation for the complete PDF Bookmarking reference.

To keep bookmarked pages separate, instead of combining them into a single bookmark when the same bookmark value is found in several interspersed images in the batch, do the following:

1.  Open the Job Configuration file in Notepad.
2.  Search for this value:  <BOOKMARK_PDF1>
3.  Enter this directly above the line that has <BOOKMARK_PDF1> if it's not already there:  <BOOKMARK_UNIQUE_LEVELS>-1</BOOKMARK_UNIQUE_LEVELS>
4.  Set the value as needed. -1 is the default and means that no pages are combined into one bookmark unless they fall in order. 0 means that the first bookmark level is combined into one bookmark value and the rest are not. 1 means that the first and second bookmark levels are combined and the rest are not, and so on.
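The insertion described in step 3 can be scripted if you maintain many configurations. The Python sketch below adds the line above <BOOKMARK_PDF1> only when it is missing. It is an illustration, not an official tool: the function name, the <SETTINGS> wrapper and the sample value inside <BOOKMARK_PDF1> are hypothetical.

```python
def ensure_unique_levels(config_text: str, levels: int = -1) -> str:
    """Insert a BOOKMARK_UNIQUE_LEVELS line above BOOKMARK_PDF1 if it is absent."""
    if "<BOOKMARK_UNIQUE_LEVELS>" in config_text:
        return config_text  # already present, so leave the file unchanged
    new_line = f"<BOOKMARK_UNIQUE_LEVELS>{levels}</BOOKMARK_UNIQUE_LEVELS>"
    output = []
    inserted = False
    for line in config_text.splitlines():
        if not inserted and "<BOOKMARK_PDF1>" in line:
            # Reuse the indentation of the BOOKMARK_PDF1 line.
            indent = line[: len(line) - len(line.lstrip())]
            output.append(indent + new_line)
            inserted = True
        output.append(line)
    if not inserted:
        raise ValueError("BOOKMARK_PDF1 tag not found")
    return "\n".join(output)


# Hypothetical sample text; a real Job Configuration file has many more settings.
sample = "<SETTINGS>\n  <BOOKMARK_PDF1>true</BOOKMARK_PDF1>\n</SETTINGS>"
print(ensure_unique_levels(sample))
```

Running the function a second time on its own output returns the text unchanged, so it is safe to apply repeatedly.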


Can I split a PDF based on bookmark values?

Friday, 12 July 2019 by aaron

Please refer to the Wiki Documentation for the PDF Bookmarking reference.

SimpleIndex can create PDF files with bookmarks based on the index data captured in your batch.

Going the other way, splitting an existing PDF file based on its bookmark values, is not a built-in feature of SimpleIndex. However, there are inexpensive command-line utilities that you can integrate with SimpleIndex to accomplish this.

For example, the CoolUtils PDFSplitter and A-PDF Split both offer this function starting around $35.

The command line to split the PDF can be integrated into the Pre-Process setting in SimpleIndex, found under the Advanced Settings section of the Configuration Wizard. An example pre-process using PDFSplitter to split based on the second level bookmark values would be:

PDFSplitter.exe "c:\Images\BookmarkFile.pdf" "%CONFIGFILEFOLDER%\Input" -em bookmarks -b 2


Is it possible to search for and retrieve documents with Windows desktop search?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Searchable PDF reference.

Windows Search works great with SimpleIndex because all index data can be saved to the folder and file names as well as the file properties, and OCR text can be saved to hidden layers in PDF files. Windows Search will read all of these elements when building its index and will return any matching files when you search.

Using Windows Search on a file server allows for instantaneous searching across terabytes of documents and text for all of the users on your network.

IFilters allow Windows Search to search within file contents.

Here are three popular PDF IFilters that will enable text searching for PDF files:

  • Foxit PDF IFilter (commercial)
  • TET PDF IFilter (free/commercial)
  • Adobe PDF IFilter (32-bit / 64-bit) (free)

If you have issues with PDF text searching in Windows 10, this article has detailed instructions for resolving PDF IFilter issues:

https://fixedit.itxpress.biz/2018/07/05/searching-pdfs-in-windows-10/

  • Published in Database & Retrieval, Export, Office PDF Text Processing

Can SimpleIndex read bar codes from existing PDF files?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Bar Code Recognition reference.

There are 2 types of PDF files. PDFs created by scanning applications use images, while PDF files created by software or printer drivers use text. SimpleIndex can read bar codes from either type of document.

With image PDFs, SimpleIndex will use normal image barcode recognition. With text PDFs, SimpleIndex can read the value of the barcode from text (if it was created with a font) or convert the PDF to an image and read it (if the bar code is an image).

Reading the barcode from text is much faster, and all versions of SimpleIndex can parse the text of PDF files.

Find out more about bar code scanning on our Bar Code Scanning Guide.

  • Published in Bar Codes, Import, Office PDF Text Processing

Is there a way to just use part of a bar code or OCR value? For example, extract “50” from the value “124450”

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Bar Code Recognition reference.

To do this, create a barcode field (Field 1, for example) and a second field with type “Fixed”. In the template for the second field, enter %FIELD1[5,2]% to get “50” from “124450”.

%FIELD1% would get the entire value for Field #1, the barcode field. By adding [5,2], you tell SimpleIndex to start at the 5th character (the “5”) and take 2 characters from the value, which gives “50”.
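The [start,length] behavior described above can be mimicked in a few lines of Python to check what a template will return before you configure it. This is a sketch of the substring rule only, not SimpleIndex's own parser; the function name `expand_template` and the dictionary of field values are hypothetical.

```python
import re

# Matches %FIELDn% and %FIELDn[start,length]%.
FIELD_REF = re.compile(r"%FIELD(\d+)(?:\[(\d+),(\d+)\])?%")


def expand_template(template: str, fields: dict[int, str]) -> str:
    """Replace %FIELDn% and %FIELDn[start,length]% with field values."""

    def replace(match: re.Match) -> str:
        value = fields[int(match.group(1))]
        if match.group(2) is None:
            return value  # no [start,length]: use the whole value
        start = int(match.group(2))   # 1-based position of the first character
        length = int(match.group(3))  # number of characters to take
        return value[start - 1 : start - 1 + length]

    return FIELD_REF.sub(replace, template)


fields = {1: "124450"}
print(expand_template("%FIELD1[5,2]%", fields))  # prints 50
print(expand_template("%FIELD1%", fields))       # prints 124450
```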

Find out more about barcode scanning on our Barcode Scanning Guide and read up on Optical Character Recognition on the SimpleOCR scanning solutions guide.

  • Published in Bar Codes, OCR, Office PDF Text Processing

How do you configure OCR to read index information from MS Office or PDF documents?

Wednesday, 28 February 2018 by dwilder

Please refer to the Wiki Documentation for the complete Zones & OCR Settings reference.

MS Office and PDF files generated by software or PDF printer drivers already contain the text you need to recognize. Scanned documents need OCR to read text from an image of the page. With Office and PDF files, SimpleIndex can read the text directly, which is much faster and more accurate than image OCR.

To recognize index fields from the document text, first create OCR fields on the Index tab as you would normally. Next, on the Zones & OCR options tab, check the “Use Full Page OCR for this Field” option for each OCR field. This tells SimpleIndex to process the existing file text.

If the index value is a unique pattern of digits or list of possible values, use Template or Dictionary matching to locate the value within the text. Please see the manual for details on Template and Dictionary matching.

If the value appears in a specific location in each file, coordinates can be used to locate it. When processing text, the X, Y, Width and Height settings correspond to line and column numbers within the file text. This is explained in greater depth in the manual.
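To picture how a zone maps onto plain text, the Python sketch below cuts a block of characters out of a page of text by line and column. The exact mapping SimpleIndex uses is documented in the manual. Here it is assumed that X is the starting column, Y is the starting line, Width is a number of columns, Height is a number of lines, and that counting starts at 1. The function name and the sample page are hypothetical.

```python
def read_text_zone(text: str, x: int, y: int, width: int, height: int) -> list[str]:
    """Return the characters inside a line/column zone, one string per line."""
    # Assumption: x = first column, y = first line, both counted from 1.
    lines = text.splitlines()
    rows = lines[y - 1 : y - 1 + height]
    return [row[x - 1 : x - 1 + width] for row in rows]


# Hypothetical page text with a value at a fixed position on line 2.
page = "INVOICE\nNumber: 10482\nDate:   2019-07-29"
print(read_text_zone(page, x=9, y=2, width=5, height=1))  # prints ['10482']
```

A zone like this only works when the value sits at the same position in every file. Otherwise Template or Dictionary matching is the better fit.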

SimpleIndex will assume that any TXT file with the same name as a file being processed is the OCR text for that file, so this method can work with any type of file.

Find out more about Optical Character Recognition on the SimpleOCR Guide.

  • Published in OCR, Office PDF Text Processing

Can SimpleIndex create searchable PDF Image+Text files with hidden text?

Wednesday, 28 February 2018 by dwilder

Yes, it can. You can configure this setting in the Job Settings Wizard by going to the OCR step and checking “Enable full-page OCR”. The OCR step has many settings that you can use to customize the output and the recognition of images.

SimpleIndex has two different OCR engines (Standard and Professional) that can be used to produce PDF Image + Text files, also known as Searchable PDFs.

Related Links

  • SimpleIndex.com – OCR Languages
  • SimpleOCR.com – OCR Guide
  • SimpleIndex Wiki – OCR
  • SimpleIndex Wiki – Searchable PDF
  • SimpleIndex Wiki – OCR Options
  • SimpleIndex Wiki – FineReader
  • SimpleIndex Wiki – MRC
  • SimpleIndex Wiki – Tesseract
  • SimpleIndex Wiki – Languages

  • Published in Export, OCR, Office PDF Text Processing
