File Input Settings

From Simple Wiki

Revision as of 18:01, 16 January 2022

SimpleIndex Simple Setup Configuration Wizard File Input Settings Stage

The File Input Settings screen is displayed when Folder is selected as the input type. These settings determine which files are imported and the order in which they are processed.

Input Folder

The folder that new files will be imported from.

Keep Import Files

Select this option to keep all of the original files in the input folder instead of removing them as they are processed. This option is useful when testing configuration settings, as well as for re-organizing documents while leaving the originals intact. This option is also necessary to prevent errors when the Input folder is read-only.

When this option is combined with the Last batch time setting, it is possible to process only the files added to the Input folder since the last batch. Without the Last batch time setting, all files in the folder are processed in every batch.
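
The idea of picking up only files that are newer than the last batch can be sketched in Python. This is an illustration of the filtering logic only, not SimpleIndex's implementation: the function name `files_since` and the use of the file's modified time as the comparison value are assumptions.

```python
import os


def files_since(folder, last_batch_time):
    """Return names of files in `folder` modified after `last_batch_time`.

    `last_batch_time` is a hypothetical stand-in for the Last batch time
    setting, expressed here as seconds since the epoch.
    """
    selected = []
    for entry in os.scandir(folder):
        # Originals stay in the folder, so age is the only way to tell
        # new files apart from ones that were already processed.
        if entry.is_file() and entry.stat().st_mtime > last_batch_time:
            selected.append(entry.name)
    return sorted(selected)
```

Passing `0` as the timestamp selects every file, which mirrors the behavior described above when no last batch time is used.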

Split Multi-Page Input Files

When processing multi-page input files, each file is treated as a single document with a single index value by default. Checking this option causes SimpleIndex to split multi-page files into single images that must be indexed individually.

This option is useful when integrating with network scanners that scan multiple documents to a single file. Another common scenario is when batches of documents are saved to a single PDF file but must be separated for searching.

Process Subfolders

When this option is checked, all subfolders of the Input folder are processed automatically. Each folder is processed in its own batch until all folders have been processed. This option is not available when scanning.

Remove Empty Subfolders After Processing

Automatically deletes empty folders after the files have been processed. Only applicable when the Process Subfolders option is selected.
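
For readers who want to reproduce this cleanup outside SimpleIndex, the following Python sketch shows one way to delete empty subfolders. It is not SimpleIndex's own code; the function name `remove_empty_subfolders` is hypothetical, and the choice to remove nested empty folders and to leave the Input folder itself in place is an assumption.

```python
import os


def remove_empty_subfolders(input_folder):
    """Delete empty subfolders below `input_folder` and return their paths."""
    removed = []
    # topdown=False visits children before parents, so a folder that held
    # only empty folders is itself empty by the time it is checked.
    for dirpath, _dirnames, _filenames in os.walk(input_folder, topdown=False):
        if dirpath == input_folder:
            continue  # assumption: the Input folder itself is never deleted
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed
```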

Sort Files by Date Modified

Sorts input files by modified date instead of name. This ensures that files are processed in the order they were created, provided they have not been modified since.
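
The difference between the two orderings can be illustrated with a short Python sketch. The helper name `sort_by_modified` is hypothetical, and breaking ties by file name is an assumption rather than documented SimpleIndex behavior.

```python
import os


def sort_by_modified(paths):
    """Return `paths` ordered oldest-modified first, ties broken by name."""
    return sorted(
        paths,
        key=lambda p: (os.path.getmtime(p), os.path.basename(p)),
    )
```

Sorting by name would place `a.tif` before `b.tif` regardless of age; sorting by modified date places whichever file was written first at the front.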

Sort Subfolders by Date Modified

Sort subfolders by modified date instead of name when processing subfolders. Ensures that subfolders are processed in the order that they are added to the input folder.

Recompress Images

Check this box to automatically recompress all imported images using the default compression settings.

By default, pages are recompressed only when they are converted from one format to another, such as PDF to TIFF.

Resample Images

Check this box to automatically resample all imported images using the PDF Resolution setting. This is helpful to standardize image resolution for zone OCR on files that come from different sources like scanners, copiers and fax machines.

Max Files/Batch

This option sets the maximum number of image files included in a batch. In Update and Retrieval modes, this value should be set to prevent users from retrieving too many records at once, which can impact performance.

This feature is also useful when using SimpleIndex to index many files from a very large Input folder. Processing more than 1,000 pages in a single batch is not recommended.

Another common scenario is when batches of documents are scanned on a network scanner and saved to the Input folder as multi-page files. Set Max Files/Batch to 1 and enable Split Multi-Page Input Files to process each file in its own batch.

Set to 0 (default) for unlimited.
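
The batching rule described in this section, including the special meaning of 0, can be sketched in Python as follows. The function name `split_into_batches` and its parameter are hypothetical; the sketch only models how a file list is divided and says nothing about how SimpleIndex implements it.

```python
def split_into_batches(files, max_files_per_batch=0):
    """Group `files` into batches of at most `max_files_per_batch` items.

    A value of 0 means unlimited: every file goes into one batch.
    """
    files = list(files)
    if max_files_per_batch < 0:
        raise ValueError("max_files_per_batch must be 0 or greater")
    if not files:
        return []
    if max_files_per_batch == 0:
        return [files]
    return [
        files[start:start + max_files_per_batch]
        for start in range(0, len(files), max_files_per_batch)
    ]
```

With a limit of 1, each input file lands in its own batch, which corresponds to the network scanner scenario above.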

Run Job until Input Folder is Empty

When running a job on a timer loop, this option will stop the timed processing once all files in the input folder have been processed. This is useful for processing very large folders in several smaller batches or for processing several multi-page files as individual batches.

Fast Import

Disables sorting and other extra checks that are normally performed when reading the Input folder. Use this option when those checks are not needed. It significantly improves import times, especially when the Input folder contains more than 1,000 files.

Backup Folder

The Backup folder is used to store a copy of the original files before they are processed, as well as invalid files that cannot be processed and batches that have errors or were aborted by the user.

Backup All Input Files

When checked, all files are automatically copied to the Backup folder before they are processed. Subfolders are created automatically in the Backup folder using the Batch ID (the current date/time formatted as YYYY-MM-DD@HHMMSS).
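
When locating a backup subfolder from a script, it can help to generate names in the same pattern. The Python sketch below formats a timestamp as YYYY-MM-DD@HHMMSS. The function name `batch_id` is hypothetical, and the use of a 24-hour clock in local time is an assumption, since the format string above does not specify either.

```python
from datetime import datetime


def batch_id(moment=None):
    """Format a timestamp in the YYYY-MM-DD@HHMMSS pattern."""
    if moment is None:
        moment = datetime.now()
    # %H is the 24-hour clock (assumed); %M and %S are minutes and seconds.
    return moment.strftime("%Y-%m-%d@%H%M%S")
```

For example, `batch_id(datetime(2022, 1, 16, 18, 1, 5))` returns `"2022-01-16@180105"`.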

File Types to Process

The File Types to Process interface has been enhanced in the wizard to allow you to select from a list of predefined document types and add all the associated file types at once. Individual file extensions can be added or removed from the list.

Type a file extension and click Add to add it to the list, or select an existing file extension from the list and click Remove to remove it from the list.

It is also possible to use a file mask (e.g. "a*.pdf" to capture PDF files that start with the letter "a"). Enter the mask manually and click Add to add it to the list.
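
To preview which file names a mask would capture, the matching can be approximated in Python with the standard `fnmatch` module. This is an approximation under stated assumptions: the helper name `matches_any` is hypothetical, matching is assumed to be case-insensitive, and `fnmatch` also accepts `?` and `[...]` patterns that SimpleIndex may or may not support.

```python
from fnmatch import fnmatchcase


def matches_any(filename, masks):
    """Return True if `filename` matches at least one mask in `masks`."""
    name = filename.lower()
    # Lower-casing both sides makes the comparison case-insensitive
    # on every platform (an assumption about the intended behavior).
    return any(fnmatchcase(name, mask.lower()) for mask in masks)
```

With the mask `"a*.pdf"`, the name `abc.pdf` matches and `bcd.pdf` does not.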

TIF, JPG, GIF, and BMP files may be processed as images. PDF, DOC, XLS, PPT, and other files may be processed and viewed in SimpleIndex if the appropriate software is installed on the computer. Files from any OLE-enabled application may be processed with SimpleIndex as long as the application's viewer is installed. Non-OLE file types may be opened automatically in their default viewer.

Advanced options are also available through the Input file types or mask option on the Job Options screen.