File Input Settings

The File Input Settings screen is displayed when Folder is selected as the input type. These settings determine which files are imported and the order in which they are processed.

Input Folder

The folder from which new files are imported. Enter the path of the folder, or click "Set..." to browse for one. The drop-down list also includes several Relative Paths options.

Keep Input Files

Select this option to keep all of the original files in the input folder instead of removing them as they are processed. This option is useful when testing configuration settings, as well as for re-organizing documents while leaving the originals intact. This option is also necessary to prevent errors when the Input folder is read-only.

When using the Last batch time () setting, it is possible to process only new files added to the Input folder since the last batch. Without this option selected, all files in the folder are processed in every batch.
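Selecting only the files added since the last batch amounts to comparing each file's timestamp against a stored cutoff. The sketch below illustrates that idea only; it is not SimpleIndex code. The function name `files_since` and the use of the file's modified time as the comparison value are assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path


def files_since(input_folder, last_batch_time):
    """Return files in input_folder modified after last_batch_time, sorted by name.

    Hypothetical helper: the comparison on modified time is an assumption,
    not a description of how SimpleIndex tracks new files.
    """
    cutoff = last_batch_time.timestamp()
    return sorted(
        p for p in Path(input_folder).iterdir()
        if p.is_file() and p.stat().st_mtime > cutoff
    )


if __name__ == "__main__":
    # Example: list files changed since midnight today in the current folder.
    midnight = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
    for path in files_since(".", midnight):
        print(path.name)
```

Because the original files stay in place, a filter of this kind is what keeps each batch from picking up files that were already processed.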

See Keep Existing Folders and Filenames for additional information on preserving the existing structure for all folders and files when processing.

Split Multi-Page Files

When processing multi-page input files, each file is treated as a single document with a single index value by default. Checking this option causes SimpleIndex to split multi-page files into single images that must be indexed individually, but can be recombined into new files during output based on the indexes.

This option is useful when integrating with network scanners that scan multiple documents to a single file. Another common scenario is when batches of documents are saved to a single PDF file but must be separated for searching.

Process Subfolders

All subfolders of the Input folder are processed automatically when this option is checked. Each folder is processed in its own batch until all folders have been processed. This option is not available when scanning.

See Keep Existing Folders and Filenames for additional information on preserving the existing structure for all folders and files when processing.
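The "one batch per folder" behavior of Process Subfolders can be pictured with a short directory walk. This is a rough sketch, not the product's implementation: the generator name `subfolder_batches`, the name-order traversal, and the choice to skip folders that contain no files are all assumptions.

```python
import os


def subfolder_batches(input_folder):
    """Yield (folder, files) pairs, one per folder that contains files.

    Hypothetical helper: each yielded pair stands for one batch.
    """
    for folder, subfolders, files in os.walk(input_folder):
        subfolders.sort()  # visit subfolders in name order
        if files:
            yield folder, sorted(files)


if __name__ == "__main__":
    for folder, files in subfolder_batches("."):
        print(f"batch for {folder}: {len(files)} file(s)")
```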

Remove Empty Subfolders After Processing

Automatically deletes empty folders after the files have been processed. Only applicable when the Process Subfolders option is selected.
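Cleaning up empty subfolders is typically done bottom-up, so that a parent folder that only held empty folders is removed as well. The following is a minimal sketch of that general technique; `remove_empty_subfolders` is a hypothetical name, and keeping the Input folder itself is an assumption.

```python
import os


def remove_empty_subfolders(input_folder):
    """Delete empty folders below input_folder and return the removed paths.

    Hypothetical helper: the input folder itself is never removed.
    """
    root = os.fspath(input_folder)
    removed = []
    # topdown=False visits children before their parents.
    for folder, _subfolders, _files in os.walk(root, topdown=False):
        # Re-check the folder on disk: its children may just have been removed.
        if folder != root and not os.listdir(folder):
            os.rmdir(folder)
            removed.append(folder)
    return removed
```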

Fast Import

Disables sorting and other extra checks that are performed when reading the input folder. When these are not needed, this significantly improves import times, especially when the Input folder contains more than 1,000 files.

Disabled Features and Checks:

  • Sort Files by Date
  • Recompress Images
  • Resample Images
  • Extract Existing Electronic Text from Imported Files

Sort Files by Date

Sorts input files by modified date instead of name. This ensures that files are processed in the order they were created, assuming they have not been modified since.
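The difference between the two orderings can be shown with a small sketch. The function name `ordered_files`, the case-insensitive name comparison, and the use of the file name as a tie-breaker for equal dates are assumptions for illustration, not documented SimpleIndex behavior.

```python
from pathlib import Path


def ordered_files(input_folder, sort_by_date=False):
    """Return the files in input_folder in name order or modified-date order.

    Hypothetical helper used only to contrast the two sort orders.
    """
    files = [p for p in Path(input_folder).iterdir() if p.is_file()]
    if sort_by_date:
        # Oldest first; the name breaks ties between identical timestamps.
        return sorted(files, key=lambda p: (p.stat().st_mtime, p.name))
    return sorted(files, key=lambda p: p.name.lower())
```

With name ordering, a file called "a.tif" is processed before "b.tif" even if "b.tif" was written first; with date ordering, the older file comes first.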

Sort Subfolders by Date

Sort subfolders by modified date instead of name when processing subfolders. Ensures that subfolders are processed in the order that they are added to the input folder.

Recompress Images

Check this box to automatically recompress all imported images using the default compression settings.

By default, pages are only recompressed when they are converted from one format to another, such as PDF to TIFF.

Resample Images

Check this box to automatically resample all imported images using the PDF Resolution setting. This is helpful to standardize image resolution for zone OCR on files that come from different sources like scanners, copiers and fax machines.

Max Files/Batch

This option sets the maximum number of image files included in a single batch. In Update and Retrieval modes, set this value to prevent users from retrieving too many records at once, which can impact performance.

This feature is also useful when using SimpleIndex to index many files from a very large Input folder. Processing more than 1,000 pages in a single batch is not recommended.

Another common scenario is when batches of documents are scanned on a network scanner and saved to the Input folder as multi-page files. Set Max Files/Batch to 1 and enable Split Multi-Page Files to process each file in its own batch.

Set to 0 (default) for unlimited.
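The effect of this setting on how a list of files is divided can be sketched as follows. The function name `split_into_batches` is hypothetical; only the rule that 0 means unlimited comes from the description above.

```python
def split_into_batches(files, max_files_per_batch=0):
    """Split files into consecutive batches of at most max_files_per_batch.

    A value of 0 means unlimited, so all files land in one batch.
    """
    files = list(files)
    if max_files_per_batch <= 0:
        return [files] if files else []
    return [
        files[i:i + max_files_per_batch]
        for i in range(0, len(files), max_files_per_batch)
    ]


if __name__ == "__main__":
    names = ["a.tif", "b.tif", "c.tif"]
    print(split_into_batches(names, 2))  # [['a.tif', 'b.tif'], ['c.tif']]
    print(split_into_batches(names, 0))  # [['a.tif', 'b.tif', 'c.tif']]
```

Setting the limit to 1 yields one batch per file, which matches the network scanner scenario described above.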

Run Job until Input Folder is Empty

When running a job on a timer loop, this option will stop the timed processing once all files in the input folder have been processed. This is useful for processing very large folders in several smaller batches or for processing several multi-page files as individual batches.

Prevent Other Jobs from Running

During import and export, a STOPFILE is placed in the folder to prevent other processes from working on the same files, which could cause errors. This option can be disabled in standalone configurations or when exported files will always be unique.
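A marker file of this kind is a common way for cooperating processes to signal that a folder is in use. The sketch below shows the general lock-file technique only. The file name `STOPFILE`, the helper `folder_lock`, and the behavior of raising an error when the marker already exists are assumptions; they do not describe how SimpleIndex creates or checks its marker.

```python
import os
from contextlib import contextmanager

STOPFILE_NAME = "STOPFILE"  # hypothetical file name, for illustration only


@contextmanager
def folder_lock(folder):
    """Hold a marker file in folder for the duration of the with-block.

    Raises FileExistsError if another process already holds the marker.
    """
    path = os.path.join(folder, STOPFILE_NAME)
    # O_EXCL makes creation fail if the marker file already exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)
    try:
        yield path
    finally:
        os.remove(path)
```

A caller would wrap its import or export step in `with folder_lock(folder):` and treat `FileExistsError` as "another job is running".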

Backup Folder

The Backup folder is used to store a copy of the original files before they are processed, as well as invalid files that cannot be processed and batches that have errors or were aborted by the user.

Backup All Input Files

When checked, all files are automatically copied to the Backup folder before they are processed. Subfolders are created automatically in the Backup folder using the Batch ID (the current date/time formatted as YYYY-MM-DD@HHMMSS).
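For scripts that need to locate or generate matching folder names, the Batch ID pattern described above can be reproduced with a date format string. This is a small sketch; the function name `batch_id` is hypothetical, and the use of local time is an assumption.

```python
from datetime import datetime


def batch_id(moment=None):
    """Format a date/time as YYYY-MM-DD@HHMMSS (defaults to the current local time)."""
    moment = moment or datetime.now()
    return moment.strftime("%Y-%m-%d@%H%M%S")


if __name__ == "__main__":
    print(batch_id(datetime(2024, 3, 5, 14, 7, 9)))  # 2024-03-05@140709
```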

Move Invalid Files to Backup Folder

Detects corrupted images, PDF files and Office documents during the import process and moves them to the Backup folder before processing. This avoids problems caused by further processing of invalid or improperly created files.

File Types to Process

To ensure that only valid files are processed, SimpleIndex only imports files that match the File Types to Process setting. This setting is defined as a list of file extensions or filename masks.

You may select from a list of predefined document type groups and add all the associated file extensions at once. Individual file extensions can be added or removed from the list.

Type a file extension and click Add to add it to the list, or select an existing file extension from the list and click Remove to remove it from the list.

It is also possible to use a file mask (e.g. "a*.pdf" to capture PDF files that start with the letter "a"). Enter the mask manually and click Add to add it to the list.
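To illustrate how a list that mixes plain extensions and filename masks might be evaluated, here is a minimal sketch using Python's standard wildcard matching. The function name `matches_file_types`, the case-insensitive comparison, and the rule that an entry without wildcards is treated as an extension are assumptions, not documented SimpleIndex behavior.

```python
from fnmatch import fnmatchcase


def matches_file_types(filename, patterns):
    """Return True if filename matches any extension or mask in patterns.

    Hypothetical helper: entries without '*' or '?' are treated as
    extensions ("pdf" or ".pdf"); everything else is treated as a mask.
    """
    name = filename.lower()
    for pattern in patterns:
        pattern = pattern.lower()
        if not any(ch in pattern for ch in "*?"):
            pattern = "*." + pattern.lstrip(".")
        if fnmatchcase(name, pattern):
            return True
    return False


if __name__ == "__main__":
    print(matches_file_types("Apple.PDF", ["a*.pdf"]))   # True
    print(matches_file_types("banana.pdf", ["a*.pdf"]))  # False
    print(matches_file_types("scan.tif", ["tif", "pdf"]))  # True
```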

See File Formats for the full list of file types that can be viewed, processed or exported by SimpleIndex. If the appropriate software is installed on the computer, files from any OLE-enabled application may be edited from within SimpleIndex. For example, MS Office documents can be edited in Word, Excel, etc. if those applications are installed. Other file types may be opened automatically in their default application.

File Input Settings Training Video

The video was recorded in a previous version of SimpleIndex. Refer to the wiki documentation for the latest updates.
