PDF Processing Settings

Figure: Setup Job Configuration, PDF Processing Settings screen.

The PDF Processing options determine how PDF files are converted on input, or whenever they must be rasterized for OCR, OMR, and barcode recognition.

PDF to Image Conversion

This option converts imported PDF files to TIFF. Select Convert to TIFF (Detect Color) to automatically detect whether each image is color or black and white and save it accordingly. Use Convert to B&W or Convert to Color to convert all pages to a single format.
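How SimpleIndex detects color is not described here, but the general idea can be sketched as follows. This is an illustrative Python example only: the function name `classify_page`, the `tolerance` and `min_fraction` values, and the representation of a page as a list of (r, g, b) tuples are all assumptions, not part of SimpleIndex.

```python
def classify_page(pixels, tolerance=16, min_fraction=0.01):
    """Classify a page as 'color' or 'bw' from its (r, g, b) pixels.

    A pixel counts as colored when its channels differ by more than
    `tolerance`. The page counts as color when at least `min_fraction`
    of its pixels are colored. Both thresholds are illustrative guesses.
    """
    pixels = list(pixels)
    if not pixels:
        return "bw"
    colored = sum(
        1 for r, g, b in pixels if max(r, g, b) - min(r, g, b) > tolerance
    )
    return "color" if colored / len(pixels) >= min_fraction else "bw"


# A text page: only black and white pixels.
text_page = [(0, 0, 0)] * 50 + [(255, 255, 255)] * 50
# A page with a red logo covering 10% of its area.
logo_page = [(200, 30, 30)] * 10 + [(255, 255, 255)] * 90

print(classify_page(text_page))  # bw
print(classify_page(logo_page))  # color
```

Requiring a minimum fraction of colored pixels, rather than a single one, keeps scanner noise from marking a page as color.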

PDF to Image Resolution (X,Y)

When converting PDF to TIFF, these settings indicate the output resolution of the TIFF image. The first indicates the X or horizontal DPI (dots per inch) setting. The second indicates the Y or vertical DPI.

When saving embedded images from a PDF file, a default resolution of 300dpi is assumed for images found in the PDF files. Use these settings to override the X and Y resolution settings when extracting PDF images. If these settings do not match the originals, output images will show incorrect page dimensions.
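As a worked example of why mismatched settings produce incorrect page dimensions: the physical size of a page is its pixel size divided by the resolution. The Python sketch below uses a hypothetical helper, `page_size_inches`, and assumes a US Letter page scanned at 300 dpi; it is not SimpleIndex code.

```python
def page_size_inches(width_px, height_px, dpi_x, dpi_y):
    """Return the physical page size implied by pixel dimensions and DPI."""
    return width_px / dpi_x, height_px / dpi_y


# A US Letter page scanned at 300 dpi is 2550 x 3300 pixels.
print(page_size_inches(2550, 3300, 300, 300))  # (8.5, 11.0)

# The same pixels saved with X and Y set to 200 dpi report a larger page.
print(page_size_inches(2550, 3300, 200, 200))  # (12.75, 16.5)
```

The pixel data is identical in both cases; only the resolution recorded with the image changes, which is why the reported page size is wrong when the settings do not match the original.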

Prior to version 10, the default resolution for PDF files was 200dpi, but this has been increased in order to improve OCR and barcode recognition from PDF using the default settings.

The viewer automatically samples images at a lower resolution (96dpi) for viewing in order to optimize performance and memory usage.
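To give a sense of the savings from viewing at 96 dpi, the following Python sketch computes the sampled pixel dimensions for a page. The helper `viewer_dimensions` and the US Letter page at 300 dpi are assumptions for illustration; the viewer's actual resampling method is not described here.

```python
def viewer_dimensions(width_px, height_px, source_dpi, view_dpi=96):
    """Return the pixel dimensions of a page resampled to the viewing DPI."""
    return (
        round(width_px * view_dpi / source_dpi),
        round(height_px * view_dpi / source_dpi),
    )


full = (2550, 3300)  # US Letter at 300 dpi
view = viewer_dimensions(*full, source_dpi=300)
print(view)  # (816, 1056)

# Fraction of the original pixel count held by the viewer.
print(view[0] * view[1] / (full[0] * full[1]))
```

In this case the viewer works with roughly a tenth of the original pixels.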

Convert Office, HTML, Text and Images Files to PDF

Converts all MS Office, HTML, XML, text, and image files in the Input folder to PDF before processing. Files are converted and saved in the Input folder before the import step.

If Keep Input Files is unchecked, the original files are deleted following conversion. If it is checked, both the original file and the converted PDF remain in the Input folder; to avoid importing both copies, use the Input File Types option to restrict which files are imported.
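The effect of restricting the input file types can be illustrated with a short Python sketch. The file names and the helper `files_to_import` are hypothetical, and the sketch assumes the converted file keeps the original name with a .pdf extension; the real filter is configured through the Input File Types option, not in code.

```python
from pathlib import PurePath


def files_to_import(filenames, allowed_extensions=(".pdf",)):
    """Keep only the files whose extension is in the allowed list."""
    allowed = {ext.lower() for ext in allowed_extensions}
    return [
        name for name in filenames
        if PurePath(name).suffix.lower() in allowed
    ]


# Input folder after conversion, with the original files kept.
input_folder = ["invoice.docx", "invoice.pdf", "scan.PDF", "notes.txt"]

print(files_to_import(input_folder))  # ['invoice.pdf', 'scan.PDF']
```

With only PDF allowed, each document is imported once even though its original file is still in the folder.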

This option requires MS Office 2003 or later, or OpenOffice 3 or later. OpenOffice is available for free at OpenOffice.org.

Repair Corrupt and Non-Compliant PDF Files

The option to automatically repair corrupt and non-compliant PDF files is enabled by default. It prevents most of the errors that occur during batch processing when non-standard PDF files cannot be read.

Thousands of applications generate PDF files, and many of them do not fully conform to the PDF standard. The repair function corrects the internal PDF structures so that the images and text can be processed. However, in rare cases this can result in the loss of graphic elements such as form fields or annotations.

Use Backwards Compatible PDF Text Extraction

Newer versions of SimpleIndex's PDF text extraction support more PDF formats and features. Enabling this option makes the Job Configuration backwards compatible with older versions of SimpleIndex and keeps the PDF formatting from previous versions.

Detect and Binarize Black and White Images

Enabling this option saves low-color images as black and white to reduce the file size.
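The size reduction comes from storing one bit per pixel instead of 24. The Python sketch below shows a simple luminance-threshold binarization and compares uncompressed sizes. The function names, the threshold of 128, and the US Letter page at 300 dpi are assumptions for illustration; SimpleIndex's own method is not described here, and real files are compressed, so actual savings will differ.

```python
def binarize(pixels, threshold=128):
    """Map (r, g, b) pixels to 1-bit values: 1 = white, 0 = black."""
    bits = []
    for r, g, b in pixels:
        # Integer luminance using the ITU-R BT.601 channel weights.
        luma = (299 * r + 587 * g + 114 * b) // 1000
        bits.append(1 if luma >= threshold else 0)
    return bits


def raw_size_bytes(width_px, height_px, bits_per_pixel):
    """Uncompressed image size, with each row padded to a whole byte."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


print(binarize([(250, 250, 250), (10, 10, 10), (200, 30, 30)]))  # [1, 0, 0]

# US Letter at 300 dpi: 24-bit color versus 1-bit black and white.
print(raw_size_bytes(2550, 3300, 24))  # 25245000
print(raw_size_bytes(2550, 3300, 1))   # 1052700
```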

PDF Encryption/Decryption Password

Use the following options when importing or exporting password-protected PDF files. The password entered here is used to decrypt password-protected PDF files when importing and processing. It is also used to encrypt PDF files on export, if that option is selected.

Enter the password required and click "Set...", then enter the password again for confirmation.

Use Password from User Login

This option is useful when working with PDF files that have different passwords. The user enters a password when signing on to SimpleIndex, and that password is used for the session. All files processed in a single batch must have the same password, but this option lets you change the password quickly between batches without editing the job options.

Encrypt PDF on Export

All PDF files are encrypted with the current password when the batch is exported.

PDF Processing Training Video

The video was recorded in a previous version of SimpleIndex. Refer to the wiki documentation for the latest updates.
