Automatic archival of Microsoft Office documents to PDF via batch conversion, indexing and document management workflow.
You can configure SimpleIndex to check and repair every PDF file it processes.
Go to this location in the Windows Registry:
Create a new String Value named “FixAllPDF” and set its value to 1.
To keep all pages in the order in which they were imported, even when they belong to different bookmarks, do the following:
1. Open the configuration in Notepad.
2. Search for <BOOKMARK_PAGE_ORDER>
3. Change this line from “false” to “true”: <BOOKMARK_PAGE_ORDER>true</BOOKMARK_PAGE_ORDER>
4. Save and close.
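For many configuration files, the same change can be scripted instead of made by hand in Notepad. The sketch below is a minimal illustration, not part of SimpleIndex: the function name `set_bookmark_page_order` is hypothetical, and it assumes the configuration is plain text containing the <BOOKMARK_PAGE_ORDER> element exactly as shown in step 3. Reading and writing the file (and keeping a backup copy) is left to the caller.

```python
import re

# Matches the element from step 3 with any current value (e.g. "false").
_PATTERN = re.compile(r"(<BOOKMARK_PAGE_ORDER>)\s*\w*\s*(</BOOKMARK_PAGE_ORDER>)")


def set_bookmark_page_order(config_text: str, enabled: bool = True) -> str:
    """Return config_text with BOOKMARK_PAGE_ORDER set to 'true' or 'false'.

    Hypothetical helper: only the element name comes from the steps above.
    """
    value = "true" if enabled else "false"
    new_text, count = _PATTERN.subn(
        lambda m: m.group(1) + value + m.group(2), config_text
    )
    if count == 0:
        # Fail loudly rather than silently leaving the file unchanged.
        raise ValueError("BOOKMARK_PAGE_ORDER element not found")
    return new_text
```

A caller would read the configuration file into a string, pass it through this function, and write the result back to the same file.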
You can tell SimpleIndex which file types to process and which to ignore. Click “Job Options”. On the “Batch” tab you will find a field labeled “Input file types or mask”. This field lists the file types that SimpleIndex will import. The default types are:
TIF,PDF,JPG,GIF,BMP,DOC,XLS,PPT,DOCX,XLSX,PPTX,VSD,DWG,AVI,MP3
To process all files, enter *
SimpleIndex will ignore any file whose extension does not appear on the list.
In SimpleIndex 6 or above you can enter file masks to filter input files. Some examples are:
abc*.pdf (PDF files starting with “abc”)
ab??ef.* (All files starting with “ab”, followed by any 2 characters and then “ef”)
It is possible to have some file types open automatically in their default application. To do this, insert a pipe “|” into the list. Any file types after the pipe will be opened in their default application. For example: TIF,PDF,JPG|WAV,M
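To check how a given list would treat a file name before running a batch, the rules above can be approximated in a few lines of Python. This is only a sketch of the described behavior, not SimpleIndex's own matching code: the function `classify` and the sample list `TIF,PDF,JPG|WAV` are hypothetical, and it assumes that matching is case-insensitive and that bare extensions and masks can be mixed in one list.

```python
from fnmatch import fnmatch


def classify(filename: str, type_list: str) -> str:
    """Return 'process', 'open', or 'ignore' for filename under type_list.

    Entries before the pipe are processed; entries after it are opened
    in their default application; everything else is ignored.
    """
    process_part, _, open_part = type_list.partition("|")
    name = filename.lower()

    def matches(entry: str) -> bool:
        entry = entry.strip().lower()
        if not entry:
            return False
        # Assumption: a bare extension such as "PDF" means the mask "*.pdf".
        if not any(ch in entry for ch in "*?."):
            entry = "*." + entry
        return fnmatch(name, entry)

    if any(matches(e) for e in process_part.split(",")):
        return "process"
    if any(matches(e) for e in open_part.split(",")):
        return "open"
    return "ignore"


if __name__ == "__main__":
    for sample in ("scan.tif", "memo.wav", "setup.exe"):
        print(sample, classify(sample, "TIF,PDF,JPG|WAV"))
```

The standard-library `fnmatch` function uses the same `*` and `?` wildcards as the mask examples, which is why it is used here.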
MS Office files and PDF files generated by software or PDF printer drivers already contain the text you need to recognize. Scanned documents, by contrast, require OCR to read text from an image of the page. With Office and PDF files, SimpleIndex can read the text directly, which is much faster and more accurate than image OCR.
To recognize index fields from the document text, first create OCR fields on the Index tab as you normally would. Next, on the Zones & OCR options tab, check the “Use Full Page OCR for this Field” option for each OCR field. This tells SimpleIndex to process the existing file text.
If the index value is a unique pattern of digits or one of a list of possible values, use Template or Dictionary matching to locate the value within the text. Please see the manual for details on Template and Dictionary matching.
If the value appears in a specific location in each file, coordinates can be used to locate it. When processing text, the X, Y, Width and Height settings correspond to