How can I improve recognition rates for my OCR fields?
There are several things you can do to improve OCR accuracy:
- Scan at 300 dpi in black & white for best results.
- Adjust the scan settings to remove background noise and improve the definition of characters.
- For zone OCR, recognition of a field often varies with the amount of surrounding white space and text inside the zone. Try varying the size of the zone to achieve optimal results.
- For template matching, make sure all variations of the field format are included in the template list.
- For dictionary matching, add common variations and OCR mistakes to the “thesaurus” list.
- On the Zones & OCR tab (accessed from the Job Options), you can adjust the Max Errors setting to allow for more mistakes in the dictionary matching process.
- Use the Strip Spaces, Strip Characters, Replace Characters and Case Fixing options to standardize the field format prior to matching.
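To see why the last few tips help, here is a minimal Python sketch of the general idea: standardize the OCR text first, then match it against a template or a dictionary while tolerating a limited number of character errors. This is not SimpleIndex's actual implementation. The function names (`normalize`, `edit_distance`, `match_dictionary`, `matches_template`) are hypothetical, regular expressions stand in for SimpleIndex templates, and Levenshtein edit distance is assumed as the error measure behind a setting like Max Errors.

```python
import re


def normalize(text, strip_spaces=True, strip_chars="", replacements=None, case="upper"):
    """Standardize raw OCR text before matching (hypothetical helper).

    Order used here: replace characters, strip characters, strip spaces, fix case.
    """
    for old, new in (replacements or {}).items():
        text = text.replace(old, new)
    if strip_chars:
        text = "".join(ch for ch in text if ch not in strip_chars)
    if strip_spaces:
        text = "".join(text.split())  # removes all whitespace
    if case == "upper":
        text = text.upper()
    elif case == "lower":
        text = text.lower()
    return text


def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def match_dictionary(value, dictionary, max_errors=1):
    """Return the closest dictionary entry within max_errors edits, else None."""
    best, best_dist = None, max_errors + 1
    for entry in dictionary:
        dist = edit_distance(value, entry)
        if dist < best_dist:
            best, best_dist = entry, dist
    return best


def matches_template(value, templates):
    """True if the value fully matches any template (regex used as a stand-in)."""
    return any(re.fullmatch(t, value) for t in templates)


# Dictionary matching: the OCR engine read "Acme" as "Acne".
companies = ["ACMECORP", "APEXCORP"]
field = normalize("Acne  Corp.", strip_chars=".")
print(field)                                            # ACNECORP
print(match_dictionary(field, companies, max_errors=1)) # ACMECORP
print(match_dictionary(field, companies, max_errors=0)) # None

# Template matching: the letter O was read in place of a zero.
invoice = normalize("INV-0O123", replacements={"O": "0"})
print(matches_template(invoice, [r"INV-\d{5}"]))        # True
```

Raising the error allowance recovers more misread values but also increases the chance of matching the wrong entry, so it is usually worth cleaning up the field format first and keeping the allowance small.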
Please refer to the SimpleIndex Wiki for details on how to configure the SimpleIndex options listed above.
- SimpleIndex.com – Zone OCR
- SimpleIndex.com – Dynamic OCR
- SimpleOCR.com – OCR Guide
- SimpleIndex Wiki – OCR
- SimpleIndex Wiki – OCR Options
- SimpleIndex Wiki – Zone OCR
- SimpleIndex Wiki – Full Page OCR
- SimpleIndex Wiki – Zones & OCR Settings
- SimpleIndex Wiki – OCR to Field
- SimpleIndex Wiki – OCR Text View
- SimpleIndex Wiki – Template & Dictionary Matching OCR
- SimpleIndex Wiki – OMR and OCR Document Separation