Languages: Difference between revisions

From Simple Wiki
 
* [https://www.simpleindex.com/knowledge-base/languages-supported-in-simplesoftware-ocr-engines/ Languages Supported in SimpleSoftware OCR Engines]
* [https://www.simpleindex.com/knowledge-base/change-default-font-character-set/ Index With Non-Latin Character Sets]
* [https://www.simpleindex.com/knowledge-base/can-simpleindex-create-searchable-pdf-imagetext-files-with-hidden-text/ Can SimpleIndex create searchable PDF Image+Text files with hidden text?]

Latest revision as of 17:27, 15 August 2023

SimpleIndex supports OCR in over 175 different languages.

View the FineReader and Tesseract pages to see the full list of languages supported by each engine.

The Language Pack is required to add extended language support. By default, only Latin (Western European) and Arabic languages are installed.

For non-Latin alphabets that require Unicode support, set the Character Set setting in the Global Settings Wizard to the appropriate value for your language. This ensures that all OCR text, database interactions, CSV files, and XML files use a Unicode encoding that can represent non-ASCII characters. The default text encoding is Windows-1252, a single-byte Western European code page.
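To see why the Character Set setting matters, the short Python sketch below checks which sample strings fit in Windows-1252 and which need a Unicode encoding such as UTF-8. It uses only the Python standard library and is not part of SimpleIndex; the `can_encode` helper and the sample strings are illustrative assumptions.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Illustrative sample strings (assumed, not taken from SimpleIndex).
samples = {
    "French": "café",
    "Russian": "Привет",
    "Japanese": "日本語",
}

for language, text in samples.items():
    fits_cp1252 = can_encode(text, "cp1252")  # Windows-1252
    fits_utf8 = can_encode(text, "utf-8")     # Unicode
    print(f"{language}: Windows-1252={fits_cp1252}, UTF-8={fits_utf8}")
```

Western European text such as the French sample fits in Windows-1252, while the Cyrillic and Japanese samples can only be stored with a Unicode encoding. Index values in those scripts would therefore likely be lost or garbled if the output files stayed in the default encoding.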

Related Knowledge Base Articles