Tesseract

From Simple Wiki

Latest revision as of 15:31, 6 January 2023

Tesseract is the open-source OCR engine included with all versions of SimpleIndex.

FineReader OCR is enabled with an OCR or Professional License.

The current release includes version 3.04 of the Tesseract engine.

Tesseract provides adequate recognition speed and accuracy on good-quality documents.

Text Output Formats with Tesseract:

  • Text (txt)
  • PDF (pdf)
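
For readers who want to see how the two output formats correspond to the standalone Tesseract command line, here is a minimal sketch. It assumes the `tesseract` executable is on the PATH; how SimpleIndex itself invokes the engine is not documented on this page, and the function name `build_tesseract_command` is hypothetical. The sketch only builds the argument list and does not run OCR.

```python
# Output formats named on this page, mapped to the extra config argument
# the standalone tesseract CLI takes after the output base name.
# Plain text is the CLI default, so "txt" needs no extra argument.
OUTPUT_CONFIGS = {"txt": [], "pdf": ["pdf"]}


def build_tesseract_command(image, output_base, lang="eng", fmt="txt"):
    """Return the argument list for one standalone tesseract run.

    Hypothetical helper for illustration; not part of SimpleIndex.
    """
    if fmt not in OUTPUT_CONFIGS:
        raise ValueError(f"unsupported output format: {fmt!r}")
    # General CLI shape: tesseract <image> <output_base> -l <lang> [config]
    return ["tesseract", str(image), str(output_base), "-l", lang,
            *OUTPUT_CONFIGS[fmt]]


if __name__ == "__main__":
    # Tesseract appends the extension itself, so this would write scan.pdf.
    print(build_tesseract_command("scan.tif", "scan", fmt="pdf"))
```

The resulting list can be passed to `subprocess.run` on a machine where Tesseract is installed.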

Tesseract also supports some languages that FineReader and other commercial engines do not, such as the Indian languages Hindi and Tamil.

To enable all languages, install the Language Pack via the Global Settings Wizard.

The full list of languages supported by Tesseract follows.

  • Afrikaans
  • Arabic
  • AzeriCyrillic
  • Belarusian
  • Bengali
  • Bulgarian
  • Catalan
  • Czech
  • ChineseSimplified
  • ChineseTraditional
  • Cherokee
  • Danish
  • German
  • Greek
  • English
  • EnglishMiddle
  • Esperanto
  • Estonian
  • Basque
  • Finnish
  • French
  • Frankish
  • FrenchMiddle
  • Galician
  • GreekAncient
  • Hebrew
  • Hindi
  • Croatian
  • Hungarian
  • Indonesian
  • Icelandic
  • Italian
  • ItalianOld
  • Japanese
  • Kannada
  • Korean
  • Latvian
  • Lithuanian
  • Malayalam
  • Macedonian
  • Maltese
  • MalayMalaysian
  • DutchStandard
  • NorwegianBokmal
  • Polish
  • PortugueseStandard
  • Romanian
  • Russian
  • Slovak
  • Slovenian
  • Spanish
  • SpanishOld
  • Albanian
  • SerbianLatin
  • Swahili
  • Swedish
  • Tamil
  • Telugu
  • Tagalog
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese
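
The names in the list above are display names. The standalone Tesseract engine identifies each language by a short code that matches its traineddata file (for example `hin.traineddata`). The sketch below maps a small subset of the names to those codes and joins them with `+`, which is how the Tesseract command line accepts several languages in one `-l` argument. The mapping table and the function name `tesseract_lang_argument` are illustrative assumptions; this page does not say how SimpleIndex stores or passes language names internally.

```python
# Illustrative subset of the list above, mapped to Tesseract language codes.
# Extend as needed; the codes follow the names of the traineddata files.
LANGUAGE_CODES = {
    "English": "eng",
    "German": "deu",
    "Hindi": "hin",
    "Tamil": "tam",
    "ChineseSimplified": "chi_sim",
}


def tesseract_lang_argument(names):
    """Join display names into a value for tesseract's -l option.

    Hypothetical helper for illustration; not part of SimpleIndex.
    """
    unknown = [name for name in names if name not in LANGUAGE_CODES]
    if unknown:
        raise KeyError(f"no code known for: {', '.join(unknown)}")
    # Tesseract accepts several languages joined with "+", e.g. "eng+hin".
    return "+".join(LANGUAGE_CODES[name] for name in names)


if __name__ == "__main__":
    print(tesseract_lang_argument(["English", "Hindi"]))
```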