Automatic Text Redaction

From Simple Wiki

Latest revision as of 11:34, 15 December 2023

Automatic Text Redaction automatically obscures private and other sensitive data in PDF files by blacking out the matching text.

Automatic Text Redaction can be enabled on any OCR field type by selecting the Redact Matches option on the Index Field Settings wizard page.

This option is available only if the field type is set to OCR and the Text Source is set to Use Full Page Text.

Processed PDF files must have Full-Page OCR text so that the matching terms can be located on the image. However, if any redacted terms are found, the matching pages are converted to images and their text is discarded. Otherwise, a viewer could still read the value from the text layer.
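
The reason the text is discarded can be illustrated with a minimal Python sketch. It is not the product's implementation; the Page class, the finalize_page function and the sample values are hypothetical names made up for illustration. A page keeps its text layer only when none of the redacted terms occur in it.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Page:
    image: str                  # stand-in for the page's raster data (hypothetical)
    text_layer: Optional[str]   # full-page OCR text, or None once discarded


def finalize_page(page: Page, redacted_terms: Iterable[str]) -> Page:
    """Return the page as it would be saved after redaction.

    The OCR text is needed to search for the terms. If any term is found,
    the returned page is image-only, so the value cannot be read back
    from a hidden text layer.
    """
    if page.text_layer is None:
        raise ValueError("page has no full-page OCR text to search")
    if any(term in page.text_layer for term in redacted_terms):
        return Page(image=page.image, text_layer=None)
    return page


# Only the first page contains a redacted term, so only it loses its text.
pages = [
    Page("page-1.png", "Account 123-45-6789 opened"),
    Page("page-2.png", "Terms and conditions"),
]
saved = [finalize_page(p, ["123-45-6789"]) for p in pages]
```

In this sketch the second page stays searchable, which mirrors the behaviour described above: only pages with matches are converted.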

Automatic Text Redaction uses the Template and Dictionary matching OCR features to locate values on the page. When a match is found, a black box is drawn around the text region to obscure the value. The value is still extracted as an index field in the job, but it can be excluded from the Index Log and other Export values.
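
The matching step can be sketched in Python as follows. This is an illustrative sketch only: the Word class, the find_redactions function, the SSN-style template and the dictionary entry are assumptions for the example and do not reflect the product's actual template syntax or API. Each OCR word carries its position on the page, and every word that matches the template or appears in the dictionary yields the extracted value and the rectangle to black out.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass(frozen=True)
class Word:
    text: str
    x: int
    y: int
    w: int
    h: int


# Hypothetical template: three digits, two digits, four digits.
SSN_TEMPLATE = re.compile(r"\d{3}-\d{2}-\d{4}")


def find_redactions(
    words: List[Word],
    template: "re.Pattern[str]" = SSN_TEMPLATE,
    dictionary: FrozenSet[str] = frozenset(),
) -> List[Tuple[str, Box]]:
    """Return (value, box) pairs for words matching the template or dictionary."""
    hits = []
    for word in words:
        in_dictionary = word.text.lower() in dictionary
        if template.fullmatch(word.text) or in_dictionary:
            box = (word.x, word.y, word.x + word.w, word.y + word.h)
            hits.append((word.text, box))
    return hits


words = [
    Word("SSN:", 10, 20, 40, 12),
    Word("123-45-6789", 55, 20, 90, 12),
    Word("Confidential", 10, 40, 80, 12),
]
hits = find_redactions(words, dictionary=frozenset({"confidential"}))
values = [value for value, _ in hits]   # what would become index field values
boxes = [box for _, box in hits]        # regions to cover with a black box
```

Keeping the values separate from the boxes reflects the behaviour described above: the value remains available as an index field even though its region on the page is obscured.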