Force whole repository text extraction

Symptoms

Searching by content does not return PDF or TIFF documents.

Diagnosis

Go to Administration > Utilities > Repository view and browse the document repository to confirm whether the documents have had their text extracted.

Solution

This may happen when OCR is misconfigured. Go to Administration > Configuration and check the system.ocr configuration property. To verify that it works, go to Administration > Utilities > Check text extraction. Once text extraction is confirmed to work, force a text extraction of the whole repository by running the following SQL statement, which flags every document as not yet extracted, from Administration > Utilities > Database query:

update OKM_NODE_DOCUMENT set NDC_TEXT_EXTRACTED='F' 
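
To see what the statement does before running it on a real repository, here is a minimal sketch on an in-memory SQLite database. Only the table name OKM_NODE_DOCUMENT and the column NDC_TEXT_EXTRACTED come from this article; the DOC_ID column, the sample rows, the 'T' value for already extracted documents and the pending() helper are assumptions made for illustration.

```python
import sqlite3

# Toy stand-in for the real table: only NDC_TEXT_EXTRACTED comes from the
# article; the DOC_ID column and the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table OKM_NODE_DOCUMENT ("
    "DOC_ID integer primary key, NDC_TEXT_EXTRACTED char(1))"
)
# Assumption: 'T' marks a document whose text was already extracted.
conn.executemany(
    "insert into OKM_NODE_DOCUMENT (NDC_TEXT_EXTRACTED) values (?)",
    [("T",), ("T",), ("F",)],
)


def pending(connection):
    """Count documents still flagged as not extracted."""
    row = connection.execute(
        "select count(*) from OKM_NODE_DOCUMENT where NDC_TEXT_EXTRACTED='F'"
    ).fetchone()
    return row[0]


pending_before = pending(conn)

# The statement from the article: flag every document for re-extraction.
conn.execute("update OKM_NODE_DOCUMENT set NDC_TEXT_EXTRACTED='F'")
conn.commit()

pending_after = pending(conn)
print(pending_before, pending_after)
```

Assuming the text extractor works through flagged documents in the background, the same count query, run from Administration > Utilities > Database query, should return a smaller number over time.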

Properties

Date

2017-03-27

Applies to

  • Administration
  • Third-party software integration

Keywords

  • AllVersions