Force whole repository text extraction

Symptoms

PDF or TIFF documents cannot be found when searching by content.

Diagnosis

Go to Administration > Utilities > Repository view and browse the document repository to confirm whether text has been extracted from the documents.

Solution

This may happen when OCR is misconfigured. Go to Administration > Configuration and check the system.ocr configuration property. To verify that it works properly, go to Administration > Utilities > Check text extraction. Once you have verified that text extraction works, force a whole-repository text extraction by running this SQL statement from Administration > Utilities > Database query:

update OKM_NODE_DOCUMENT set NDC_TEXT_EXTRACTED='F' 
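
To see how many documents are flagged before and after running the update, a grouped count on the same column may help. This is a minimal sketch that uses only the table and column named in the statement above. Which values other than 'F' appear in the result is not confirmed here, and the comment line can be dropped if the query tool does not accept it.

```sql
-- Count documents per extraction flag; 'F' is the value set by the update above
select NDC_TEXT_EXTRACTED, count(*) from OKM_NODE_DOCUMENT group by NDC_TEXT_EXTRACTED
```

If the number of rows flagged 'F' goes down on later runs, that suggests the documents are being processed again.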

Properties

Date

2017-03-27

Applies to

  • Administration
  • Thirdparty software integration

Keywords

  • AllVersions