Creating your own OCR Data Capture plugin

You can create your own OCR Data Capture plugin.

Conditions:

  • The new OCR Data Capture class must implement the "OCRTemplateParser" interface.
  • The new OCR Data Capture class must be declared in the package "com.openkm.plugin.ocr.template.parser".
  • The new OCR Data Capture class must be annotated with "@PluginImplementation".
  • The new OCR Data Capture class must extend "BasePlugin".

OCR Data Capture interface:

package com.openkm.plugin.ocr.template;

import com.openkm.db.bean.OCRTemplateField;
import net.xeoh.plugins.base.Plugin;

/**
 * OCRTemplateParser
 */
public interface OCRTemplateParser extends Plugin {

    Object parse(OCRTemplateField otf, String text) throws OCRTemplateException, OCRParserEmptyValueException;

    String getName();

    boolean isPatternRequired();
}

The new class must be declared in the package com.openkm.plugin.ocr.template.parser because the application plugin system looks for parser plugins there.

Do not omit the @PluginImplementation annotation; without it, the application plugin system cannot discover the new class.

More information: Register a new plugin.

Method descriptions

Method | Type | Description

parse(OCRTemplateField otf, String text)

Object

Returns the captured data value as an Object.

Allowed Object classes:

  • String
  • Calendar (will be converted to an ISO 8601 string)
  • Other (will be converted to a string with String.valueOf(object)).

getName()

String

Returns the name shown in the selector list of the administrator user interface.

isPatternRequired()

boolean

Returns true if the pattern field is required.

If the pattern is required and the user has not set it, the administrator user interface shows a 'missing pattern' error.
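The return-value rules listed for parse(OCRTemplateField otf, String text) can be illustrated with a small standalone sketch. It is not the application's own conversion code: the class name CapturedValueFormatter, the method toStoredString, and the exact ISO 8601 layout (offset date-time) are assumptions made for illustration only.

```java
import java.time.format.DateTimeFormatter;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper, not part of the application API: mimics the documented
// handling of the three kinds of value a parser may return.
class CapturedValueFormatter {

    static String toStoredString(Object value) {
        if (value instanceof String) {
            // Strings are kept as they are
            return (String) value;
        }

        if (value instanceof Calendar) {
            // Calendars become an ISO 8601 string; the offset date-time layout
            // used here is an assumption, the application may use another one
            Calendar cal = (Calendar) value;
            return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(
                    cal.toInstant().atZone(cal.getTimeZone().toZoneId()));
        }

        // Any other class falls back to String.valueOf(object)
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(2024, Calendar.MARCH, 15, 0, 0, 0);

        System.out.println(toStoredString("ACME Ltd"));
        System.out.println(toStoredString(cal));
        System.out.println(toStoredString(42)); // prints "42"
    }
}
```

In practice this means a parser that extracts a date can return a Calendar and leave the formatting to the application, while numeric parsers can return any object whose String.valueOf representation is the value to store.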

Example

StringParser class:

package com.openkm.plugin.ocr.template.parser;

import com.openkm.db.bean.OCRTemplateField;
import com.openkm.plugin.BasePlugin;
import com.openkm.plugin.ocr.template.OCRParserEmptyValueException;
import com.openkm.plugin.ocr.template.OCRTemplateException;
import com.openkm.plugin.ocr.template.OCRTemplateParser;
import net.xeoh.plugins.base.annotations.PluginImplementation;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * StringParser
 */
@PluginImplementation
public class StringParser extends BasePlugin implements OCRTemplateParser {

    @Override
    public Object parse(OCRTemplateField otf, String text) throws OCRTemplateException, OCRParserEmptyValueException {
        // Nothing was captured from the OCR zone
        if (text == null || text.isEmpty()) {
            throw new OCRParserEmptyValueException("Empty value");
        }

        String regex = otf.getPattern();

        if (regex == null || regex.isEmpty()) {
            // No pattern configured: return the whole captured text, trimmed
            return text.trim();
        } else {
            // Wrap the configured pattern in a single capturing group
            Pattern pattern = Pattern.compile("(" + regex + ")", Pattern.UNICODE_CASE);
            Matcher matcher = pattern.matcher(text);

            // Accept the match only if the configured pattern adds no capturing groups of its own
            if (matcher.find() && matcher.groupCount() == 1) {
                return matcher.group();
            } else {
                throw new OCRTemplateException("Bad format, parse exception");
            }
        }
    }

    @Override
    public String getName() {
        return "String";
    }

    @Override
    public boolean isPatternRequired() {
        return false;
    }
}
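To see how the pattern handling in StringParser behaves, the matching step can be tried in isolation, without any application classes. The sketch below copies only the regular expression logic; the class name PatternCaptureDemo, the method capture, and the sample invoice text are hypothetical and exist only for this illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone demo of the matching logic used in StringParser.
class PatternCaptureDemo {

    // Returns the first match of userPattern in text, or empty when the
    // StringParser logic would throw "Bad format, parse exception".
    static Optional<String> capture(String userPattern, String text) {
        Pattern pattern = Pattern.compile("(" + userPattern + ")", Pattern.UNICODE_CASE);
        Matcher matcher = pattern.matcher(text);

        // groupCount() counts the capturing groups in the compiled pattern,
        // so it is 1 only when userPattern has no capturing groups of its own
        if (matcher.find() && matcher.groupCount() == 1) {
            return Optional.of(matcher.group());
        }

        return Optional.empty();
    }

    public static void main(String[] args) {
        String text = "Invoice INV-1234 total 99";

        System.out.println(capture("INV-\\d+", text));      // prints "Optional[INV-1234]"
        System.out.println(capture("(?:INV)-\\d+", text));  // prints "Optional[INV-1234]"
        System.out.println(capture("(INV)-(\\d+)", text));  // prints "Optional.empty"
        System.out.println(capture("XYZ", text));           // prints "Optional.empty"
    }
}
```

Two points follow from this. A pattern that contains its own capturing groups is rejected even when it matches, so non-capturing groups such as (?:...) are the safer choice when writing patterns for this parser. Also, according to the java.util.regex.Pattern documentation, UNICODE_CASE only affects matching when combined with CASE_INSENSITIVE, so matching here remains case-sensitive.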