Configuring Lucene Analyzer

Depending on the language used in the documents and properties, you can obtain better search results by configuring a proper Lucene Analyzer.

By default, OpenKM uses org.apache.lucene.analysis.standard.StandardAnalyzer, which works well for English and most other languages. An analyzer written for your language usually gives better results.

Some analyzers:

org.apache.lucene.analysis.en.EnglishAnalyzer
org.apache.lucene.analysis.es.SpanishAnalyzer
org.apache.lucene.analysis.fr.FrenchAnalyzer
org.apache.lucene.analysis.it.ItalianAnalyzer
org.apache.lucene.analysis.de.GermanAnalyzer
org.apache.lucene.analysis.el.GreekAnalyzer
org.apache.lucene.analysis.hi.HindiAnalyzer

Special analyzers:

com.openkm.search.lucene.analysis.AccentInsensitiveAnalyzer

More information is available at the Lucene documentation site and the Guide to Lucene Analyzers.

If you want searches to treat accented and unaccented letters as the same, you can try the AccentInsensitiveAnalyzer.
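To illustrate what accent-insensitive matching means in practice, here is a minimal sketch that uses only the Java standard library. It is not the code of AccentInsensitiveAnalyzer; the class name AccentFoldingSketch and the method fold are hypothetical, and the sketch only assumes that such an analyzer reduces accented and unaccented spellings to the same term.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical illustration only; not part of OpenKM or Lucene.
final class AccentFoldingSketch {

    // Decompose accented characters (NFD), drop the combining marks,
    // and lowercase the result so both spellings reduce to one term.
    static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Cami\u00f3n" is "Camión" written with a Unicode escape.
        System.out.println(fold("Cami\u00f3n").equals(fold("camion"))); // prints true
    }
}
```

With folding of this kind applied both at index time and at query time, a search for "camion" would also find documents that contain "camión".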

If you are working with East Asian languages such as Chinese or Japanese, several analyzers are available; see the Lucene documentation. You can also try ik-analyzer.

If you only need text to be split on whitespace, you can try the WhitespaceAnalyzer.
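As a rough picture of whitespace-only tokenization, the following sketch splits text on runs of whitespace and leaves case and punctuation untouched. It is a plain Java approximation, not Lucene's WhitespaceAnalyzer; the class name WhitespaceTokenSketch and the method whitespaceTokens are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of OpenKM or Lucene.
final class WhitespaceTokenSketch {

    // Split only on runs of whitespace; case and punctuation are kept as-is.
    static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return List.of();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        // Prints [Invoice-2024, FINAL.pdf]
        System.out.println(whitespaceTokens("Invoice-2024   FINAL.pdf"));
    }
}
```

Because nothing is lowercased or stripped, tokens produced this way would only match queries that use exactly the same case and punctuation. This suits identifiers and codes better than natural-language text.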

If you have not configured the search analyzer before starting OpenKM for the first time, the Lucene indexes will be created using the default analyzer.

If you want to change this configuration property after the OpenKM repository has been created, you need to rebuild the Lucene indexes.

Once the operation has been completed, the Lucene indexes will be using the new analyzer.

For more information, take a look at Rebuild indexes.

Configure an Analyzer

Edit the $TOMCAT_HOME/openkm.properties file and add the line:

spring.jpa.properties.hibernate.search.backend.analyzer=org.apache.lucene.analysis.es.SpanishAnalyzer

To ignore the distinction between accented and unaccented letters, use the accent-insensitive analyzer instead:

spring.jpa.properties.hibernate.search.backend.analyzer=com.openkm.search.lucene.analysis.AccentInsensitiveAnalyzer

The changes will take effect after restarting the application.
Remember to rebuild the Lucene indexes after the application restarts.
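Before restarting, it can help to confirm which analyzer the properties file actually names. The sketch below is a hypothetical helper, not an OpenKM tool: it parses properties text with java.util.Properties and returns the value of the key shown above, and it assumes that the StandardAnalyzer default described earlier applies when the key is absent.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for checking the setting; not part of OpenKM.
final class AnalyzerSettingSketch {

    static final String KEY =
            "spring.jpa.properties.hibernate.search.backend.analyzer";

    // Default analyzer named earlier in this page.
    static final String DEFAULT =
            "org.apache.lucene.analysis.standard.StandardAnalyzer";

    // Parse the text of openkm.properties and return the analyzer class name,
    // or the default when the key is not present.
    static String configuredAnalyzer(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(KEY, DEFAULT).trim();
    }

    public static void main(String[] args) {
        String text = KEY + "=org.apache.lucene.analysis.es.SpanishAnalyzer\n";
        // Prints org.apache.lucene.analysis.es.SpanishAnalyzer
        System.out.println(configuredAnalyzer(text));
    }
}
```

To check a real installation, the file contents could be read with java.nio.file.Files.readString and passed to configuredAnalyzer.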

OpenKM 8.1