KEA Summarization

The OpenKM keyphrase extraction summarization service is open-source software distributed under the GNU Affero General Public License.

The OpenKM KEA Summarization service is based on KEA, an algorithm for extracting keyphrases from text documents. It can be used either for free indexing or for indexing with a controlled vocabulary.

Keywords and keyphrases (multi-word units) are widely used in large document collections.

They describe the content of single documents and provide a kind of semantic metadata that is useful for a wide variety of purposes.

The task of assigning keyphrases to a document is called "keyphrase indexing". For example, academic papers are often accompanied by a set of keyphrases freely chosen by the author.

In libraries, professional indexers select keyphrases from a controlled vocabulary (also called "Subject Headings") according to defined cataloguing rules. On the Internet, digital libraries and other data repositories also use keyphrases (there called content tags or content labels) to organize their data and provide thematic access to it.
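The difference between free indexing and indexing with a controlled vocabulary can be illustrated with a minimal sketch. This is not the KEA algorithm and does not reproduce its scoring: it simply ranks candidate phrases by raw frequency. The function names, the stopword list, and the sample vocabulary are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "is", "to", "with", "or"}


def candidate_phrases(text, max_words=3):
    """Return word n-grams (1..max_words) that neither start nor end with a stopword.

    Simplification: n-grams are allowed to span sentence boundaries.
    """
    words = re.findall(r"[a-z]+", text.lower())
    candidates = []
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            candidates.append(" ".join(gram))
    return candidates


def index_document(text, vocabulary=None, top_n=5):
    """Rank candidate phrases by raw frequency (an assumption, not KEA's scoring).

    vocabulary=None  -> free indexing: any candidate phrase may be returned.
    vocabulary=<set> -> controlled indexing: only vocabulary terms are returned.
    """
    counts = Counter(candidate_phrases(text))
    if vocabulary is not None:
        allowed = {term.lower() for term in vocabulary}
        counts = Counter({p: c for p, c in counts.items() if p in allowed})
    # Highest count first; ties are broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in ranked[:top_n]]


text = (
    "Digital libraries use keyphrase indexing. "
    "Keyphrase indexing helps digital libraries organize documents."
)

# Free indexing: phrases are drawn from the document itself.
free_terms = index_document(text)

# Controlled indexing: only terms from the (hypothetical) vocabulary qualify.
subject_headings = {"Digital libraries", "Keyphrase indexing", "Machine learning"}
controlled_terms = index_document(text, vocabulary=subject_headings)
print(controlled_terms)  # ['digital libraries', 'keyphrase indexing']
```

In the free case any phrase found in the text can become a keyphrase, while in the controlled case a vocabulary term such as "Machine learning" is never assigned unless it actually occurs in the document, and phrases outside the vocabulary are discarded.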

KEA was developed at the University of Waikato, in the Digital Libraries and Machine Learning Labs of the Computer Science Department, by Eibe Frank and Olena Medelyan.
