RapidMiner Information Extraction Plugin


An ever-growing amount of information is spread across the internet and other large document collections.
This information appears on websites (as plain text and as HTML code), in documents such as PDF files, in log files, and so on.
Processing this huge, daily growing amount of information manually is impossible.

Therefore, information extraction (IE) techniques are used to automatically identify selected types of entities, relations, or events in free text.
While some IE systems handle IE tasks such as Named Entity Recognition (NER) as a black box, we present a highly modular system that can easily be adjusted and extended for both known and new tasks.



Jungermann, Felix


