Selected Projects

SFB 876

  • Funding period: since 01/2011 (DFG)
  • Speaker: Prof. Dr. Katharina Morik
  • URL: SFB 876

The collaborative research center SFB 876 brings together data mining and embedded systems. On the one hand, embedded systems can be further improved using machine learning. On the other hand, data mining algorithms can be realized in hardware, e.g. FPGAs, or run on GPGPUs. The restrictions of ubiquitous systems in computing power, memory, and energy demand new algorithms for known learning tasks. These resource-bounded learning algorithms may also be applied to extremely large databases on servers.

Selected Publications

Aartsen/etal/2014a Aartsen, M. G. and Abbasi, R. and Abdou, Y. and Ackermann, M. and Adams, J. and Aguilar, J. A. and Ahlers, M. and Altmann, D. and Auffenberg, J. and Bai, X. et al. Improvement in fast particle track reconstruction with robust statistics. In Nuclear Instruments and Methods in Physics Research A, Vol. 736, pages 143-149, 2014.
Lee/etal/2014a Sangkyun Lee and Jörg Rahnenführer and Michel Lang and Katleen de Preter and Pieter Mestdagh and Jan Koster and Rogier Versteeg and Raymond Stallings and Luigi Varesio and Shahab Asgharzadeh and Johannes Schulte and Kathrin Fielitz and Melanie Heilmann and Katharina Morik and Alexander Schramm. Robust Selection of Cancer Survival Signatures from High-Throughput Genomic Data Using Two-Fold Subsampling. In PLoS ONE, Vol. 9, pages e108818, 2014.
Lee/Poelitz/2014a Lee, Sangkyun and Pölitz, Christian. Kernel Completion for Learning Consensus Support Vector Machines in Bandwidth-Limited Sensor Networks. In International Conference on Pattern Recognition Applications and Methods, 2014.
Piatkowski/etal/2014a Piatkowski, Nico and Lee, Sangkyun and Morik, Katharina. The Integer Approximation of Undirected Graphical Models. In De Marsico, Maria and Tabbone, Antoine and Fred, Ana (editors), ICPRAM 2014 - Proceedings of the 3rd International Conference on Pattern Recognition Applications and Methods, ESEO, Angers, Loire Valley, France, 6-8 March, 2014, pages 296--304, SciTePress, 2014.
Bhaduri/Stolpe/2013a Bhaduri, Kanishka and Stolpe, Marco. Distributed Data Mining in Sensor Networks. In Aggarwal, Charu C. (editor), Managing and Mining Sensor Data, Berlin, Heidelberg, Springer, 2013.
Lee/Wright/2013a Lee, Sangkyun and Wright, Stephen J. Stochastic Subgradient Estimation Training for Support Vector Machines. In Latorre Carmona, Pedro and Sánchez, J. Salvador and Fred, Ana L.N. (editors), Mathematical Methodologies in Pattern Recognition and Machine Learning, Vol. 30, pages 67--82, Springer, 2013.
Bockermann/Blom/2012a Christian Bockermann and Hendrik Blom. Processing Data Streams with the RapidMiner Streams-Plugin. In Proceedings of the 3rd RapidMiner Community Meeting and Conference, 2012.
Lee/etal/2012a Lee, S. and Stolpe, M. and Morik, K. Separable Approximate Optimization of Support Vector Machines for Distributed Sensing. In Flach, Peter A. and De Bie, Tijl and Cristianini, Nello (editors), Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2012, Bristol, UK, September 24-28, 2012. Proceedings, Part II, Vol. 7524, pages 387--402, Springer, 2012.
Michaelis/etal/2012a Michaelis, Stefan and Piatkowski, Nico and Morik, Katharina. Predicting Next Network Cell IDs for Moving Users with Discriminative and Generative Models. In Mobile Data Challenge by Nokia Workshop in conjunction with Int. Conf. on Pervasive Computing, Newcastle, UK, 2012.
Piatkowski/2012a Piatkowski, Nico. iST-MRF: Interactive Spatio-Temporal Probabilistic Models for Sensor Networks. In International Workshop at ECML PKDD 2012 on Instant Interactive Data Mining (IID), 2012.
Lee/Bockermann/2011a Lee, Sangkyun and Bockermann, Christian. Scalable stochastic gradient descent with improved confidence. In Big Learning -- Algorithms, Systems, and Tools for Learning at Scale, 2011.
Stolpe/Morik/2011a Stolpe, M. and Morik, K. Learning from Label Proportions by Optimizing Cluster Model Selection. In Gunopulos, Dimitrios and Hofmann, Thomas and Malerba, Donato and Vazirgiannis, Michalis (editors), Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2011, Athens, Greece, September 5-9, 2011, Proceedings, Part III, pages 349--364, Springer, 2011.

VaVeL: Variety, Veracity, VaLue

  • Duration: since 03/2016
  • Partners: National and Kapodistrian University of Athens, TU Dortmund University, IBM, Technion - Israel Institute of Technology, Fraunhofer IAIS, Dublin City Council, WUT - Warsaw University of Technology, City of Warsaw, OPL, AGT International
  • URL: http://www.vavel-project.eu/

Urban environments are awash with data from fixed and mobile sensors and monitoring infrastructures from public, private, or industry sources. Making such data useful would enable developing novel big data applications to benefit the citizens of Europe in areas such as transportation, infrastructures, and crime prevention. Urban data is heterogeneous, noisy, and unlabeled, which severely reduces its usability. Succinctly stated, urban data is difficult to understand. The goal of the VaVeL project is to radically advance our ability to use urban data in applications that can identify and address citizen needs and improve urban life. Our motivation comes from problems in urban transportation. This project will develop a general-purpose framework for managing and mining multiple heterogeneous urban data streams so that cities become more efficient, productive, and resilient. The framework will be able to solve major issues that arise with urban transportation-related data and are currently not dealt with by existing stream management technologies. The project brings together two European cities that provide diverse large-scale data of cross-country origin and real application needs, three major European companies in this space, and a group of researchers with strong expertise in analyzing real-life urban data. VaVeL aims at making fundamental advances in addressing the most critical inefficiencies of current (big) data management and stream frameworks in coping with emerging urban sensor data, thus making European urban data more accessible and easier to use and enhancing European industries that use big data management and analytics. The consortium develops concrete, end-user-driven scenarios that address real, important problems with the potential for enormous impact and a large spectrum of technology requirements, thus enabling the realization of the fundamental capabilities required and a realistic evaluation of the success of our methods.

Selected Publications

Liebig/2017a Liebig, Thomas. Smart navigation - chances, risk and challenges. In M. Jankowska and M. Pawelczyk and S. Augustyn and M. Kulawiak (editors), Navigation and Earth Observation - Law & Technology, pages (accepted), Warsaw, IUS PUBLICUM, 2017.
Liebig/2017b Liebig, Thomas. Report on Data Privacy. No. H2020-688380 D4.1, VAVEL Consortium, Dortmund, Germany, 2017.
Liebig/etal/2017a Liebig, Thomas and Peter, Sebastian and Grzenda, Maciej and Junosza-Szaniawski, Konstanty. Dynamic Transfer Patterns for Fast Multi-modal Route Planning. In Bregt, Arnold and Sarjakoski, Tapani and van Lammeren, Ron and Rip, Frans (editors), Societal Geo-innovation: Selected papers of the 20th AGILE conference on Geographic Information Science, pages 223--236, Cham, Springer, 2017.
Liebig/etal/2017b Thomas Liebig and Nico Piatkowski and Christian Bockermann and Katharina Morik. Dynamic Route Planning with Real-Time Traffic Predictions. In Information Systems, Vol. 64, pages 258--265, Elsevier, 2017.
Liebig/Sotzny/2017a Thomas Liebig and Maurice Sotzny. On Avoiding Traffic Jams with Dynamic Self-Organizing Trip Planning. In Eliseo Clementini, Maureen Donnelly, May Yuan, Christian Kray, Paolo Fogliaroni, and Andrea Ballatore (editors), 13th International Conference on Spatial Information Theory (COSIT 2017), Vol. 86, pages 17:1--17:12, Dagstuhl, Germany, Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik, 2017.
Shaik/etal/2017a Nayabrasul Shaik and Thomas Liebig and Christopher Kirsch and Heinrich Müller. Dynamic map update of non-static facility logistics environment with a multi-robot system. In Proceedings of the 40th German Conference on Artificial Intelligence, pages (accepted), Springer, 2017.
Souto/Liebig/2016a Gustavo Souto and Thomas Liebig. On Event Detection from Spatial Time Series for Urban Traffic Applications. In Stefan Michaelis and Nico Piatkowski and Marco Stolpe (editors), Solving Large Scale Learning Tasks: Challenges and Algorithms, Vol. 9580, pages 221--233, Springer, 2016.
Liebig/2015b Liebig, Thomas. Analysis Methods and Privacy Aspects in Spatio-Temporal Data Mining. In Marlena Jankowska and Miroslaw Pawelczyk and Sylvie Allouche and Marcin Kulawiak (editors), AI: Philosophy, Geoinformatics & Law, pages (to appear), Warsaw, IUS PUBLICUM, 2015.

INSIGHT - INtelligent Synthesis and Real-tIme Response using Massive StreaminG of HeTerogeneous Data

  • Duration: 09/2012 - 08/2015
  • Partners: National and Kapodistrian University of Athens (coordinator), TU Dortmund University, IBM, Technion - Israel Institute of Technology, Fraunhofer IAIS, Dublin City Council, German Federal Office of Civil Protection and Disaster Assistance (BBK)
  • URL: http://www.insight-ict.eu/

The instrumentation of the world with diverse sensors, smart phones, and social networks acquires exascale data that offer the potential of enhanced science and services. In particular, a better societal management of the overall cycle of disaster monitoring and response becomes possible, citizens may now become involved in decision making and data acquisition (crowd-sourcing), and advanced planning can conserve resources. Current systems are limited in three important respects: (i) methods for handling heterogeneous data streams in real time are lacking, (ii) social computing is not integrated with big data analysis, and (iii) real-time prediction and alarm capabilities have not yet been incorporated into the infrastructure for intelligent management. The goal of the INSIGHT project is to radically advance our ability to cope with emergency situations in smart cities by developing innovative technologies, methodologies, and systems that will put new capabilities in the hands of disaster planners and city personnel to improve emergency planning and response.

Selected Publications

Panagiotou/etal/2016b Nikolaos Panagiotou and Ioannis Katakis and Dimitrios Gunopulos. Detecting Events in Online Social Networks: Definitions, Trends and Challenges. In Solving Large Scale Learning Tasks. Challenges and Algorithms - Essays Dedicated to Katharina Morik on the Occasion of Her 60th Birthday, pages 42--84, 2016.
Souto/Liebig/2016a Gustavo Souto and Thomas Liebig. On Event Detection from Spatial Time Series for Urban Traffic Applications. In Stefan Michaelis and Nico Piatkowski and Marco Stolpe (editors), Solving Large Scale Learning Tasks: Challenges and Algorithms, Vol. 9580, pages 221--233, Springer, 2016.
Artikis/etal/2014a Alexander Artikis and Matthias Weidlich and Francois Schnitzler and Ioannis Boutsis and Thomas Liebig and Nico Piatkowski and Christian Bockermann and Katharina Morik and Vana Kalogeraki and Jakub Marecek and Avigdor Gal and Shie Mannor and Dimitrios Gunopulos and Dermot Kinane. Heterogeneous Stream Processing and Crowdsourcing for Urban Traffic Management. In Proceedings of the 17th International Conference on Extending Database Technology, 2014.
Kinane/etal/2014a Dermot Kinane and François Schnitzler and Shie Mannor and Thomas Liebig and Katharina Morik and Jakub Marecek and Bernard Gorman and Nikolaos Zygouras and Yannis Katakis and Vana Kalogeraki and Dimitrios Gunopulos. Intelligent Synthesis and Real-time Response using Massive Streaming of Heterogeneous Data (INSIGHT) and its anticipated effect on Intelligent Transport Systems (ITS) in Dublin City, Ireland. In Proceedings of the 10th ITS European Congress, Helsinki, pages (to appear), 2014.
Liebig/etal/2014b Liebig, Thomas and Andrienko, Gennady and Andrienko, Natalia. Methods for Analysis of Spatio-Temporal Bluetooth Tracking Data. In Journal of Urban Technology, Vol. 21, No. 2, pages 27--37, Taylor and Francis, 2014.
Liebig/etal/2014d Thomas Liebig and Nico Piatkowski and Christian Bockermann and Katharina Morik. Route Planning with Real-Time Traffic Predictions. In Proceedings of the LWA 2014 Workshops: KDML, IR, FGWM, pages 83-94, 2014.
Schnitzler/etal/2014b Schnitzler, Francois and Artikis, Alexander and Weidlich, Matthias and Boutsis, Ioannis and Liebig, Thomas and Piatkowski, Nico and Bockermann, Christian and Morik, Katharina and Kalogeraki, Vana and Marecek, Jakub and Gal, Avigdor and Mannor, Shie and Kinane, Dermot and Gunopulos, Dimitrios. Heterogeneous Stream Processing and Crowdsourcing for Traffic Monitoring: Highlights. In Proceedings of the European Conference on Machine Learning (ECML), Nectar Track, pages 520-523, Springer, 2014.
Schnitzler/etal/2014c Francois Schnitzler and Thomas Liebig and Shie Mannor and Gustavo Souto and Sebastian Bothe and Hendrik Stange. Heterogeneous Stream Processing for Disaster Detection and Alarming. In IEEE International Conference on Big Data, pages 914-923, IEEE Press, 2014.
Liebig/etal/2013a T. Liebig and Z. Xu and M. May. Incorporating Mobility Patterns in Pedestrian Quantity Estimation and Sensor Placement. In J. Nin and D. Villatoro (editors), Proceedings of the First International Workshop on Citizen Sensor Networks CitiSens 2012, LNAI 7685, pages 67--80, Springer, 2013.
Roesler/Liebig/2013a Rösler, Roberto and Liebig, Thomas. Using Data from Location Based Social Networks for Urban Activity Clustering. In Vandenbroucke, Danny and Bucher, Bénédicte and Crompvoets, Joep (editors), Geographic Information Science at the Heart of Europe, pages 55--72, Springer, 2013.


ViSTA-TV

  • Start: 06/2012
  • Partners: University of Zurich (Coordinator), TU Dortmund University, Rapid-I GmbH, Zattoo Europa AG, Vrije Universiteit Amsterdam, BBC
  • URL: Vista-TV.eu

Live video content is increasingly consumed over IP networks in addition to traditional broadcasting. The move to IP provides a huge opportunity to discover what people are watching in much greater breadth and depth than currently possible through interviews or set-top box based data gathering by rating organizations, because it allows direct analysis of consumer behavior via the logs they produce. The ViSTA-TV project proposes to gather consumers’ anonymized viewing behavior and the actual video streams from broadcasters/IPTV-transmitters, to combine them with enhanced electronic program guide information as the input for a holistic live-stream data mining analysis.
ViSTA-TV will employ the gathered information via a stream-analytics process to generate a high-quality linked open dataset (LOD) describing live TV programming. Combining the LOD with the behavioral information gathered, ViSTA-TV will be in a position to provide highly accurate market research information about viewing behavior that can be used for a variety of analyses of high interest to all participants in the TV industry. ViSTA-TV will also employ the information gathered to build a recommendation service that exploits both usage information and personalized feature extraction in conjunction with existing metadata to provide real-time viewing recommendations.
These results will be made possible by scientific progress in data-stream mining consisting of advances in data mining for tagging, recommendations, and behavioral analyses and temporal/probabilistic RDF-triple stream processing.

ViSTA-TV is a European Union-funded research project, beginning on 1 June 2012, and lasting for two years.

KobRA - Korpus-basierte linguistische Recherche und Analyse mit Hilfe von Data-Mining (Corpus-Based Linguistic Research and Analysis Using Data Mining)

  • Duration: 09/2012 - 08/2015
  • Participants: Prof. Dr. Angelika Storrer, Prof. Dr. Katharina Morik, Prof. Dr. Erhard Hinrichs, Dr. Alexander Geyken, Dr. Marc Kupietz, Dr. Andreas Witt
  • URL: KoBRA

Corpus-based linguistics has developed into an important area of language research in recent years. Infrastructure projects such as CLARIN provide extensive, structured language resources (text corpora, treebanks, lexical wordnets) that offer novel and attractive opportunities to investigate linguistic questions on authentic language-use data and to evaluate them quantitatively.

The goal of the project is to improve the possibilities of empirical linguistic work with structured language resources by applying innovative data mining methods (in particular, machine learning methods).

DDMD - Data Driven Material Development

This project aims to advance the systematic design of new materials through interdisciplinary collaboration between materials science and computer science. The new branch of science is called “Data Driven Materials Development” (German: “Datengetriebene Materialentwicklung”). In this field, the aim is both to gain new discoveries and insights, e.g. about previously unknown phases or special physical properties of materials, and to accelerate the development of new materials. To this end, two materials research chairs at RUB, working on the synergistic use of experimental high-throughput methods and analytical modeling, collaborate with two computer science chairs at TU Dortmund University and the University of Duisburg-Essen, working on data mining and high-throughput analysis, respectively. This is necessary because systematic materials research, particularly in the areas of thin-film materials libraries, property screenings, and “Advanced Materials Simulation”, produces very large and high-dimensional data sets that can only be analyzed efficiently with the help of novel data analysis methods and corresponding computing resources.

SFB 475 - Project A4

  • Duration: since 07/1997 (DFG)
  • Project Leader: Prof. Dr. Katharina Morik, Prof. Dr. Claus Weihs
  • Staff: Thorsten Joachims, Stefan Rüping, Ralf Klinkenberg, Ingo Mierswa, Martin Scholz, Michael Wurst
  • URL: SFB 475 - A4

The aim of project A4 is to combine statistical methods and machine learning methods in order to improve Knowledge Discovery in Databases (KDD). After the knowledge discovery process was examined as a whole in the last period, the current period focuses on two important problems that often occur in the practice of knowledge discovery: the analysis of temporal phenomena in the form of events, and the application of experimental design. Research on these problems promises a special synergy effect from the combination of statistical methods and machine learning methods. Additionally, the project places emphasis on the applied analysis of real databases.

Selected Publications

Mierswa, Ingo and Morik, Katharina. Automatic Feature Extraction for Classifying Audio Data. Machine Learning Journal, 58, 127-149, 2005.
Mierswa, Ingo and Wurst, Michael. Efficient Case Based Feature Construction for Heterogeneous Learning Tasks. In Proceedings of the European Conference on Machine Learning (ECML), Springer-Verlag, Berlin, 641-648, 2005.
Morik, Katharina and Siebes, Arno and Boulicault, Jean-François (editors). Detecting Local Patterns, Springer Lecture Notes in Artificial Intelligence, Volume 3539, Springer-Verlag, Berlin, 2005.
Rüping, Stefan and Scheffer, Tobias (editors). Proceedings of the ICML 2005 Workshop on Learning with Multiple Views, 2005.
Scholz, Martin. Sampling-Based Sequential Subgroup Mining. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Databases (KDD), 265-274, 2005.
Klinkenberg, Ralf and Rüping, Stefan. Concept Drift and the Importance of Examples. In Franke, Jürgen and Nakhaeizadeh, Gholamreza and Renz, Ingrid (editors), Text Mining - Theoretical Aspects and Applications, pages 55--77, Physica-Verlag, Berlin, 2003.
Morik, Katharina and Rüping, Stefan. A Multistrategy Approach to the Classification of Phases in Business Cycles. In Proceedings of the European Conference on Machine Learning (ECML), Springer-Verlag, 307-318, 2002.
Joachims, Thorsten. Estimating the Generalization Performance of a SVM Efficiently. In Proceedings of the International Conference on Machine Learning (ICML), Morgan Kaufmann, 431-438, 2000.
Joachims, Thorsten. Making Large-Scale SVM Learning Practical. In Advances in Kernel Methods - Support Vector Learning. MIT Press, 1999.
Joachims, Thorsten. Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the European Conference on Machine Learning (ECML), Springer-Verlag, 137-142, 1998.


KDUbiq

  • Duration: since 01/2006 (EU)
  • Project Leader: Fraunhofer Institute for Intelligent Autonomous Systems
  • Staff: Katharina Morik, Sebastian Land
  • URL: http://www.kdubiq.org

KDUbiq brings together newly emerging research in ubiquitous knowledge discovery. This multi-disciplinary approach constitutes a paradigm shift for the field of knowledge discovery since the idea of standalone analysis tools is abandoned in favour of process integrated, distributed and autonomous analysis systems.

SFB 531 - Project B5

  • Duration: 01/2000 - 12/2002 (DFG)
  • Project Leader: Prof. Dr. Katharina Morik
  • Staff: Oliver Ritthoff, Ralf Klinkenberg, Ingo Mierswa
  • URL: SFB 531 - B5

The goal of this project is the identification and formalization of practically relevant learning tasks on the basis of applications in the C-projects. Particular attention is given to learning tasks that deviate from the standard scenario of classification or optimization, e.g., learning with non-factual knowledge, repeated learning of similar concepts, learning of temporally varying concepts, and feature selection/construction. In this context, the problem of feature selection/construction will be a central aspect of the investigations.

Selected Publications

Klinkenberg, Ralf. Learning Drifting Concepts: Example Selection vs. Example Weighting. In Intelligent Data Analysis (IDA), Special Issue on Incremental Learning Systems Capable of Dealing with Concept Drift, Vol. 8, No. 3, 2004.
Klinkenberg, Ralf and Rüping, Stefan. Concept Drift and the Importance of Examples. In Franke, Jürgen and Nakhaeizadeh, Gholamreza and Renz, Ingrid (editors), Text Mining -- Theoretical Aspects and Applications, pages 55-77, Berlin, Germany, Physica-Verlag, 2003.
Ritthoff, Oliver and Klinkenberg, Ralf. Evolutionary Feature Space Transformation using Type-Restricted Generators. In Cantu-Paz, E. et al. (editors), Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2003) - Part II, pages 1606-1607, Springer, 2003.
Ritthoff, Oliver and Klinkenberg, Ralf and Fischer, Simon and Mierswa, Ingo. A Hybrid Approach to Feature Selection and Generation Using an Evolutionary Algorithm. In Bullinaria, John A. (editor), Proceedings of the 2002 U.K. Workshop on Computational Intelligence (UKCI-02), pages 147-154, Birmingham, UK, University of Birmingham, 2002.
Klinkenberg, Ralf and Joachims, Thorsten. Detecting concept drift with support vector machines. In P. Langley (editor), Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 487-494. Morgan Kaufmann, San Francisco, CA, USA, 2000.

SFB 531 - Project C11

  • Duration: 01/2003 - 12/2005 (DFG)
  • Project Leader: Prof. Dr. Katharina Morik, Prof. Dr. Henner Schmidt-Traub
  • Staff: Dipl.-Ing. Bernd Hicking, Dipl.-Inform. Hanna Köpcke, Dipl.-Inform. Ingo Mierswa, Dipl.-Inform. Oliver Ritthoff
  • URL: SFB 531 - C11

The goal of this project is to find optimal positionings for given chemical equipment with methods from the field of Computational Intelligence. We compare and evaluate several knowledge-based and numerical approaches to optimize a plant layout under given constraints. Up to now, prior knowledge has not been used for sub-symbolic optimization, and ideas from knowledge-based optimization should be transferred into Computational Intelligence. This knowledge is extracted from plans provided by engineers.

Selected Publications

Morik, Katharina and Schmidt-Traub, Henner and Hicking, Bernd and Köpcke, Hanna and Mierswa, Ingo. Layout optimization for chemical plants. In Industriemanagement, 2005.
Mierswa, Ingo. Incorporating Fuzzy Knowledge into Fitness: Multiobjective Evolutionary 3D Design of Process Plants. In Proceedings of the Genetic and Evolutionary Computation Conference GECCO 2005, Washington D.C., USA, 2005.


Awake

  • Duration: 04/2001 - 12/2003 (BMBF)
  • Project Leader: Fraunhofer Institute for Media Communication
  • Staff: Michael Wurst, Katharina Morik
  • URL: http://awake.imk.fhg.de

The aim of the project Awake is to explore how implicit knowledge structures in different communities of experts can be discovered, visualised and employed for semantic navigation of information spaces and construction of new knowledge. The developed methods combine semantic text analysis with Machine Learning and interfaces for visualising relationships and creating new knowledge structures. Application scenarios include automatic generation of personalised knowledge portals, collaborative semantic exploration of complex information spaces and construction of shared ontology networks for the Semantic Web. The real-world testbed and context of development is the Internet platform netzspannung.org that aims at establishing a knowledge portal connecting digital art, culture and information technology.

Selected Publications

Novak, Jasminko and Wurst, Michael. Supporting Knowledge Creation and Sharing in Communities Based on Mapping Implicit Knowledge. In Journal of Universal Computer Science, Vol. 10, No. 3, pages 235--251, 2004.
Wurst, Michael and Novak, Jasminko. Knowledge Sharing in Heterogeneous Expert Communities based on Personal Taxonomies. In ECAI Workshop on Agent Mediated Knowledge Management, 2004.
Novak, Jasminko and Wurst, Michael. Discovering, Visualizing and Sharing Knowledge through Personalized Learning Knowledge Maps. In Agent Mediated Knowledge Management, 2003.
Novak, Jasminko and Wurst, Michael. Supporting Communities of Practice Through Personalisation and Collaborative Structuring based on Capturing Implicit Knowledge. In Proceedings of the International Conference on Knowledge Management, 2003.
Morik, Katharina and Wurst, Michael. Knowledge Discovery and Knowledge Visualization, Perspektiven vernetzter Wissensraeume, Workshop 2002. 2002.

Mining Mart

  • Duration: 01/2000 - 02/2003 (EU)
  • Project Leader: Katharina Morik
  • Staff: Katharina Morik, Martin Scholz, Timm Euler, Harald Liedtke
  • URL:http://mmart.cs.uni-dortmund.de

Within the data mining process, considerable time is spent on preprocessing the data. Practical experience has shown that preprocessing can take from 50% up to 80% of the entire data mining process when using traditional attribute-value learners. That is why preprocessing is the key issue in data analysis. The time is spent on:

  • Choosing the learning task
  • Sampling
  • Feature generation, extraction, and selection
  • Data cleaning
  • Model selection or tuning the hypothesis space
  • Defining appropriate evaluation criteria

Experienced users can apply any learning system successfully to any application, since they prepare the data well. The representation of examples and the choice of a sample determine the applicability of learning methods. A chain of data transformations (learning steps or manual preprocessing) delivers the desired result. Experienced users remember prototypical successful transformation/learning chains.

Selected Publications

Euler, Timm. Publishing Operational Models of Data Mining Case Studies. In Proceedings of the Workshop on Data Mining Case Studies at the 5th IEEE International Conference on Data Mining (ICDM), pages 99--106, Houston, Texas, USA, 2005.
Euler, Timm. Modelling Data Mining Processes on a Conceptual Level. In Proceedings of the 5th International Conference on Decision Support for Telecommunications and Information Society, Warsaw, Poland, 2005.
Morik, Katharina and Scholz, Martin. The MiningMart Approach to Knowledge Discovery in Databases. In Ning Zhong and Jiming Liu (editors), Intelligent Technologies for Information Analysis, pages 47--65, Springer, 2004.
Kietz, Jörg-Uwe and Vaduva, Anca and Zücker, Regina. MiningMart: Metadata-Driven Preprocessing. In Proceedings of the ECML/PKDD Workshop on Database Support for KDD, 2001.
Kietz, Jörg-Uwe and Vaduva, Anca and Zücker, Regina. Mining Mart: Combining Case-Based-Reasoning and Multi-Strategy Learning into a Framework to reuse KDD-Application. In Proceedings of the 5th International Workshop on Multistrategy Learning, R.S. Michalski and P. Brazdil (editors), 2000.
Morik, Katharina. The Representation Race - Preprocessing for Handling Time Phenomena. In Proceedings of the European Conference on Machine Learning, Barcelona, Spain, Springer, 2000.


COMRIS

The COMRIS project aims to develop, demonstrate and experimentally evaluate a scalable approach to integrating the Inhabited Information Spaces schema with a concept of software agents. The COMRIS vision of co-habited mixed-reality information spaces emphasizes the co-habitation of software and human agents in a pair of closely coupled spaces, a virtual and a real one. However, this project does not pursue the perceptual integration of real and virtual space into an augmented reality. Instead the coupling aims at focusing the large potential for useful social interactions in each of the spaces, so that they become more manageable, goal-directed and effective.

Selected Publications

Cranefield, Stephen and Haustein, Stefan and Purvis, Martin. UML-Based Ontology Modelling for Software Agents. In Proceedings of the Autonomous Agents 2001 Workshop on Ontologies in Agent Systems, 2001.
Haustein, Stefan. Semantic Web Languages: RDF vs. SOAP Serialization. In Proceedings of the Second International Workshop on the Semantic Web at WWW10, 2001.
Haustein, Stefan. Utilising an Ontology Based Repository to Connect Web Miners and Application Agents. In Proceedings of the ECML/PKDD Workshop on Semantic Web Mining, 2001.
Haustein, Stefan and Lüdecke, Sascha and Schwering, Christian. The Knowledge Agency. In Proceedings of the Fourth International Conference on Autonomous Agents, pages 205--206, ACM SIGART, Barcelona, Spain, ACM Press, New York, 2000.
Haustein, Stefan and Lüdecke, Sascha. Towards Information Agent Interoperability. In Cooperative Information Agents IV -- The Future of Information Agents in Cyberspace, Vol. 1860, pages 208--219, Boston, USA, Springer, 2000.
Morik, Katharina and Haustein, Stefan. The Challenge of Discovering Meta-Data. In Proceedings of the Seventeenth National Conference on Artificial Intelligence, American Association for Artificial Intelligence (AAAI), AAAI Press, 2000.


BLearn II

  • Duration: 9/1992 - 8/1995 (EU)
  • Project Leader: University of Karlsruhe
  • Staff: Volker Klingspor, Katharina Morik, Anke Rieger

Within the BLearn II project, machine learning methods are applied to robotics in order to reduce the time for setting up and modifying robot applications and to make the operation of robots more user-friendly. The task of Chair VIII within this project is to integrate logic-based learning into navigation. The goal is to allow a human user to give abstract commands, such as "Pass through the doorway, turn left and stop". In order to execute these commands, the robot has to be able to recognize, for example, a door or a cupboard. In addition, the robot has to be able to find a door and to execute a left turn in a flexible way, adjusting itself to the different spatial conditions. A hierarchy of learning steps has been developed, which starts from sensor data and robot moves, and which leads to operational concepts. They integrate information about perceptions and actions, such that object recognition and action are coupled directly.

Selected Publications

Morik, Katharina and Klingspor, Volker and Kaiser, Michael (editors). Making Robots Smarter -- Combining Sensing and Action through Robot Learning. Kluwer Academic Press, 1999.
Klingspor, Volker and Morik, Katharina and Rieger, Anke. Learning Concepts from Sensor Data of a Mobile Robot. In Machine Learning, Vol. 23, No. 2/3, pages 305-332, 1996.
Klingspor, Volker and Demiris, J. and Kaiser, Michael. Human-Robot-Communication and Machine Learning. In Applied Artificial Intelligence, Vol. 11, No. 7/8, pages 719--746, 1997.
Klingspor, Volker and Morik, Katharina. Towards Concept Formation Grounded on Perception and Action of a Mobile Robot. In U. Rembold and R. Dillmann and L.O. Hertzberger and T. Kanade (editors), IAS--4, Proc. of the 4th Intern. Conference on Intelligent Autonomous Systems, pages 271--278, Amsterdam, IOS Press, 1995.