Selected Projects
SFB 876
- Duration: since 01/2011 (DFG)
- Speaker: Prof. Dr. Katharina Morik
- URL: SFB 876
The collaborative research center SFB 876 brings together data mining and embedded systems. On the one hand, embedded systems can be further improved using machine learning. On the other hand, data mining algorithms can be realized in hardware, e.g. FPGAs, or run on GPGPUs. The restrictions of ubiquitous systems in computing power, memory, and energy demand new algorithms for known learning tasks. These resource-bounded learning algorithms may also be applied to extremely large databases on servers.
Selected Publications
| Lee/Wright/2013a |
Lee, Sangkyun and Wright, Stephen J.
Stochastic Subgradient Estimation Training for Support Vector Machines.
In
Latorre Carmona, Pedro and Sánchez, J. Salvador and Fred, Ana L.N. (editors),
Mathematical Methodologies in Pattern Recognition and Machine Learning,
Vol. 30,
pages 67--82,
Springer,
2013.
|
| Lee/etal/2012a |
Lee, Sangkyun and Stolpe, Marco and Morik, Katharina.
Separable Approximate Optimization of Support Vector Machines for Distributed Sensing.
In
Peter Flach and Tijl De Bie and Nello Cristianini (editors),
Machine Learning and Knowledge Discovery in Databases,
Vol. 7524,
pages 387--402,
Berlin, Heidelberg,
Springer,
2012.
|
| Stolpe/Morik/2011a |
Stolpe, Marco and Morik, Katharina.
Learning from Label Proportions by Optimizing Cluster Model Selection.
In
Gunopulos, Dimitrios and Hofmann, Thomas and Malerba, Donato and Vazirgiannis, Michalis (editors),
Machine Learning and Knowledge Discovery in Databases,
Vol. 6913,
pages 349--364,
Berlin, Heidelberg,
Springer,
2011.
|
INSIGHT - INtelligent Synthesis and Real-tIme Response using Massive StreaminG of HeTerogeneous Data
- Duration: 09/2012 - 08/2015
- Partners: National and Kapodistrian University of Athens (coordinator), TU Dortmund University, IBM, Technion - Israel Institute of Technology, Fraunhofer IAIS, Dublin City Council, German Federal Office of Civil Protection and Disaster Assistance (BBK)
- URL: http://www.insight-ict.eu/
The instrumentation of the world with diverse sensors, smart phones, and social networks yields exascale data that offers the potential of enhanced science and services. In particular, better societal management of the overall cycle of disaster monitoring and response becomes possible, citizens may now become involved in decision making and data acquisition (crowd-sourcing), and advanced planning can conserve resources. Current systems are limited in three important respects: (i) a lack of methods for handling heterogeneous data streams in real time, (ii) an absence of social computing integrated with big data analysis, and (iii) a lack of real-time prediction and alarm capabilities in the infrastructure for intelligent management.
The goal of the INSIGHT project is to radically advance our ability to cope with emergency situations in smart cities by developing innovative technologies, methodologies and systems that will put new capabilities in the hands of disaster planners and city personnel to improve emergency planning and response.
Selected Publications
| Liebig/etal/2013a |
T. Liebig and Z. Xu and M. May.
Incorporating Mobility Patterns in Pedestrian Quantity Estimation and Sensor Placement.
In
J. Nin and D. Villatoro (editors),
Proceedings of the First International Workshop on Citizen Sensor Networks CitiSens 2012, LNAI 7685,
pages 67--80,
Springer,
2013.
|
| Roesler/Liebig/2013a |
Rösler, Roberto and Liebig, Thomas.
Using Data from Location Based Social Networks for Urban Activity Clustering.
In
Vandenbroucke, Danny and Bucher, Bénédicte and Crompvoets, Joep (editors),
Geographic Information Science at the Heart of Europe,
pages 55--72,
Springer,
2013.
|
ViSTA-TV
- Start: 06/2012
- Partners: University of Zurich (coordinator), TU Dortmund University, Rapid-I GmbH, Zattoo Europa AG, Vrije Universiteit Amsterdam, BBC
- URL: Vista-TV.eu
Live video content is increasingly consumed over IP networks in addition to traditional broadcasting. The move to IP provides a huge opportunity to discover what people are watching in much greater breadth and depth than currently possible through interviews or set-top box based data gathering by rating organizations, because it allows direct analysis of consumer behavior via the logs they produce. The ViSTA-TV project proposes to gather consumers’ anonymized viewing behavior and the actual video streams from broadcasters/IPTV-transmitters, to combine them with enhanced electronic program guide information as the input for a holistic live-stream data mining analysis.
ViSTA-TV will employ the gathered information via a stream-analytics process to generate a high-quality linked open dataset (LOD) describing live TV programming. Combining the LOD with the behavioral information gathered, ViSTA-TV will be in the position to provide highly accurate market research information about viewing behavior that can be used for a variety of analyses of high interest to all participants in the TV-industry. ViSTA-TV will employ the information gathered to build a recommendation service that exploits both usage information and personalized feature extraction in conjunction with existing metadata to provide real-time viewing recommendations.
These results will be made possible by scientific progress in data-stream mining consisting of advances in data mining for tagging, recommendations, and behavioral analyses and temporal/probabilistic RDF-triple stream processing.
ViSTA-TV is a European Union-funded research project, beginning on 1 June 2012, and lasting for two years.
KobRA - Korpus-basierte linguistische Recherche und Analyse mit Hilfe von Data-Mining (Corpus-based linguistic research and analysis using data mining)
- Duration: 09/2012 - 08/2015
- Participants: Prof. Dr. Angelika Storrer, Prof. Dr. Katharina Morik, Prof. Dr. Erhard Hinrichs, Dr. Alexander Geyken, Dr. Marc Kupietz, Dr. Andreas Witt
- URL: KoBRA
Corpus-based linguistics has developed into an important area of language research in recent years. Infrastructure projects such as CLARIN provide extensive, structured language resources (text corpora, treebanks, lexical wordnets) that offer novel and attractive ways to investigate linguistic questions on authentic language-use data and to evaluate them quantitatively.
The goal of the project is to improve empirical linguistic work with structured language resources by applying innovative data mining methods, in particular machine learning methods.
DDMD Data Driven Material Development
This project aims to advance the systematic design of new materials through interdisciplinary collaboration between materials science and computer science. The new branch of science is called "Data Driven Materials Development". The field aims both to gain new discoveries and insights, e.g. about previously unknown phases or special physical properties of materials, and to accelerate the development of new materials. To this end, two materials research chairs at RUB, working on the synergistic use of experimental high-throughput methods and analytical modeling, collaborate with two computer science chairs at TU Dortmund University and the University of Duisburg-Essen, working on data mining and high-throughput analysis, respectively. This is necessary because systematic materials research, particularly in the areas of thin-film materials libraries, property screening, and "Advanced Materials Simulation", produces very large, high-dimensional data sets that can only be analyzed efficiently with novel data analysis methods and corresponding computing resources.
SFB 475 - Project A4
- Duration: since 07/1997 (DFG)
- Project Leader: Prof. Dr. Katharina Morik, Prof. Dr. Claus Weihs
- Staff: Thorsten Joachims, Stefan Rüping, Ralf Klinkenberg, Ingo Mierswa, Martin Scholz, Michael Wurst
- URL: SFB 475 - A4
The aim of project A4 is to combine statistical methods and machine learning methods in order to improve Knowledge Discovery in Databases (KDD). After the knowledge discovery process was examined as a whole in the last period, we focus on two important problems in the current period, both of which occur frequently in knowledge discovery practice: the analysis of temporal phenomena in the form of events and the application of experimental design. Research on these problems promises a special synergy effect from the combination of statistical and machine learning methods. Additionally, the project places emphasis on the applied analysis of real databases.
Selected Publications
|
Mierswa, Ingo and Morik, Katharina. Automatic Feature Extraction for
Classifying Audio Data. Machine Learning Journal, 58,
127-149, 2005.
|
|
Mierswa, Ingo and Wurst, Michael. Efficient Case Based Feature
Construction for Heterogeneous Learning Tasks. In Proceedings
of the European Conference on Machine Learning (ECML),
Springer-Verlag, Berlin, 641-648, 2005.
|
|
Morik, Katharina and Siebes, Arno and Boulicault, Jean-François (editors). Detecting
Local Patterns, Springer Lecture Notes in Artificial Intelligence,
Volume 3539, Springer-Verlag, Berlin, 2005.
|
|
Rüping, Stefan and Scheffer, Tobias (editors). Proceedings of the ICML
2005 Workshop on Learning with Multiple Views, 2005.
|
|
Scholz, Martin. Sampling-Based Sequential Subgroup
Mining. In Proceedings of the 11th ACM SIGKDD International
Conference on Knowledge Discovery in Databases (KDD), 265-274, 2005.
|
|
Klinkenberg, Ralf and Rüping, Stefan. Concept Drift and the Importance of Examples. In Franke, Jürgen and Nakhaeizadeh, Gholamreza and Renz, Ingrid (editors), Text Mining - Theoretical Aspects and Applications, pages 55--77, Physica-Verlag, Berlin, 2003.
|
|
Morik, Katharina and Rüping, Stefan. A Multistrategy Approach to
the Classification of Phases in Business Cycles. In Proceedings of the
European Conference on Machine Learning (ECML),
Springer-Verlag, 307-318, 2002.
|
|
Joachims, Thorsten. Estimating the Generalization Performance of a SVM Efficiently.
In Proceedings of the International Conference on Machine Learning (ICML),
Morgan Kaufmann, 431-438, 2000.
|
|
Joachims, Thorsten. Making Large-Scale SVM Learning Practical.
In: Advances in Kernel Methods - Support Vector Learning. MIT Press, 1999.
|
|
Joachims, Thorsten. Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the European Conference on Machine Learning (ECML),
Springer-Verlag, 137-142, 1998.
|
KDUbiq
- Duration: since 01/2006 (EU)
- Project Leader: Fraunhofer Institute for Intelligent Autonomous Systems
- Staff: Katharina Morik, Sebastian Land
- URL: http://www.kdubiq.org
KDUbiq brings together newly emerging research in ubiquitous knowledge
discovery. This multi-disciplinary approach constitutes a paradigm
shift for the field of knowledge discovery, since the idea of
standalone analysis tools is abandoned in favour of
process-integrated, distributed and autonomous analysis systems.
Selected Publications
| Morik/etal/2009a |
May, Michael and Berendt, Bettina and Cornuejols, Antoine and Gama, Joao and Giannotti, Fosca and Hotho, Andreas and Malerba, Donato and Menesalvas, Ernestina and Morik, Katharina and Pedersen, Rasmus and Saitta, Lorenza and Saygin, Yucel and Schuster, Assaf and Vanhoof, Koen.
Research Challenges in Ubiquitous Knowledge Discovery.
In
Kargupta and Han and Yu and Motwani and Kumar (editors),
Next Generation of Data Mining,
pages 131--151,
CRC Press,
2009.
|
SFB 531 - Project B5
- Duration: 01/2000 - 12/2002 (DFG)
- Project Leader: Prof. Dr. Katharina Morik
- Staff: Oliver Ritthoff, Ralf Klinkenberg, Ingo Mierswa
- URL: SFB 531 - B5
The goal of this project is the identification and formalization of practically relevant learning tasks on the basis of applications in the C-projects. Particular learning tasks that deviate from the standard scenario of classification or optimization will be considered, e.g. learning with non-factual knowledge, repeated learning of similar concepts, learning of temporally varying concepts, and feature selection/construction. In this context, the problem of feature selection/construction will be a central aspect of the investigations.
Selected Publications
|
Klinkenberg, Ralf. Learning Drifting Concepts: Example Selection vs. Example Weighting. In Intelligent Data Analysis (IDA), Special Issue on Incremental Learning Systems Capable of Dealing with Concept Drift, Vol. 8, No. 3, 2004.
|
|
Klinkenberg, Ralf and Rüping, Stefan. Concept Drift and the Importance of Examples. In Franke, Jürgen and Nakhaeizadeh, Gholamreza and Renz, Ingrid (editors), Text Mining -- Theoretical Aspects and Applications, pages 55-77, Berlin, Germany, Physica-Verlag, 2003.
|
|
Ritthoff, Oliver and Klinkenberg, Ralf. Evolutionary Feature Space Transformation using Type-Restricted Generators. In Cantu-Paz, E. et al. (editors), Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2003) - Part II, pages 1606-1607, Springer, 2003.
|
|
Ritthoff, Oliver and Klinkenberg, Ralf and Fischer, Simon and Mierswa, Ingo. A Hybrid Approach to Feature Selection and Generation Using an Evolutionary Algorithm. In Bullinaria, John A. (editor), Proceedings of the 2002 U.K. Workshop on Computational Intelligence (UKCI-02), pages 147-154, Birmingham, UK, University of Birmingham, 2002.
|
|
Klinkenberg, Ralf and Joachims, Thorsten. Detecting concept drift with support vector machines. In P. Langley (editor), Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pages 487-494. Morgan Kaufmann, San Francisco, CA, USA, 2000.
|
SFB 531 - Project C11
- Duration: 01/2003 - 12/2005 (DFG)
- Project Leader: Prof. Dr. Katharina Morik, Prof. Dr. Henner Schmidt-Traub
- Staff: Dipl.-Ing. Bernd Hicking, Dipl.-Inform. Hanna Köpcke, Dipl.-Inform. Ingo Mierswa, Dipl.-Inform. Oliver Ritthoff
- URL: SFB 531 - C11
The goal of this project is to find optimal positionings for given chemical equipment with methods from the field of Computational Intelligence. We compare and evaluate several knowledge-based and numerical approaches to optimize a plant layout under given constraints. So far, prior knowledge has not been used for sub-symbolic optimization, so ideas from knowledge-based optimization are to be transferred to Computational Intelligence. This knowledge is extracted from plans provided by engineers.
Selected Publications
|
Morik, Katharina and Schmidt-Traub, Henner and Hicking, Bernd and Köpcke, Hanna and
Mierswa, Ingo. Layout optimization for chemical plants. In
Industriemanagement, 2005.
|
|
Mierswa, Ingo. Incorporating Fuzzy Knowledge into Fitness:
Multiobjective Evolutionary 3D Design of Process Plants. In
Proceedings of the Genetic and Evolutionary Computation Conference
GECCO 2005, Washington D.C., USA, 2005.
|
AWAKE
- Duration: 04/2001 - 12/2003 (BMBF)
- Project Leader: Fraunhofer Institute for Media Communication
- Staff: Michael Wurst, Katharina Morik
- URL: http://awake.imk.fhg.de
The aim of the project Awake is to explore how implicit knowledge structures in different communities of experts can be discovered, visualised and employed for semantic navigation of information spaces and construction of new knowledge.
The developed methods combine semantic text analysis with Machine Learning and interfaces for visualising relationships and creating new knowledge structures. Application scenarios include automatic generation of personalised knowledge portals, collaborative semantic exploration of complex information spaces and construction of shared ontology networks for the Semantic Web.
The real-world testbed and context of development is the Internet platform netzspannung.org that aims at establishing a knowledge portal connecting digital art, culture and information technology.
Selected Publications
|
Novak, Jasminko and Wurst, Michael. Supporting Knowledge Creation and Sharing in Communities Based on Mapping Implicit Knowledge. In Journal of Universal Computer Science (J.UCS), Vol. 10, No. 3, pages 235--251, 2004.
|
|
Wurst, Michael and Novak, Jasminko. Knowledge Sharing in Heterogeneous Expert Communities based on Personal Taxonomies. In ECAI Workshop on Agent Mediated Knowledge Management, 2004.
|
|
Novak, Jasminko and Wurst, Michael. Discovering, Visualizing and Sharing Knowledge through Personalized Learning Knowledge Maps. In Agent Mediated Knowledge Management, 2003.
|
|
Novak, Jasminko and Wurst, Michael. Supporting Communities of Practice Through Personalisation and Collaborative Structuring based on Capturing Implicit Knowledge. In Proceedings of the International Conference on Knowledge Management, 2003.
|
|
Morik, Katharina and Wurst, Michael. Knowledge Discovery and Knowledge Visualization, Perspektiven vernetzter Wissensraeume, Workshop 2002. 2002.
|
Mining Mart
- Duration: 01/2000 - 02/2003 (EU)
- Project Leader: Katharina Morik
- Staff: Katharina Morik, Martin Scholz, Timm Euler, Harald Liedtke
- URL: http://mmart.cs.uni-dortmund.de
Within the data mining process, considerable time is spent on preprocessing the data. Practical experience has shown that preprocessing can take from 50% up to 80% of the entire data mining process when using traditional attribute-value learners. That is why preprocessing is the key issue in data analysis. This time is spent on:
- Choosing the learning task
- Sampling
- Feature generation, extraction, and selection
- Data cleaning
- Model selection or tuning the hypothesis space
- Defining appropriate evaluation criteria
Experienced users can apply any learning system successfully to any application, since they prepare the data well. The representation of examples and the choice of a sample determine the applicability of learning methods. A chain of data transformations (learning steps or manual preprocessing) delivers the desired result. Experienced users remember prototypical successful transformation/learning chains.
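The idea of a reusable chain of data transformations can be illustrated with a minimal Python sketch. It is not MiningMart's actual implementation: the step names (`drop_missing`, `take_every`, `add_ratio`), the attribute names `a` and `b`, and the toy data are all hypothetical, chosen only to show cleaning, sampling, and feature generation applied in sequence.

```python
from functools import reduce
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]
Step = Callable[[List[Row]], List[Row]]


def drop_missing(rows: List[Row]) -> List[Row]:
    """Data cleaning: discard rows that contain a missing (None) value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def take_every(k: int) -> Step:
    """Sampling: keep every k-th row (a deterministic stand-in for random sampling)."""
    def step(rows: List[Row]) -> List[Row]:
        return rows[::k]
    return step


def add_ratio(rows: List[Row]) -> List[Row]:
    """Feature generation: derive a new attribute from two existing ones."""
    return [{**r, "ratio": r["a"] / r["b"]} for r in rows]


def run_chain(rows: List[Row], chain: List[Step]) -> List[Row]:
    """Apply each transformation in order, feeding each output into the next step."""
    return reduce(lambda data, step: step(data), chain, rows)


# Hypothetical raw data with one incomplete example.
raw: List[Row] = [
    {"a": 1.0, "b": 2.0},
    {"a": None, "b": 4.0},
    {"a": 3.0, "b": 4.0},
    {"a": 5.0, "b": 10.0},
]

# The chain itself is plain data, so a successful chain can be stored and reused.
chain: List[Step] = [drop_missing, take_every(2), add_ratio]
prepared = run_chain(raw, chain)
```

Because the chain is an ordinary list of steps, a chain that worked well on one data set can be kept and applied to another, which mirrors the way experienced users remember prototypical transformation chains.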
Selected Publications
|
Euler, Timm. Publishing Operational Models of Data Mining Case Studies. In
Proceedings of the Workshop on Data Mining Case Studies at the 5th IEEE
International Conference on Data Mining (ICDM), pages 99--106, Houston,
Texas, USA, 2005.
|
|
Euler, Timm. Modelling Data Mining Processes on a Conceptual Level. In
Proceedings of the 5th International Conference on Decision Support for
Telecommunications and Information Society, Warsaw, Poland, 2005.
|
|
Morik, Katharina and Scholz, Martin. The MiningMart Approach to Knowledge
Discovery in Databases. In Ning Zhong and Jiming Liu (editors), Intelligent
Technologies for Information Analysis, pages 47--65, Springer, 2004.
|
|
Kietz, Jörg-Uwe and Vaduva, Anca and Zücker, Regina. MiningMart:
Metadata-Driven Preprocessing. In Proceedings of the ECML/PKDD Workshop on
Database Support for KDD, 2001.
|
|
Kietz, Jörg-Uwe and Vaduva, Anca and Zücker, Regina. Mining Mart: Combining
Case-Based-Reasoning and Multi-Strategy Learning into a Framework to reuse
KDD-Application. In Proceedings of the 5th International Workshop on
Multistrategy Learning, R.S. Michalski and P. Brazdil (editors), 2000.
|
|
Morik, Katharina. The Representation Race - Preprocessing for Handling Time
Phenomena. In Proceedings of the European Conference on Machine Learning,
Barcelona, Spain, Springer, 2000.
|
COMRIS
The COMRIS project aims to develop, demonstrate and experimentally evaluate a scalable approach to integrating the Inhabited Information Spaces schema with a concept of software agents. The COMRIS vision of co-habited mixed-reality information spaces emphasizes the co-habitation of software and human agents in a pair of closely coupled spaces, a virtual and a real one. However, this project does not pursue the perceptual integration of real and virtual space into an augmented reality. Instead the coupling aims at focusing the large potential for useful social interactions in each of the spaces, so that they become more manageable, goal-directed and effective.
Selected Publications
|
Cranefield, Stephen and Haustein, Stefan and Purvis, Martin. UML-Based
Ontology Modelling for Software Agents. In Proceedings of the Autonomous Agents 2001 Workshop on Ontologies in Agent Systems, 2001.
|
|
Haustein, Stefan. Semantic Web Languages: RDF vs. SOAP Serialization. In Proceedings of the Second International Workshop on the Semantic Web at WWW10, 2001.
|
|
Haustein, Stefan. Utilising an Ontology Based Repository to Connect Web Miners and Application Agents. In Proceedings of the ECML/PKDD Workshop on Semantic Web Mining, 2001.
|
|
Haustein, Stefan and Lüdecke, Sascha and Schwering, Christian. The Knowledge Agency. In Proceedings of the Fourth International Conference on Autonomous Agents, pages 205--206, ACM SIGART, Barcelona, Spain, ACM Press, New York, 2000.
|
|
Haustein, Stefan and Lüdecke, Sascha. Towards Information Agent Interoperability. In Cooperative Information Agents IV -- The Future of Information Agents in Cyberspace, Vol. 1860, pages 208--219, Boston, USA, Springer, 2000.
|
|
Morik, Katharina and Haustein, Stefan. The Challenge of Discovering Meta-Data. In Proceedings of the Seventeenth National Conference on Artificial Intelligence, American Association for Artificial Intelligence (AAAI), AAAI Press, 2000.
|
BLearn
- Duration: 9/1992 - 8/1995 (EU)
- Project Leader: University of Karlsruhe
- Staff: Volker Klingspor, Katharina Morik, Anke Rieger
- URL:
Within the project BLearn II machine learning methods are applied to robotics, in order to reduce the time for setting up and modifying robot applications, and in order to make the operation of robots more user-friendly.
The task of Chair VIII within this project is to integrate logic-based learning into navigation. The goal is to allow a human user to give abstract commands, such as "Pass through the doorway, turn left and stop". In order to execute these commands, the robot has to be able to recognize, for example, a door or a cupboard. In addition, the robot has to be able to find a door and to execute a left turn in a flexible way, adjusting itself to the different spatial conditions. A hierarchy of learning steps has been developed, which starts from sensor data and robot moves, and which leads to operational concepts. They integrate information about perceptions and actions, such that object recognition and action are coupled directly.
Selected Publications
|
Morik, Katharina and Klingspor, Volker and Kaiser, Michael (editors). Making Robots Smarter -- Combining Sensing and Action through Robot Learning. Kluwer Academic Press, 1999.
|
|
Klingspor, Volker and Morik, Katharina and Rieger, Anke. Learning Concepts from Sensor Data of a Mobile Robot. In Machine Learning, Vol. 23, No. 2/3, pages 305-332, 1996.
|
|
Klingspor, Volker and Demiris, J. and Kaiser, Michael. Human-Robot-Communication and Machine Learning. In Applied Artificial Intelligence, Vol. 11, No. 7/8, pages 719--746, 1997.
|
|
Klingspor, Volker and Morik, Katharina. Towards Concept Formation Grounded on Perception and Action of a Mobile Robot. In U. Rembold and R. Dillmann and L.O. Hertzberger and T. Kanade (editors), IAS--4, Proc. of the 4th Intern. Conference on Intelligent Autonomous Systems, pages 271--278, Amsterdam, IOS Press, 1995.
|