Distributed Data Mining

Distributed computing plays an important role in the Data Mining process for two reasons. First, Data Mining often requires large amounts of storage space and computation time. To make systems scalable, it is important to develop mechanisms that distribute the workload flexibly among several sites. Second, data is often inherently distributed across several databases, which makes centralized processing of this data very inefficient and prone to security risks. Distributed Data Mining explores techniques for applying Data Mining in a non-centralized way.
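
As a rough illustration of the non-centralized idea, the following minimal sketch computes a global mean and variance from per-site summaries, so that no raw records have to leave a site. It is not taken from any software listed on this page; the names LocalSummary, summarize, and merge are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence, Tuple


@dataclass(frozen=True)
class LocalSummary:
    """Sufficient statistics that a single site shares (hypothetical type)."""
    count: int
    total: float
    total_sq: float


def summarize(values: Sequence[float]) -> LocalSummary:
    """Runs at each site: reduce the local raw data to three numbers."""
    return LocalSummary(
        count=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def merge(summaries: Iterable[LocalSummary]) -> Tuple[float, float]:
    """Runs at a coordinator: combine the summaries into the global
    mean and population variance without seeing any raw record."""
    collected = list(summaries)
    n = sum(s.count for s in collected)
    if n == 0:
        raise ValueError("no data at any site")
    mean = sum(s.total for s in collected) / n
    # Population variance via E[x^2] - (E[x])^2.
    variance = sum(s.total_sq for s in collected) / n - mean * mean
    return mean, variance


# Three sites, each holding a different part of the data.
site_summaries = [
    summarize([1.0, 2.0, 3.0]),
    summarize([4.0, 5.0]),
    summarize([6.0]),
]
global_mean, global_variance = merge(site_summaries)
```

Only three numbers per site cross the network, regardless of how many records each site holds. Statistics that do not decompose into such sums generally need more elaborate protocols.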

Software

RapidMiner Distributed Data Mining Plugin

Staff

Bockermann, Christian
Schowe, Benjamin
Stolpe, Marco
Wurst, Michael

Past Master's Theses

Publications

Bhaduri/Stolpe/2013a Bhaduri, Kanishka and Stolpe, Marco. Distributed Data Mining in Sensor Networks. In Aggarwal, Charu C. (editor), Managing and Mining Sensor Data, Berlin, Heidelberg, Springer, 2013.
Stolpe/etal/2011a Stolpe, Marco and Morik, Katharina and Konrad, Benedikt and Lieber, Daniel and Deuse, Jochen. Challenges for Data Mining on Sensor Data of Interlinked Processes. In Proceedings of the Next Generation Data Mining Summit (NGDM) 2011, 2011.
Morik/etal/2009a May, Michael and Berendt, Bettina and Cornuejols, Antoine and Gama, Joao and Giannotti, Fosca and Hotho, Andreas and Malerba, Donato and Menasalvas, Ernestina and Morik, Katharina and Pedersen, Rasmus and Saitta, Lorenza and Saygin, Yucel and Schuster, Assaf and Vanhoof, Koen. Research Challenges in Ubiquitous Knowledge Discovery. In Kargupta and Han and Yu and Motwani and Kumar (editors), Next Generation of Data Mining, pages 131--151, CRC Press, 2009.
Mierswa/etal/2008a Mierswa, Ingo and Morik, Katharina and Wurst, Michael. Handling Local Patterns in Collaborative Structuring. In Masseglia, Florent and Poncelet, Pascal and Teisseire, Maguelonne (editors), Successes and New Directions in Data Mining, pages 167--186, IGI Global, 2008.
Mierswa/etal/2008b Mierswa, Ingo and Morik, Katharina and Wurst, Michael. Collaborative Use of Features in a Distributed System for the Organization of Music Collections. In Shen and Shepherd and Cui and Liu (editors), Intelligent Music Information Systems: Tools and Methodologies, pages 147--176, IGI Global Publishing, 2008.
Wurst/2008a Michael Wurst. Distributed Collaborative Structuring -- A Data Mining Approach to Information Management in Loosely Coupled Domains. Fachbereich Informatik, Technische Universität Dortmund, 2008.
Flasch/etal/2007a Flasch, Oliver and Kaspari, Andreas and Morik, Katharina and Wurst, Michael. Nemoz - A Distributed Framework for Collaborative Media Organization. In Proceedings of the Third International Workshop on Distributed Frameworks for Multimedia Applications, 2007.
Flasch/etal/2007b Flasch, Oliver and Kaspari, Andreas and Morik, Katharina and Wurst, Michael. Aspect-Based Tagging for Collaborative Media Organization. In Berendt, Bettina and Hotho, Andreas and Mladenic, Dunja and Semeraro, Giovanni (editors), From Web to Social Web: Discovering and Deploying User and Content Profiles, Vol. 4737, pages 122--141, Springer, 2007.
Mierswa/etal/2007b Mierswa, Ingo and Morik, Katharina and Wurst, Michael. Collaborative Use of Features in a Distributed System for the Organization of Music Collections. In Shen and Shepherd and Cui and Liu (editors), Intelligent Music Information Systems: Tools and Methodologies, pages 147--175, Information Science Reference, 2007.
Wurst/2007b Wurst, Michael. Multi-Agent Learning by Distributed Feature Extraction. In Daniel Kudenko and Ann Nowe and Zahia Guessoum and Karl Tuyls (editors), Adaptive Agents and Multi-Agent Systems III, pages 239 --, Springer, 2007.
Wurst/Morik/2007a Wurst, Michael and Morik, Katharina. Distributed Feature Extraction in a P2P Setting - A Case Study. In Future Generation Computer Systems, Special Issue on Data Mining, Vol. 23, No. 1, pages 69--75, Amsterdam, The Netherlands, Elsevier Science Publishers B. V., 2007.
Mierswa/etal/2006a Mierswa, Ingo and Wurst, Michael and Klinkenberg, Ralf and Scholz, Martin and Euler, Timm. YALE: Rapid Prototyping for Complex Data Mining Tasks. In Tina Eliassi-Rad and Lyle H. Ungar and Mark Craven and Dimitrios Gunopulos (editors), Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), pages 935--940, ACM, New York, USA, ACM Press, 2006.
Wurst/etal/2006a Wurst, Michael and Morik, Katharina and Mierswa, Ingo. Localized Alternative Cluster Ensembles for Collaborative Structuring. In Johannes Fürnkranz and Tobias Scheffer and Myra Spiliopoulou (editors), Proceedings of the European Conference on Machine Learning, pages 485--496, Berlin, Springer, 2006.
Wurst/Morik/2006b Wurst, Michael and Morik, Katharina. Multi-Agent Learning By Feature Sharing. In Proceedings of the 6th European Symposium on Adaptive Learning Agents and MAS, 2006.
Wurst/Scholz/2006a Wurst, Michael and Scholz, Martin. Distributed Subgroup Discovery. In Johannes Fürnkranz and Tobias Scheffer and Myra Spiliopoulou (editors), Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-06), Vol. 4213, pages 421--433, Berlin, Germany, Springer, 2006.
Mierswa/Wurst/2005a Mierswa, Ingo and Wurst, Michael. Efficient Case Based Feature Construction for Heterogeneous Learning Tasks. No. CI-194/05, Collaborative Research Center 531, University of Dortmund, 2005.
Mierswa/Wurst/2005b Mierswa, Ingo and Wurst, Michael. Efficient Feature Construction by Meta Learning -- Guiding the Search in Meta Hypothesis Space. In Proc. of the International Conference on Machine Learning, Workshop on Meta Learning, 2005.
Mierswa/Wurst/2005c Mierswa, Ingo and Wurst, Michael. Efficient Case Based Feature Construction for Heterogeneous Learning Tasks. In Alipio Jorge and Luis Torgo and Pavel Brazdil and Rui Camacho and Joao Gama (editors), Proceedings of the European Conference on Machine Learning (ECML 2005), pages 641--648, Berlin, Springer, 2005.
Scholz/2005d Scholz, Martin. On the Tractability of Rule Discovery from Distributed Data. In Han, J. and Wah, B.W. and Raghavan, V. and Wu, X. and Rastogi, R. (editors), Proceedings of the 5th IEEE International Conference on Data Mining (ICDM '05), pages 761--764, Houston, Texas, USA, IEEE Computer Society, 2005.
Wurst/2001a Wurst, Michael. Application of Machine Learning Methods in a Multi-Agent System. Gerstner Laboratory, Czech Technical University, 2001.