MiningMart -- Case-Based Support for Database Preprocessing (Extract, Transform, Load)

The Knowledge Discovery in Databases (KDD) process consists of a number of phases before and after the core phase of training a machine learning algorithm (data mining). Data needed for learning must first be accessed, assembled, described, understood and transformed. After mining, the deployment of the learned model to new data must be supported; the new data must therefore go through the same process. The aim of research in this area is to model this process in such a way that users with little experience in the field can apply KDD.
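The requirement that new data pass through the same process as the training data can be illustrated with a minimal sketch. It assumes a preprocessing chain stored once as an ordered list of steps; the step functions, field names and records below are hypothetical and are not part of the MiningMart system or its meta model.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def drop_missing_age(rows: List[Row]) -> List[Row]:
    """Remove records whose 'age' field is missing (hypothetical step)."""
    return [r for r in rows if r.get("age") is not None]


def add_age_group(rows: List[Row]) -> List[Row]:
    """Derive a coarse 'age_group' attribute from 'age' (hypothetical step)."""
    return [{**r, "age_group": "senior" if r["age"] >= 60 else "adult"} for r in rows]


# The preprocessing chain is stored once, as an ordered list of steps.
PREPROCESSING_CHAIN: List[Step] = [drop_missing_age, add_age_group]


def apply_chain(rows: List[Row], chain: List[Step]) -> List[Row]:
    """Run every step of the chain, in order, on the given records."""
    for step in chain:
        rows = step(rows)
    return rows


# Made-up records standing in for database tables.
training_data = [{"id": 1, "age": 34}, {"id": 2, "age": None}, {"id": 3, "age": 71}]
new_data = [{"id": 4, "age": 65}, {"id": 5, "age": 20}]

# Before mining: prepare the training data.
prepared_training = apply_chain(training_data, PREPROCESSING_CHAIN)

# At deployment: the new data goes through exactly the same chain.
prepared_new = apply_chain(new_data, PREPROCESSING_CHAIN)
```

Keeping the chain as data rather than as ad hoc scripts is what makes it possible to re-run, document and share a preprocessing case.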


SFB 475 subproject A4


MiningMart system
RapidMiner (YALE)


Euler, Timm
Klinkenberg, Ralf
Köpcke, Hanna
Scholz, Martin

Past Master Thesis


Euler/2006a Timm Euler. Data Mining mit MiningMart. In Programmieren unter Linux, No. 1, pages 56--60, 2006.
Euler/2006b Timm Euler. Modeling Preparation for Data Mining Processes. In Journal of Telecommunications and Information Technology, No. 4, pages 81--87, 2006.
Mierswa/etal/2006a Mierswa, Ingo and Wurst, Michael and Klinkenberg, Ralf and Scholz, Martin and Euler, Timm. YALE: Rapid Prototyping for Complex Data Mining Tasks. In Tina Eliassi-Rad and Lyle H. Ungar and Mark Craven and Dimitrios Gunopulos (editors), Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), pages 935--940, ACM, New York, USA, ACM Press, 2006.
Euler/2005a Timm Euler. Publishing Operational Models of Data Mining Case Studies. In Proceedings of the Workshop on Data Mining Case Studies at the 5th IEEE International Conference on Data Mining (ICDM), pages 99--106, Houston, Texas, USA, 2005.
Euler/2005d Timm Euler. Modelling Data Mining Processes on a Conceptual Level. In Proceedings of the 5th International Conference on Decision Support for Telecommunications and Information Society, Warsaw, Poland, 2005.
Euler/Scholz/2004a Euler, Timm and Scholz, Martin. Using Ontologies in a KDD Workbench. In Buitelaar, P. and Franke, J. and Grobelnik, M. and Paaß, G. and Svatek, V. (editors), Workshop on Knowledge Discovery and Ontologies at ECML/PKDD '04, pages 103--108, Pisa, Italy, 2004.
Morik/Koepcke/2004a Morik, Katharina and Köpcke, Hanna. Analysing Customer Churn in Insurance Data - A Case Study. In Jean-François Boulicaut and Floriana Esposito and Fosca Giannotti and Dino Pedreschi (editors), PKDD '04: Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases, Vol. 3202, pages 325--336, New York, NY, USA, Springer, 2004.
Morik/Scholz/2004a Morik, Katharina and Scholz, Martin. The MiningMart Approach to Knowledge Discovery in Databases. In Ning Zhong and Jiming Liu (editors), Intelligent Technologies for Information Analysis, pages 47--65, Springer, 2004.
Euler/etal/2003a Euler, Timm and Morik, Katharina and Scholz, Martin. MiningMart: Sharing Successful KDD Processes. In Hotho, Andreas and Stumme, Gerd (editors), LLWA 2003 -- Tagungsband der GI-Workshop-Woche Lehren -- Lernen -- Wissen -- Adaptivität, pages 121--122, 2003.
Morik/etal/2003a Morik, Katharina and Scholz, Martin and Euler, Timm. MiningMart Final Report. No. D20.4, IST Project MiningMart, IST-11993, 2003.
Morik/etal/2003b Morik, Katharina and Scholz, Martin and Euler, Timm. Ext-MM Final Report. No. D20.5, IST Project MiningMart, IST-11993, 2003.
Morik/Rueping/2002a Morik, Katharina and Rüping, Stefan. A Multistrategy Approach to the Classification of Phases in Business Cycles. In Elomaa, Tapio and Mannila, Heikki and Toivonen, Hannu (editors), Machine Learning: ECML 2002, Vol. 2430, pages 307--318, Berlin, Springer, 2002.
Morik/etal/2001b Morik, Katharina and Botta, Marco and Dittrich, Klaus R. and Kietz, Jörg-Uwe and Portinale, Luigi and Vaduva, Anca and Zucker, Regina. M4 -- The MiningMart Meta Model. No. D8/9, IST Project MiningMart, IST-11993, 2001.
Morik/2000a Morik, Katharina. The Representation Race - Preprocessing for Handling Time Phenomena. In Ramon López de Mántaras and Enric Plaza (editors), ECML '00: Proceedings of the 11th European Conference on Machine Learning, Vol. 1810, pages 4--19, Berlin, Heidelberg, New York, Springer, 2000.
Morik/Liedtke/2000a Morik, Katharina and Liedtke, Harald. Learning about Time. No. D3, IST Project MiningMart, IST-11993, 2000.
Morik/Brockhausen/97a Morik, Katharina and Brockhausen, Peter. A Multistrategy Approach to Relational Knowledge Discovery in Databases. In Machine Learning Journal, Vol. 27, No. 3, pages 287--312, Kluwer, 1997.