Title | Concept Drift Detection with Data Summarization |
---|---|
Description |
Concept drift detection using data summarization
- What should a comparison function for two summaries look like?
- What properties must a summary satisfy so that concept drift can be detected?
Concept Drift Detection with Data Summarization
Monitoring plays a major role in many real-world systems that run for long periods of time without interruption, such as server management or goods production. To ensure long uptimes, these systems have to be maintained while in production, which leads to the central question: What is the current status of the system? Does everything run as expected, or is something changing so that we need to intervene in parts of the system?
Sensors monitoring certain aspects of the system can be used to determine its current status. Concept drift describes the change of these sensor measurements over time. For example, when a sensor breaks we may observe a sudden change in its measurements, whereas the wear of a machine results in a more subtle change in the sensor data.
In this bachelor thesis we want to explore the possibility of using data summarization techniques to detect concept drift. Data summarization techniques extract a small but expressive summary of a data set in an online setting where measurements are performed continuously. Thus, the overall goal is to use small summaries to detect concept drift instead of looking at all measurements. Among others, we want to tackle the following questions:
- Are summaries enough to detect concept drift?
- How can we compare two summaries with each other?
- What properties should a summary fulfil so that we can detect concept drift?
- "Streaming Submodular Maximization: Massive Data Summarization on the Fly" by Badanidiyuru et al., 2014 (https://las.inf.ethz.ch/files/badanidiyuru14streaming.pdf)
- "Beyond 1/2-Approximation for Submodular Maximization on Massive Data Streams" by Norouzi-Fard et al., 2018 (http://proceedings.mlr.press/v80/norouzi-fard18a/norouzi-fard18a.pdf)
- "Maintenance for Case Streams: A Streaming Approach to Competence-Based Deletion" by Zhang et al., 2017 (https://link.springer.com/chapter/10.1007/978-3-319-61030-6_29)
- "Characterizing concept drift" by Webb et al., 2016 (https://link.springer.com/content/pdf/10.1007%2Fs10618-015-0448-4.pdf) |
Second Tutor | Buschjäger, Sebastian |
Status | Reserved |
---|
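The question of how two summaries could be compared can be illustrated with a minimal sketch. It is not taken from the listed papers: the function names `greedy_summary`, `summary_distance` and `drift_detected` are hypothetical, the farthest-point selection is only a simple stand-in for the submodular streaming algorithms in the literature, and the threshold is an arbitrary assumption chosen for the synthetic data.

```python
import math
import random


def greedy_summary(window, k):
    """Select up to k representatives (k >= 1) from a non-empty window.

    Assumption: farthest-point greedy selection, used here only as a
    simple placeholder for a real data summarization algorithm.
    """
    summary = [window[0]]
    while len(summary) < min(k, len(window)):
        # Add the point whose nearest summary element is farthest away.
        farthest = max(window, key=lambda x: min(math.dist(x, s) for s in summary))
        summary.append(farthest)
    return summary


def summary_distance(a, b):
    """Symmetric average nearest-neighbour distance between two summaries."""
    def one_way(src, dst):
        return sum(min(math.dist(x, y) for y in dst) for x in src) / len(src)
    return 0.5 * (one_way(a, b) + one_way(b, a))


def drift_detected(reference, current, k=10, threshold=5.0):
    """Flag drift when the summaries of two windows are far apart.

    The threshold is a hypothetical value and would need tuning on real data.
    """
    ref_summary = greedy_summary(reference, k)
    cur_summary = greedy_summary(current, k)
    return summary_distance(ref_summary, cur_summary) > threshold


# Synthetic 2-D "sensor" windows: uniform noise around a centre value.
rng = random.Random(0)

def sample(center, n=200):
    return [(center + rng.uniform(-1, 1), center + rng.uniform(-1, 1)) for _ in range(n)]

reference = sample(0.0)   # behaviour considered normal
stable = sample(0.0)      # same distribution, new measurements
shifted = sample(10.0)    # sudden change, e.g. a broken sensor

print(drift_detected(reference, stable))   # False
print(drift_detected(reference, shifted))  # True
```

In this sketch only 10 points per window are compared instead of all 200 measurements. Whether such a comparison also catches gradual changes, such as slow wear of a machine, depends on the summary and the distance function, which is exactly what the questions above leave open.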