A framework for integration of real-time anomaly detection algorithms in event-driven software systems

Problem

Typical use cases for real-time anomaly detection algorithms are event-driven systems, in which events model occurrences in the real world. IoT devices, for example, produce massive amounts of events. A statistical deviation in an event attribute, or the occurrence of a certain event itself, could signal safety-critical circumstances. The identification of these anomalies depends on user-defined rules, patterns, or the detection algorithm used. Anomaly detection algorithms are mostly designed by data scientists or mathematicians with only basic knowledge of software engineering principles. Therefore, the design of these algorithms focuses on increasing precision or recall, while attributes like fault tolerance, recoverability, maintainability, and availability usually receive little attention. Furthermore, the algorithms are written and published in whichever programming language the data scientist prefers.
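
To make the notion of a statistical deviation in an event attribute concrete, the following is a minimal sketch in Java of a running z-score check on a single numeric attribute. It is an illustration only, not one of the algorithms the thesis is meant to integrate; the class name RunningZScoreDetector, the threshold in standard deviations, and the warm-up length are assumptions chosen for the example.

```java
/**
 * Illustrative only: flags a value that lies more than `threshold` standard
 * deviations from the running mean of all values seen before it.
 * Mean and variance are maintained with Welford's online algorithm.
 */
final class RunningZScoreDetector {
    private final double threshold; // assumed cut-off, in standard deviations
    private final int warmUp;       // assumed number of values to observe before flagging
    private long count;
    private double mean;
    private double m2;              // sum of squared deviations from the running mean

    RunningZScoreDetector(double threshold, int warmUp) {
        this.threshold = threshold;
        this.warmUp = warmUp;
    }

    /** Scores the value against the history so far, then adds it to the history. */
    boolean isAnomalous(double value) {
        boolean anomalous = false;
        if (count >= warmUp && count > 1) {
            double stdDev = Math.sqrt(m2 / (count - 1)); // sample standard deviation
            // A constant history (stdDev == 0) never flags in this simple sketch.
            anomalous = stdDev > 0 && Math.abs(value - mean) / stdDev > threshold;
        }
        // Welford update: fold the new value into the running statistics.
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
        return anomalous;
    }
}
```

With a threshold of 3.0 and a warm-up of 5, a reading of 35.0 that follows the readings 20.0, 20.5, 19.5, 20.2, 19.8, and 20.1 is flagged, while none of those six readings is. Note that this sketch also folds flagged values into the running statistics, which a production detector might prefer to avoid.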

Goal

The implementation complexity of anomaly detection algorithms for event-driven systems should be reduced by constructing a framework that gives data scientists an easy way to integrate their algorithms and rules. The framework should be evaluated using three existing open-source anomaly detection algorithms.

Tasks

  • Requirements analysis: e.g. identification of a common strategy or pattern in anomaly detection algorithms, quality requirements for the framework, architectural requirements for the framework
  • Architecture design: based on the requirements analysis, e.g. use of a complex event processing (CEP) engine to apply algorithms to a specified subset of events, management of anomaly detection rules, management of anomaly detection models
  • Prototype implementation including three different anomaly detection algorithms
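
The architecture idea in the task list, applying detectors only to a specified subset of events and managing the rules that select them, can be sketched in plain Java as follows. This is only an illustration of the plug-in shape such a framework might take, not a proposed design: all names (SensorEvent, AnomalyDetector, DetectionRule, DetectionPipeline) are hypothetical, and a simple Predicate stands in for the event selection that a CEP engine such as Drools would perform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical event model: a type name plus numeric attributes. */
final class SensorEvent {
    final String type;
    final Map<String, Double> attributes;

    SensorEvent(String type, Map<String, Double> attributes) {
        this.type = type;
        this.attributes = attributes;
    }
}

/** Hypothetical plug-in contract that an algorithm author would implement. */
interface AnomalyDetector {
    boolean isAnomalous(SensorEvent event);
}

/** A rule pairs an event selector with the detector to run on matching events. */
final class DetectionRule {
    final String name;
    final Predicate<SensorEvent> selector;
    final AnomalyDetector detector;

    DetectionRule(String name, Predicate<SensorEvent> selector, AnomalyDetector detector) {
        this.name = name;
        this.selector = selector;
        this.detector = detector;
    }
}

/** Applies every registered rule to each incoming event. */
final class DetectionPipeline {
    private final List<DetectionRule> rules = new ArrayList<>();

    void register(DetectionRule rule) {
        rules.add(rule);
    }

    /** Removes all rules with the given name; returns true if any was removed. */
    boolean unregister(String ruleName) {
        return rules.removeIf(rule -> rule.name.equals(ruleName));
    }

    /** Returns the names of all rules whose selector matched and whose detector flagged the event. */
    List<String> process(SensorEvent event) {
        List<String> triggered = new ArrayList<>();
        for (DetectionRule rule : rules) {
            if (rule.selector.test(event) && rule.detector.isAnomalous(event)) {
                triggered.add(rule.name);
            }
        }
        return triggered;
    }
}
```

Because AnomalyDetector has a single method, a detector can be registered as a lambda, for example a rule that selects events of type "temperature" and flags those whose "value" attribute exceeds a fixed limit. In a real framework the detector would more likely wrap one of the existing open-source algorithms, and the fault tolerance and recoverability concerns named in the problem statement would sit around this core.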

Involved Technologies

  • Required: Java
  • Optional: Python, Kafka (Consumer, Producer, Processor), Complex Event Processing (Drools)

Literature

  • Golmohammadi, S. K. (2016). Time series contextual anomaly detection for detecting stock market manipulation. Doctoral dissertation, University of Alberta.
  • Dunning, T., & Ertl, O. (2019). Computing Extremely Accurate Quantiles Using t-Digests. (Implementation: https://github.com/tdunning/t-digest)
  • Ahmad, S., Lavin, A., Purdy, S., & Agha, Z. (2017). Unsupervised real-time anomaly detection for streaming data. (Implementation: https://github.com/smirmik/CAD)

  • Type: Master Thesis
  • Status: Open

Supervisor