Visual Anomaly Detection in Event Sequence Data

Abstract

Anomaly detection is a common analytical task that aims to identify rare cases that differ from the typical cases that make up the majority of a dataset. Anomaly detection takes many forms, but it is frequently applied to the analysis of event sequence data in addressing real-world problems such as error diagnosis, fraud detection, and vital sign monitoring. With general event sequence data, however, the task of anomaly detection can be complex because the sequential and temporal nature of such data results in diverse definitions and flexible forms of anomalies. This, in turn, increases the difficulty of interpreting detected anomalies, a critical element in raising human confidence in the analysis results. In this paper, we propose an unsupervised anomaly detection algorithm based on Variational AutoEncoders (VAE). The model learns latent representations for all sequences in the dataset and detects anomalies that deviate from the overall distribution. Moreover, the model can estimate an underlying normal progression for each given sequence, represented as occurrence probabilities of events along the sequence progression. Events in violation of their occurrence probability (i.e., events that occur despite a small occurrence probability, and events that are absent despite a large occurrence probability) are identified as abnormal. We also introduce a visualization system, EventThread3, to support interactive exploration of the analysis result. The system facilitates interpretation of anomalies within the context of normal sequence progressions in the dataset through comprehensive one-to-many sequence comparison. Finally, we quantitatively evaluate the performance of our anomaly detection algorithm, demonstrate the effectiveness of our system through case studies in three different application domains, and report feedback collected from study participants and expert users.
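
The event-level rule described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `flag_abnormal_events`, the thresholds `low` and `high`, the event names, and the hard-coded probabilities are all hypothetical, and in the proposed approach the occurrence probabilities would come from the trained VAE rather than being supplied by hand.

```python
from typing import Dict, List, Set, Tuple


def flag_abnormal_events(
    observed: List[Set[str]],
    probs: List[Dict[str, float]],
    low: float = 0.1,   # assumed cutoff for a "small" occurrence probability
    high: float = 0.9,  # assumed cutoff for a "large" occurrence probability
) -> List[Tuple[int, str, str]]:
    """Return (step, event, kind) triples for events that violate their probability.

    observed[t] is the set of events seen at step t; probs[t][e] is the
    estimated probability that event e occurs at step t.
    """
    if len(observed) != len(probs):
        raise ValueError("observed and probs must cover the same steps")

    flagged = []
    for step, (events, step_probs) in enumerate(zip(observed, probs)):
        # Iterate in sorted order so the result is deterministic.
        for event, p in sorted(step_probs.items()):
            if event in events and p < low:
                # The event occurred although the model considers it unlikely.
                flagged.append((step, event, "unexpected"))
            elif event not in events and p > high:
                # The event is absent although the model considers it likely.
                flagged.append((step, event, "missing"))
    return flagged


# Hypothetical three-step sequence with hand-written probabilities.
observed = [{"admit"}, {"fever"}, {"discharge"}]
probs = [
    {"admit": 0.95, "fever": 0.02},
    {"fever": 0.05, "lab_test": 0.92},
    {"discharge": 0.80},
]
result = flag_abnormal_events(observed, probs)
```

In this example, the second step is flagged twice: "fever" occurs despite a low estimated probability, and "lab_test" is absent despite a high one. Fixed thresholds are only one possible choice; how the cutoffs are set is left open here.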

Publication
IEEE Big Data