
Monika Wysoczańska May 06, 2021 - Artificial Intelligence

Anomaly detection in time series


Authors: Monika Wysoczańska, Paweł Kotowski. Image source: [4]


In the previous post we gave a brief introduction to anomaly detection, reviewed some potential real-life applications, and outlined common Machine Learning techniques used for this task. In this post we cover anomaly detection for one specific type of data: time series.


The data generated by many applications comes from a continuous temporal process and is therefore acquired and represented as a series of events. In such cases the temporal component plays a key role in outlier analysis. This scenario arises in many applications, such as sensor data, mechanical system diagnosis, medical data, network intrusion data or financial posts.

Let’s look at a simple example. Imagine you have a device and you monitor its CPU usage. As a result you obtain data like that in the plot below.



As you can see, there is an event with suspiciously high CPU usage. This may indicate a problem that occurred while the device was running. Detecting such anomalous events can therefore be crucial for maintaining your device.
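
To make this concrete, here is a minimal sketch of how such a spike could be flagged automatically. It is not the method behind the plot above: the readings in `cpu_usage` are invented for illustration, and we assume a simple robust rule, the modified z-score based on the median and the median absolute deviation (MAD), with the commonly used cut-off of 3.5.

```python
from statistics import median


def point_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme value barely affects the scale.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half of the values are identical: no usable scale.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical CPU usage readings in percent (illustrative data only).
cpu_usage = [12.0, 14.5, 13.2, 15.1, 12.8, 96.4, 13.9, 14.2, 12.5, 13.7]

print(point_outliers(cpu_usage))
```

On this toy series only the reading at index 5 (96.4%) is flagged. A rule like this compares each value with the whole series, so it finds global spikes; it does not use the order of the readings at all.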


Types of outliers in time series

Due to the temporal character of time series data, outliers can be divided mainly into three groups [3]:

  • Point outliers: a point outlier is a data point that behaves unusually at a specific time instant when compared either to the other values in the time series or to its neighbouring points (a local outlier). The plot from our device monitoring example shows an example of such a local outlier.

  • Subsequence outliers: in this case the anomaly is a run of consecutive points in time whose joint behaviour is unusual, although each observation individually is not necessarily a point outlier. For our device monitoring example, you can see such a case in the plot below.



  • Outlier time series: an entire time series can also be an outlier. This, however, can only be detected when we look at multiple features, that is, when the input is a multivariate time series.
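
The difference between point and subsequence outliers can be sketched in a few lines. The example below is only an illustration under our own assumptions: the data is invented, and we use one simple technique (comparing sliding-window means with those of a known-normal `baseline` recording, with a cut-off of 3 standard deviations). The names `window_means` and `subsequence_outliers` are hypothetical helpers, not part of any library.

```python
from statistics import mean, pstdev


def window_means(values, window):
    """Mean of every run of `window` consecutive values."""
    return [
        mean(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]


def subsequence_outliers(baseline, monitored, window=3, threshold=3.0):
    """Return start indices of windows in `monitored` whose mean lies more
    than `threshold` standard deviations from the typical window mean
    observed in `baseline`."""
    reference = window_means(baseline, window)
    ref_mean, ref_std = mean(reference), pstdev(reference)
    if ref_std == 0:
        return []
    return [
        i for i, m in enumerate(window_means(monitored, window))
        if abs(m - ref_mean) / ref_std > threshold
    ]


# Invented CPU usage readings: normal behaviour oscillates between 10 and 30.
baseline = [10.0, 20.0, 30.0, 20.0] * 6

# Every value below stays inside the normal 10-30 range, but the run of 28s
# (indices 8 to 13) is unusual as a whole.
monitored = (
    [10.0, 20.0, 30.0, 20.0] * 2
    + [28.0] * 6
    + [10.0, 20.0, 30.0, 20.0]
)

print(subsequence_outliers(baseline, monitored, window=3))
```

Here the flagged windows start at indices 8 to 11, which together cover exactly the sustained run of 28s, even though no single reading leaves the range seen in the baseline. The window length and the threshold are tuning choices that depend on the problem.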


This brings us to another aspect of anomaly detection in time series: the type of input data. Based on it, we can divide the applied approaches as follows [3]:

  • Univariate: we measure only one variable and base our decision on it, as in our simple device monitoring example, where we only looked at the CPU usage.

  • Multivariate: we record multiple such variables in parallel. This, however, adds the challenge of analyzing the relationships between the measured variables.


With many features the situation becomes more complicated: there can be outliers that do not appear as outlying observations when each dimension is considered separately, so a univariate criterion will not detect them. All the features therefore need to be considered together using a multivariate approach.
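
A small sketch shows why. We assume two hypothetical, strongly correlated measurements (CPU usage and temperature, with invented values) and score new observations with the Mahalanobis distance, which takes the correlation between the variables into account. This is just one possible multivariate criterion, chosen here for illustration.

```python
from math import sqrt


def fit_2d(points):
    """Estimate means, variances and covariance of 2-D reference data."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return mean_x, mean_y, var_x, var_y, cov_xy


def univariate_z_scores(point, params):
    """Absolute z-score of each coordinate, ignoring the other one."""
    mean_x, mean_y, var_x, var_y, _ = params
    return (
        abs(point[0] - mean_x) / sqrt(var_x),
        abs(point[1] - mean_y) / sqrt(var_y),
    )


def mahalanobis(point, params):
    """Mahalanobis distance of `point` from the reference distribution,
    using the closed-form inverse of the 2x2 covariance matrix."""
    mean_x, mean_y, var_x, var_y, cov_xy = params
    dx, dy = point[0] - mean_x, point[1] - mean_y
    det = var_x * var_y - cov_xy ** 2
    d2 = (var_y * dx * dx - 2 * cov_xy * dx * dy + var_x * dy * dy) / det
    return sqrt(max(d2, 0.0))


# Invented reference data: (CPU usage in %, temperature), moving together.
reference = [
    (10.0, 11.0), (20.0, 19.0), (30.0, 31.0), (40.0, 39.0), (50.0, 51.0),
    (60.0, 59.0), (70.0, 71.0), (80.0, 79.0), (90.0, 91.0),
]
params = fit_2d(reference)

typical = (30.0, 31.0)     # follows the usual relationship
suspicious = (30.0, 70.0)  # both values are in range, but they disagree

for name, point in (("typical", typical), ("suspicious", suspicious)):
    print(name, univariate_z_scores(point, params), mahalanobis(point, params))
```

For the suspicious observation both per-variable z-scores stay below 1, so neither variable looks unusual on its own, while its Mahalanobis distance is far above that of the typical observation. Only the joint view reveals that low CPU usage together with a high temperature breaks the usual relationship.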


In practice, we are usually given multiple features in which to search for anomalous behaviour, although this depends heavily on the particular problem. In some cases we may be given hundreds of such variables to analyze, but be allowed to use only a few of them in production. This problem arises especially in computationally constrained environments. In such cases a robust feature selection method is crucial for choosing the final set of features.
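
As a rough illustration of what a first feature selection step might look like, the sketch below applies a generic filter: it drops near-constant features and any feature that is almost perfectly correlated with one already kept. This is a simple, widely used heuristic and not necessarily what a production system would use; the feature names, the data and both thresholds are assumptions made for this example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def select_features(features, min_variance=1e-6, max_abs_corr=0.95):
    """Keep features that vary and are not redundant with a kept feature.

    `features` maps a feature name to its list of measurements; features
    are considered in insertion order, so earlier ones win ties.
    """
    kept = []
    for name, values in features.items():
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        if variance < min_variance:
            continue  # near-constant: carries no information
        if any(abs(pearson(values, features[k])) > max_abs_corr for k in kept):
            continue  # redundant with a feature we already keep
        kept.append(name)
    return kept


# Hypothetical device measurements (illustrative values only).
features = {
    "cpu_usage": [12.0, 15.0, 11.0, 40.0, 13.0, 14.0, 38.0, 12.0],
    "cpu_temp": [41.0, 44.0, 40.0, 69.0, 42.0, 43.0, 67.0, 41.0],
    "fan_flag": [1.0] * 8,
    "disk_io": [5.0, 9.0, 3.0, 4.0, 8.0, 2.0, 7.0, 6.0],
}

print(select_features(features))
```

On this toy data the filter keeps `cpu_usage` and `disk_io`: `cpu_temp` moves in lockstep with `cpu_usage`, and `fan_flag` never changes. A filter like this only removes obviously uninformative or redundant inputs; whether the remaining features are actually useful for detecting anomalies still has to be validated on the problem at hand.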


In this post we outlined the most important aspects of anomaly detection in time series data. In the next one, we will introduce some specific applications of outlier detection methods in the automotive domain and show how we leverage Machine Learning to solve this task at Robotec.ai.


References:

[2] Charu C. Aggarwal. 2016. Outlier Analysis (2nd ed.). Springer Publishing Company, Incorporated.

[3] A review on outlier/anomaly detection in time series data: https://arxiv.org/abs/2002.04236

