Toller Maximilian, Geiger Bernhard, Kern Roman
Distance-based classification is among the most competitive classification methods for time series data. The most critical component of distance-based classification is the selected distance function. Past research has proposed various different distance metrics or measures dedicated to particular aspects of real-world time series data, yet there is an important aspect that has not been considered so far: Robustness against arbitrary data contamination. In this work, we propose a novel distance metric that is robust against arbitrarily “bad” contamination and has a worst-case computational complexity of O(n log n). We formally argue why our proposed metric is robust, and demonstrate in an empirical evaluation that the metric yields competitive classification accuracy when applied in k-Nearest Neighbor time series classification.
Toller Maximilian, Kern Roman
The in-depth analysis of time series has gained a lot of research interest in recent years, with the identification of periodic patterns being one important aspect. Many of the methods for identifying periodic patterns require time series’ season length as input parameter. There exist only a few algorithms for automatic season length approximation. Many of these rely on simplifications such as data discretization. This paper presents an algorithm for season length detection that is designed to be sufficiently reliable to be used in practical applications. The algorithm estimates a time series’ season length by interpolating, filtering and detrending the data. This is followed by analyzing the distances between zeros in the directly corresponding autocorrelation function. Our algorithm was tested against a comparable algorithm and outperformed it by passing 122 out of 165 tests, while the existing algorithm passed 83 tests. The robustness of our method can be jointly attributed to both the algorithmic approach and also to design decisions taken at the implementational level.
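The idea of reading a season length off the zeros of the autocorrelation function can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes NumPy, removes only a linear trend, and omits the interpolation and filtering steps named in the abstract. The function name `estimate_season_length`, the restriction to the first half of the lags, and the rule "season length is roughly twice the median gap between zeros" are all assumptions made for this illustration.

```python
import numpy as np


def estimate_season_length(x):
    """Rough season-length estimate from autocorrelation zero crossings.

    Illustrative assumption, not the published algorithm: linear detrending
    only, and season length taken as twice the median distance between zeros.
    Returns None when no estimate can be made.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(n)

    # Detrend: subtract a least-squares straight line.
    slope, intercept = np.polyfit(t, x, 1)
    resid = x - (slope * t + intercept)

    # Autocorrelation for non-negative lags, normalised so that acf[0] == 1.
    acf = np.correlate(resid, resid, mode="full")[n - 1:]
    if np.isclose(acf[0], 0.0):
        return None  # (nearly) constant residual, nothing periodic to find
    acf = acf / acf[0]

    # Keep only the first half of the lags; later lags rest on few samples.
    acf = acf[: n // 2]

    # Indices i where the autocorrelation changes sign between i and i + 1.
    signs = np.sign(acf)
    crossings = np.where(signs[:-1] * signs[1:] < 0)[0]
    if len(crossings) < 2:
        return None

    # Linear interpolation gives a sub-sample position for each zero.
    zeros = crossings + acf[crossings] / (acf[crossings] - acf[crossings + 1])

    # A periodic autocorrelation crosses zero twice per period.
    return int(round(2 * np.median(np.diff(zeros))))


if __name__ == "__main__":
    t = np.arange(400)
    series = np.sin(2 * np.pi * t / 20 + 0.3) + 0.01 * t  # period 20 plus trend
    print(estimate_season_length(series))
```

Taking the median of the gaps, rather than the mean, keeps a few spurious or missed zero crossings from dominating the estimate. Noisy real-world data would likely need the filtering step that this sketch leaves out.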