Algorithms for Segmenting Time Series

Authors

1 School of Engineering, Damghan University, Damghan, Iran.

2 School of Computer Engineering and Information Technology, Shahrood University, Shahrood, Iran.

Abstract

As with most computer science problems, representation of the data is the key to ecient and e ective solutions. Piecewise linear representation has been used for the representation of the data. This representation has been used by various researchers to support clustering, classi cation, indexing and association rule mining of time series data. A variety of algorithms have been proposed to obtain this representation, with several algorithms having been independently rediscovered several times. In this paper, we examine the techniques and then introduce the best-known algorithm.

Keywords


1. R. Agrawal, C. Faloutsos, A. Swami, E_cient similarity search in sequence databases, Proceedings of the 4th Conference on Foundations of Data Organization and Algorithms, 69-84 (1993).
2. E. Keogh, K. Chakrabarti, M. Pazzani, Dimensionality reduction for fast similarity search in large time series databases, Journal of Knowledge and Information Systems, 3(3), 263-286 (2000).
3. K. Chan, W. Fu, E_cient time series matching by wavelets, Proceedings of the 15th IEEE International Conference on Data Engineering, (1999).
4. R. Agrawal, K.I. Lin, H.S. Sawhney, K. Shim, Fast similarity search in the presence of noise, scaling, and translation in times-series databases, Proceedings of 21th International Conference on Very Large Data Bases, 490-500 (1995).
5. X. Ge and P. Smyth, Segmental Semi-Markov Models for Endpoint Detection in Plasma Etching, IEEE Transactions on Semiconductor Engineering, (2001).
6. G. Das, K. Lin, H. Mannila, G. Renganathan, P. Smyth, Rule discovery from time series, Proceedings of the 3th International Conference of Knowledge Discovery and Data Mining, 16-22 (1998).
7. C. Perng, H. Wang, S. Zhang, S. Parker, Landmarks: a new model for similarity-based pattern querying in time series databases, Proceedings of 16th International Conference on Data Engineering, (2000).
8. P.C. Chang, C.Y. Fan, C.H. Liu, Integrating a piecewise linear representation method and a neural network model for stock trading points prediction, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 39 (1), 80-92 (2009).
9. H. Shatkay, Approximate Queries and Representations for Large Data Sequences, Technical Report in Department of Computer Science, Brown University (1995).
10. H. Shatkay and S. Zdonik, Approximate queries and representations for large data sequences, Proceedings of the 12th IEEE International Conference on Data Engineering, 546-553 (1996).
11. E. Keogh and M. Pazzani, An enhanced representation of time series which allows fast and accurate classi_cation, clustering and relevance feedback, Proceedings of the 41th International Conference of Knowledge Discovery and Data Mining, 239 241(1998).
12. H. Wu, B. Salzberg, D. Zhang, Online event-driven subsequence matching over financial data streams, in: Proceeding of the SIGMOD, Stream Management, 23-34 (2004).
13. C. Li, P. Yu, V. Castelli , A framework for mining sequence database at multiple abstraction levels, Proceedings of the 9th International Conference on Information and Knowledge Management, 267-272 (1998).
14. E. Keogh, S. Chu, D. Hart, M. Pazzani, An online algorithm for segmenting time series, in: Proceedings of the IEEE International Conference on Data Mining, 289-296 (2001).
15. E. Keogh, M. Pazzani, Relevance feedback retrieval of time series data, Proceedings of the 22th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, 183-190 (1999).
16. V. Lavrenko, M. Schmill, D. Lawrie, P. Ogilvie, D. Jensen, J. Allan, Mining of concurrent text and time series, Proceedings of the 61th International Conference on Knowledge Discovery and Data Mining, 37-44 (2000).
17. N. Sugiura, R.T. Ogden, Testing change- points with linear trend communications , Simulation and Computation, 23, 287-322 (1994).
18. P.S. Heckbert, M. Garland, Survey of polygonal surface simpli_cation algorithms, Multiresolution Surface Modeling Course, Proceedings of the 24th International Conference on Computer Graphics and Interactive Techniques, 1-24 (1997).
19. E. Keogh, S. Chu, D. Hart, M. Pazzani, Segmenting Time Series: A Aurvey and Novel Approach, Data Mining in Time Series Databases, World Scientific Publishing Company, (2004).
20. A. Koski, M. Juhola, M. Meriste, Syntactic recognition of ECG signals by attributed finite automata, Pattern Recognition, 28 (12), 1927-1940 (1995).
21. H.J.L.M. Vullings, M.H.G. Verhaegen, H.B. Verbruggen, ECG segmentation using time- warping. Proceedings of the 2th International Symposium on Intelligent Data Analysis, 275-285 (1997).
22. D.H. Douglas, T.K. Peucker, Algorithms for the reduction of the number of points required to represent a digitized line or its caricature, Canadian Cartographer, 10(2), 112-122 (1973).
23. U. Ramer, An iterative procedure for the polygonal approximation of planar curves, Computer Graphics and Image Processing, 244-256 (1972).