Abstract
With the growth of event-driven applications, event stream processing has received increasing attention in the database community. However, little work has addressed data mining and similarity analysis over event streams. Efficient similarity search is a prerequisite for mining tasks such as detecting frequent or abnormal event patterns. In this paper, we take a first step toward similarity search over vast collections of event streams. We propose a simple but effective model to improve the efficiency of similarity search. To avoid redundant pairwise comparisons, we use the notion of sharing extent to filter out dissimilar event streams and speed up the similarity calculation. Extensive simulation experiments demonstrate that our model and algorithm achieve higher efficiency while maintaining the expected accuracy.
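The filter-then-verify idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formal definition of sharing extent, so everything below is an assumption made for illustration: each stream is reduced to its set of event types, the "shared count" is the number of event types two streams have in common, Jaccard similarity stands in for the paper's similarity measure, and the names `similar_pairs`, `min_shared`, and `threshold` are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets of event types (an assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similar_pairs(streams, min_shared, threshold):
    """Return (id_a, id_b, score) for pairs that pass the filter and the threshold.

    streams: dict mapping a stream id to an iterable of event types.
    min_shared: hypothetical cut-off on the number of shared event types.
    threshold: minimum Jaccard similarity for a pair to be reported.
    """
    type_sets = {sid: set(events) for sid, events in streams.items()}

    # Inverted index: event type -> ids of the streams that contain it.
    # Ids are appended in sorted order, so each list is sorted.
    index = defaultdict(list)
    for sid in sorted(type_sets):
        for etype in type_sets[sid]:
            index[etype].append(sid)

    # Count shared event types, touching only pairs that co-occur
    # in at least one index entry. Pairs sharing nothing never appear.
    shared = defaultdict(int)
    for sids in index.values():
        for i in range(len(sids)):
            for j in range(i + 1, len(sids)):
                shared[(sids[i], sids[j])] += 1

    # Verify only the candidates that survive the filter.
    results = []
    for (a, b), count in sorted(shared.items()):
        if count < min_shared:
            continue
        score = jaccard(type_sets[a], type_sets[b])
        if score >= threshold:
            results.append((a, b, score))
    return results


if __name__ == "__main__":
    demo = {
        "s1": ["login", "click", "buy"],
        "s2": ["login", "click", "logout"],
        "s3": ["error", "retry"],
    }
    print(similar_pairs(demo, min_shared=2, threshold=0.5))
```

In this sketch, streams with no event type in common are never compared at all, which is where the saving over exhaustive pairwise comparison comes from. The `min_shared` cut-off is a heuristic here: a pair of very short streams could exceed the Jaccard threshold while sharing fewer than `min_shared` types, so the cut-off has to be chosen with the threshold in mind.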
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, Y., Yu, G., Zhang, T., Yue, D., Gu, Y., Hu, X. (2009). Effective Similarity Analysis over Event Streams Based on Sharing Extent. In: Li, Q., Feng, L., Pei, J., Wang, S.X., Zhou, X., Zhu, QM. (eds) Advances in Data and Web Management. APWeb/WAIM 2009. Lecture Notes in Computer Science, vol 5446. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00672-2_28
DOI: https://doi.org/10.1007/978-3-642-00672-2_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00671-5
Online ISBN: 978-3-642-00672-2
eBook Packages: Computer Science (R0)