Finding Heavy Hitters over the Sliding Window of a Weighted Data Stream | SpringerLink
Skip to main content

Finding Heavy Hitters over the Sliding Window of a Weighted Data Stream

  • Conference paper
LATIN 2008: Theoretical Informatics (LATIN 2008)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4957))

Included in the following conference series:

Abstract

We study the problem of identifying items with heavy weights in the sliding window of a weighted data stream. We give a deterministic algorithm that solves the problem within error bound ε, uses \(O(\frac{R}{\epsilon})\) space and supports \(O(\frac{R}{\epsilon})\) query and update times. Here, R is the maximum item weight. We also show that the space can be reduced substantially in practice by showing for any c > 0, we can construct an \(O(\frac{c \log R}{\epsilon^2})\)-space algorithm, which returns correct answers provided that the ratio between the total weights of any two adjacent sliding windows is not greater than c. We also give a randomized algorithm that solves the problem with success probability 1 − δ using \(O(\frac{1}{\epsilon^2}\log R \log D \log \frac{\log D}{\delta\epsilon})\) space where D is the number of distinct items in the data stream.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Arasu, A., Manku, G.S.: Approximate counts and quantiles over sliding windows. In: Symposium on Principles of Database Systems (PODS), pp. 286–296 (2004)

    Google Scholar 

  2. Cormode, G., Muthukrishnan, S.: An improved data stream summary: The count-min sketch and its applications. Journal of Algorithms 55(1), 58–75 (2005)

    Article  MATH  MathSciNet  Google Scholar 

  3. Datar, M., Gionis, A., Indyk, P., Motwani, R.: Maintaining stream statistics over sliding windows. In: Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 635–644 (2002)

    Google Scholar 

  4. Demaine, E.D., López-Ortiz, A., Munro, J.I.: Frequency estimation of internet packet streams with limited space. In: European Symposium on Algorithms (ESA), pp. 348–360 (2002)

    Google Scholar 

  5. Ganguly, S., Majumder, A.: CR-precis: A deterministic summary structure for update data streams. In: Chen, B., Paterson, M., Zhang, G. (eds.) ESCAPE 2007. LNCS, vol. 4614, pp. 48–59. Springer, Heidelberg (2007)

    Chapter  Google Scholar 

  6. Gibbons, P.B., Tirthapura, S.: Distributed streams algorithms for sliding windows. In: ACM Symposium on Parallel Algorithms and Architectures (SPAA), pp. 63–72 (2002)

    Google Scholar 

  7. Golab, L., DeHaan, D., López-Ortiz, A., Demaine, E.D.: Finding frequent items in sliding windows with multinomially-distributed item frequencies. In: International Conference on Scientific and Statistical Database Management (SSDBM), pp. 425–426 (2004)

    Google Scholar 

  8. Karp, R.M., Shenker, S., Papadimitriou, C.H.: A simple algorithm for finding frequent elements in streams and bags. ACM Transactions on Database Systems (TODS) 28(1), 51–55 (2003)

    Article  Google Scholar 

  9. Lee, L.K., Ting, H.F.: Maintaining significant stream statistics over sliding windows. In: Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms (SODA), pp. 724–732 (2006)

    Google Scholar 

  10. Lee, L.K., Ting, H.F.: A simpler and more efficient deterministic scheme for finding frequent items over sliding windows. In: Symposium on Principles of Database Systems (PODS), pp. 290–297 (2006)

    Google Scholar 

  11. Lin, X., Lu, H., Xu, J., Yu, J.X.: Continuously maintaining quantile summaries of the most recent N elements over a data stream. In: International Conference on Data Engineering (ICDE), pp. 362–374 (2004)

    Google Scholar 

  12. Manku, G.S., Motwani, R.: Approximate frequency counts over data streams. In: Very Large Data Bases (VLDB) Conference, pp. 346–357 (2002)

    Google Scholar 

  13. Misra, J., Gries, D.: Finding repeated elements. Science of Computer Programming 2, 143–152 (1982)

    Article  MATH  MathSciNet  Google Scholar 

  14. Xu, J., Lin, X., Zhou, X.: Space Efficient Quantile Summary for Constrained Sliding Windows on a Data Stream. In: Li, Q., Wang, G., Feng, L. (eds.) WAIM 2004. LNCS, vol. 3129, pp. 34–44. Springer, Heidelberg (2004)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Eduardo Sany Laber Claudson Bornstein Loana Tito Nogueira Luerbio Faria

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hung, R.Y.S., Ting, H.F. (2008). Finding Heavy Hitters over the Sliding Window of a Weighted Data Stream. In: Laber, E.S., Bornstein, C., Nogueira, L.T., Faria, L. (eds) LATIN 2008: Theoretical Informatics. LATIN 2008. Lecture Notes in Computer Science, vol 4957. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78773-0_60

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-78773-0_60

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78772-3

  • Online ISBN: 978-3-540-78773-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics