Bounded-wait combining: constructing robust and high-throughput shared objects

Hendler, Danny; Kutten, Shay

doi:10.1007/s00446-009-0078-4

Bounded-wait combining: constructing robust and high-throughput shared objects

Published: 31 January 2009

Volume 21, pages 405–431, (2009)
Cite this article

Distributed Computing Aims and scope Submit manuscript

Danny Hendler¹ &
Shay Kutten²

We’re sorry, something doesn't seem to be working properly.

Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.

Abstract

Shared counters are among the most basic coordination structures in distributed computing. Known implementations of shared counters are either blocking, non-linearizable, or have a sequential bottleneck. We present the first counter algorithm that is both linearizable, non-blocking, and can provably achieve high throughput in k-synchronous executions—executions in which process speeds vary by at most a constant factor k. The algorithm is based on a novel variation of the software combining paradigm that we call bounded-wait combining (BWC). It can thus be used to obtain implementations, possessing the same properties, of any object that supports combinable operations, such as a stack or a queue. Unlike previous combining algorithms where processes may have to wait for each other indefinitely, in the BWC algorithm, a process only waits for other processes for a bounded period of time and then “takes destiny in its own hands”. In order to reason rigorously about the parallelism attainable by our algorithm, we define a novel metric for measuring the throughput of shared objects, which we believe is interesting in its own right. We use this metric to prove that our algorithm achieves throughput of Ω(N/ log N) in k-synchronous executions, where N is the number of processes that can participate in the algorithm. Our algorithm uses two tools that we believe may prove useful for obtaining highly parallel non-blocking implementation of additional objects. The first are “synchronous locks”, locks that are respected by processes only in k-synchronous executions and are disregarded otherwise; the second are “pseduo-transactions”—a weakening of regular transactions that allows higher parallelism.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

¥17,985 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Institutional subscriptions

Long-lived counters with polylogarithmic amortized step complexity

Article 29 November 2022

Lower Bounds on the Amortized Time Complexity of Shared Objects

Article 12 July 2024

Distributed Universality

Article 19 August 2015

References

Aspnes J., Herlihy M., Shavit N.: Counting networks. J. ACM 41(5), 1020–1048 (1994)
Article MATH MathSciNet Google Scholar
Attiya H., Dagan A.: Improved implementations of binary universal operations. J. ACM 48(5), 1013–1037 (2001)
Article MathSciNet Google Scholar
Attiya, H., Guerraoui, R., Kouznetsov, P.: Computing with reads and writes in the absence of step contention. In: Proceedings of the 19th International Symposium on Distributed Computing (DISC’05) (2005)
Dwork C., Herlihy M., Waarts O.: Contention in shared memory algorithms. J. ACM 44(6), 779–805 (1997)
Article MATH MathSciNet Google Scholar
Ellen, F., Lev, Y., Luchangco, V., Moir, M.: Snzi: scalable nonzero indicators. In: Proceedings of the 27th Annual ACM Symposium on Principles of Distributed Computing (PODC) (2007)
Fich, F.E., Hendler, D., Shavit, N.: Linear lower bounds on real-world implementations of concurrent objects. In: Proceedings of the 46th Annual Symposium on Foundations of Computer Science (FOCS) (2005)
Goodman, J.R., Vernon, M.K., Woest, P.J.: Efficent synchronization primitives for large-scale cache-coherent multiprocessors. In: ASPLOS, pp. 64–75 (1989)
Gottlieb, A., Grishman, R., Kruskal, C.P., McAuliffe, K.P., Rudolph, L., Snir, M.: The nyu ultracomputer designing a mimd, shared-memory parallel machine. In: ISCA ’98: 25 years of the international symposia on Computer architecture (selected papers), pp. 239–254, New York, NY, USA. ACM Press, New York (1998)
Greenwald, M.: Two-handed emulation: how to build non-blocking implementations of complex data-structures using dcas. In: PODC ’02: Proceedings of the twenty-first annual symposium on Principles of distributed computing, pp. 260–269, New York, NY, USA. ACM Press, New York (2002)
Ha, P.H., Papatriantafilou, M., Tsigas, P.: Self-tuning reactive distributed trees for counting and balancing. In: OPODIS, pp. 213–228 (2004)
Hendler, D., Kutten, S.: Bounded-wait combining: Constructing robust and high-throughput shared objects. In: Proceedings of the 20th International Symposium on Distributed Computing (DISC’06), pp. xx–xx (2006)
Hendler, D., Shavit, N., Yerushalmi, L.: A scalable lock-free stack algorithm. In: SPAA ’04: Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures, pp. 206–215, New York, NY, USA. ACM Press, New York (2004)
Herlihy M.: Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13(1), 124–149 (1991)
Article Google Scholar
Herlihy M.: A methodology for implementing highly concurrent objects. ACM Trans. Program. Lang. Syst. 15(5), 745–770 (1993)
Article Google Scholar
Herlihy M., Shavit N., Waarts O.: Linearizable counting networks. Distributed Comput. 9(4), 193–203 (1996)
Article Google Scholar
Herlihy M., Wing J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)
Article Google Scholar
Kruskal C.P., Rudolph L., Snir M.: Efficient synchronization on multiprocessors with shared memory. ACM Trans. Program. Lang. Syst. 10(4), 579–601 (1988)
Article MATH Google Scholar
Moir, M., Nussbaum, D., Shalev, O., Shavit, N.: Using elimination to implement scalable and lock-free fifo queues. In: SPAA’05: Proceedings of the 17th annual ACM symposium on Parallelism in algorithms and architectures, pp. 253–262, NY, USA. ACM Press, New York (2005)
Shavit N., Zemach A.: Diffracting trees. ACM Trans. Comput. Syst. 14(4), 385–428 (1996)
Article Google Scholar
Shavit N., Zemach A.: Combining funnels: a dynamic approach to software combining. J. Parallel Distributed Comput. 60(11), 1355–1387 (2000)
Article MATH Google Scholar
Wattenhofer R., Widmayer P.: The counting pyramid: an adaptive distributed counting scheme. J. Parallel Distributed Comput. 64(4), 449–460 (2004)
Article MATH Google Scholar
Yew P.-C., Tzeng N.-F., Lawrie D.H.: Distributing hot-spot addressing in large-scale multiprocessors. IEEE Trans. Comput. 36(4), 388–395 (1987)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Department of Computer Science, Ben-Gurion University, Beer-Sheva, Israel
Danny Hendler
Faculty of Industrial Engineering and Management, Technion, Haifa, Israel
Shay Kutten

Authors

Danny Hendler
View author publications
You can also search for this author in PubMed Google Scholar
Shay Kutten
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Danny Hendler.

Additional information

A preliminary version of this paper appeared in [11].

D. Hendler is supported in part by a grant from the Israel Science Foundation. S. Kutten is supported in part by a grant from the Israel Science Foundation.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Hendler, D., Kutten, S. Bounded-wait combining: constructing robust and high-throughput shared objects. Distrib. Comput. 21, 405–431 (2009). https://doi.org/10.1007/s00446-009-0078-4

Download citation

Received: 19 July 2007
Accepted: 16 December 2008
Published: 31 January 2009
Issue Date: March 2009
DOI: https://doi.org/10.1007/s00446-009-0078-4

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

¥17,985 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Institutional subscriptions

Bounded-wait combining: constructing robust and high-throughput shared objects

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Long-lived counters with polylogarithmic amortized step complexity

Lower Bounds on the Amortized Time Complexity of Shared Objects

Distributed Universality

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Bounded-wait combining: constructing robust and high-throughput shared objects

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Long-lived counters with polylogarithmic amortized step complexity

Lower Bounds on the Amortized Time Complexity of Shared Objects

Distributed Universality

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation