Abstract
Highly-available datastores are widely deployed for Internet-based applications. However, many of these applications need more than the simple data access interface that such datastores provide. Distributed transaction support is demanded by applications such as the massive online payment services run by Alipay, PayPal, or Baidu Wallet. Current solutions for distributed transactions can spend more than half of the total transaction processing time in distributed commit. The culprits are the multiple write-ahead logging steps and communication roundtrips in the commit process. This paper presents the HACommit protocol, a logless one-phase commit protocol for highly-available datastores. HACommit has transaction participants vote for a commit before the client decides to commit or abort the transaction; in comparison, the state-of-the-art practice for distributed commit is to have the client decide before participants vote. This change enables the removal of both the participant-side and the coordinator-side write-ahead logging steps from the distributed commit process. It also means that, once the client initiates the transaction commit, the transaction data becomes visible to other transactions within one communication roundtrip time (i.e., one phase). In extensive experiments, HACommit outperforms recent atomic commit solutions for highly-available datastores under different workloads. In the best case, HACommit commits in one fifth of the time taken by the widely used two-phase commit (2PC).
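The vote-before-decide ordering described above can be illustrated with a toy, single-process sketch. This is not the HACommit implementation: the names `Participant`, `execute`, `decide`, and `run_transaction` are hypothetical, and the sketch assumes each participant can tell which operation is its last one in the transaction and attaches its vote to that reply. Replication, failure handling, and concurrency control are omitted entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Toy participant; all names here are illustrative assumptions."""
    name: str
    can_commit: bool = True
    store: dict = field(default_factory=dict)    # committed, visible data
    pending: dict = field(default_factory=dict)  # buffered writes per transaction

    def execute(self, txn_id, writes, last_op=False):
        """Buffer writes; on the last operation, return a vote with the reply."""
        self.pending.setdefault(txn_id, {}).update(writes)
        return self.can_commit if last_op else None

    def decide(self, txn_id, commit):
        """Apply the client's decision; no separate voting round is needed."""
        writes = self.pending.pop(txn_id, {})
        if commit:
            self.store.update(writes)  # data becomes visible here


def run_transaction(txn_id, plan):
    """plan is a list of (participant, writes) pairs. Returns True on commit."""
    # Execution: votes arrive together with the replies to the last operations.
    votes = [p.execute(txn_id, writes, last_op=True) for p, writes in plan]
    # The client decides only after it already holds every vote.
    commit = all(votes)
    # Commit: a single round of decision messages, one per participant.
    for p, _ in plan:
        p.decide(txn_id, commit)
    return commit
```

For example, with two participants that both vote yes, `run_transaction("t1", [(a, {"x": 1}), (b, {"y": 2})])` returns `True` and the writes appear in each `store`; if any participant votes no, every buffered write is discarded. In two-phase commit, by contrast, the vote collection would be a separate round after the client asks to commit.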
Acknowledgements
This work is supported in part by the State Key Development Program for Basic Research of China (Grant No. 2014CB340402), the National Key R&D Program of China (No. 2016YFB1000201), the National Natural Science Foundation of China (Grant Nos. 61303054 and 61420106013), and the Youth Innovation Promotion Association of the Chinese Academy of Sciences.
Cite this article
Zhu, Y., Yu, P.S., Yi, G. et al. Logless one-phase commit made possible for highly-available datastores. Distrib Parallel Databases 38, 101–126 (2020). https://doi.org/10.1007/s10619-019-07261-2
DOI: https://doi.org/10.1007/s10619-019-07261-2