{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,11]],"date-time":"2024-09-11T12:34:04Z","timestamp":1726058044480},"reference-count":59,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2023,3,15]],"date-time":"2023-03-15T00:00:00Z","timestamp":1678838400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62025202"],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["J Software Evolu Process"],"published-print":{"date-parts":[[2024,4]]},"abstract":"Abstract<\/jats:title>Internet\u2010scale distributed systems often replicate data at multiple geographic locations to provide low latency and high availability, despite node and network failures. According to the CAP theorem, low latency and high availability can only be achieved at the cost of accepting weak consistency. The conflict\u2010free replicated data type (CRDT) is a framework that provides a principled approach to maintaining eventual consistency among data replicas. CRDTs have been notoriously difficult to design and implement correctly. Subtle deep bugs lie in the complex and tedious handling of all possible cases of conflicting data updates. We argue that the CRDT design should be formally specified and model checked, to uncover deep bugs which are beyond human reasoning. The implementation further needs to be systematically tested. On the one hand, the testing needs to inherit the exhaustive nature of the model checking and ensures the coverage of testing. On the other hand, the testing is expected to find coding errors which cannot be detected by design level verification. Toward the challenges above, we propose the model\u2010checking\u2010driven explorative testing (MET<\/jats:sc>) framework. At the design level, MET<\/jats:sc> uses TLA+ to specify and model check CRDT designs. At the implementation level, MET<\/jats:sc> conducts model\u2010checking\u2010driven explorative testing, in the sense that the test cases are automatically generated from the model\u2010checking traces. The system execution is controlled to proceed deterministically, following the model\u2010checking trace. The explorative testing systematically controls and permutes all nondeterministic choices of message reorderings. We apply MET<\/jats:sc> in our practical development of CRDTs. The bugs in both designs and implementations of CRDTs are found. As for bugs which can be found by traditional testing techniques, MET<\/jats:sc> greatly reduces the cost of fixing the bugs. Moreover, MET<\/jats:sc> can find subtle deep bugs which cannot be found by existing techniques at a reasonable cost. Based on our practical use of MET<\/jats:sc>, we discuss how MET<\/jats:sc> provides us with sufficient confidence in the correctness of our CRDT designs and implementations.<\/jats:p>Conflict\u2010free replicated data type (CRDT) is a framework that provides a principled approach to maintaining eventual consistency among data replicas in distributed systems. CRDTs have been notoriously difficult to design and implement correctly. We propose model\u2010checking\u2010driven explorative testing (MET<\/jats:sc>) framework for dealing with such problem. We apply MET<\/jats:sc> in our practical development of CRDTs. MET<\/jats:sc> successfully finds subtle deep bugs and provides us with sufficient confidence in the correctness of our CRDT designs and implementations.<\/jats:p>","DOI":"10.1002\/smr.2555","type":"journal-article","created":{"date-parts":[[2023,3,15]],"date-time":"2023-03-15T08:29:58Z","timestamp":1678868998000},"update-policy":"http:\/\/dx.doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Model\u2010checking\u2010driven explorative testing of CRDT designs and implementations"],"prefix":"10.1002","volume":"36","author":[{"ORCID":"http:\/\/orcid.org\/0000-0003-1416-9379","authenticated-orcid":false,"given":"Yuqi","family":"Zhang","sequence":"first","affiliation":[{"name":"State Key Laboratory for Novel Software Technology Nanjing University Nanjing China"}]},{"given":"Yu","family":"Huang","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology Nanjing University Nanjing China"}]},{"given":"Hengfeng","family":"Wei","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology Nanjing University Nanjing China"}]},{"given":"Xiaoxing","family":"Ma","sequence":"additional","affiliation":[{"name":"State Key Laboratory for Novel Software Technology Nanjing University Nanjing China"}]}],"member":"311","published-online":{"date-parts":[[2023,3,15]]},"reference":[{"key":"e_1_2_11_2_1","doi-asserted-by":"crossref","unstructured":"BurckhardtS GotsmanA YangH ZawirskiM.Replicated data types: Specification verification optimality. In: Proceedings of the 41st ACM SIGPLAN\u2010SIGACT Symposium on Principles of Programming Languages. POPL '14.ACM;2014:271\u2010284.","DOI":"10.1145\/2535838.2535848"},{"key":"e_1_2_11_3_1","doi-asserted-by":"crossref","unstructured":"ShapiroM Pregui\u00e7aN BaqueroC ZawirskiM.Conflict\u2010free replicated data types. In: Proceedings of the 13th International Conference on Stabilization Safety and Security of Distributed Systems. SSS'11.Springer\u2010Verlag;2011:386\u2010400.http:\/\/dl.acm.org\/citation.cfm?id=2050613.2050642","DOI":"10.1007\/978-3-642-24550-3_29"},{"key":"e_1_2_11_4_1","doi-asserted-by":"crossref","unstructured":"EnesV AlmeidaPS BaqueroC LeitaoJ.Efficient synchronization of state\u2010based crdts. In: 2019 IEEE 35th International Conference on Data Engineering (ICDE).IEEE;2019:148\u2010159.","DOI":"10.1109\/ICDE.2019.00022"},{"key":"e_1_2_11_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2012.37"},{"key":"e_1_2_11_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2011.389"},{"key":"e_1_2_11_7_1","unstructured":"Riak.Riak enterprise NoSQL database scalable database solutions;2022.https:\/\/riak.com\/"},{"key":"e_1_2_11_8_1","unstructured":"Redis.Redis enterprise database built for hybrid applications;2022.https:\/\/redis.com\/redis-enterprise\/"},{"key":"e_1_2_11_9_1","unstructured":"Microsoft Azure.Azure Cosmos DB\u2014NoSQL database Microsoft Azure;2022.https:\/\/azure.microsoft.com\/en-us\/services\/cosmos-db\/"},{"key":"e_1_2_11_10_1","unstructured":"LeesatapornwongsaT HaoM JoshiP LukmanJF GunawiHS.SAMC: semantic\u2010aware model checking for fast discovery of deep bugs in cloud systems. In: Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation. OSDI'14.USENIX Association;2014:399\u2010414.http:\/\/dl.acm.org\/citation.cfm?id=2685048.2685080"},{"key":"e_1_2_11_11_1","unstructured":"The TLA+ home page;2022.https:\/\/lamport.azurewebsites.net\/tla\/tla.html"},{"key":"e_1_2_11_12_1","unstructured":"GitHub.Conflict\u2010free replicated data types based on Redis;2022.https:\/\/github.com\/elem-azar-unis\/CRDT-Redis"},{"key":"e_1_2_11_13_1","doi-asserted-by":"crossref","unstructured":"ZhangY WeiH HuangY.Remove\u2010win: a design framework for conflict\u2010free replicated data types. In: Proceedings of the 27th IEEE International Conference on Parallel and Distributed Systems. ICPADS'21.IEEE;2021.","DOI":"10.1109\/ICPADS53394.2021.00081"},{"key":"e_1_2_11_14_1","unstructured":"BrownMA.Traffic Control HOWTO;2020. Accessed September 30 2020.http:\/\/tldp.org\/HOWTO\/Traffic-Control-HOWTO\/index.html"},{"key":"e_1_2_11_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/2699417"},{"key":"e_1_2_11_16_1","unstructured":"GitHub.CRDT\u2010Redis: Issue #1;2022.https:\/\/github.com\/elem-azarunis\/CRDT-Redis\/issues\/1"},{"key":"e_1_2_11_17_1","unstructured":"GitHub.CRDT\u2010Redis: Issue #4;2022.https:\/\/github.com\/elem-azar-unis\/CRDT-Redis\/issues\/4"},{"key":"e_1_2_11_18_1","unstructured":"XuT JinX HuangP ZhouY LuS JinL PasupathyS.Early detection of configuration errors to reduce failure damage. In: Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation. OSDI'16.USENIX Association;2016:619\u2010634."},{"key":"e_1_2_11_19_1","unstructured":"GitHub.CRDT\u2010Redis: Issue #2;2022.https:\/\/github.com\/elem-azar-unis\/CRDT-Redis\/issues\/2"},{"key":"e_1_2_11_20_1","unstructured":"ShapiroM Pregui\u00e7aN BaqueroC ZawirskiM.A comprehensive study of convergent and commutative replicated data types. RR\u20107506 Inria\u2014Centre Paris\u2010Rocquencourt; INRIA;2011. Research Report.https:\/\/hal.inria.fr\/inria-00555588"},{"key":"e_1_2_11_21_1","volume-title":"Software Sbstractions: Logic, Language, and Analysis","author":"Jackson D","year":"2012"},{"key":"e_1_2_11_22_1","unstructured":"YuanD LuoY ZhuangX et al.Simple testing can prevent most critical failures: an analysis of production failures in distributed data\u2010intensive systems. In: Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation. OSDI'14.USENIX Association;2014:249\u2010265."},{"key":"e_1_2_11_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-63390-9_12"},{"key":"e_1_2_11_24_1","unstructured":"HanceT HeuleM MartinsR ParnoB.Finding invariants of distributed systems: it's a small (enough) world after all. In: 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI '21).USENIX Association;2021:115\u2010131.https:\/\/www.usenix.org\/conference\/nsdi21\/presentation\/hance"},{"key":"e_1_2_11_25_1","unstructured":"AndoniA DaniliucD KhurshidS MarinovD.Evaluating the \u201csmall scope hypothesis\u201d.Citeseer;2002."},{"key":"e_1_2_11_26_1","unstructured":"OetschJ PrischinkM P\u00fchrerJ SchwengererM TompitsH.On the small\u2010scope hypothesis for testing answer\u2010set programs. In: Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning. KR'12.AAAI Press;2012:43\u201053."},{"key":"e_1_2_11_27_1","doi-asserted-by":"crossref","unstructured":"SunC XuY AgustinaA.Exhaustive search of puzzles in operational transformation. In: Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work and Social Computing. CSCW '14.Association for Computing Machinery;2014:519\u2010529.","DOI":"10.1145\/2531602.2531630"},{"key":"e_1_2_11_28_1","doi-asserted-by":"crossref","unstructured":"XuY SunC LiM.Achieving convergence in operational transformation: conditions mechanisms and systems. In: Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work and Social Computing. CSCW '14.Association for Computing Machinery;2014:505\u2010518.","DOI":"10.1145\/2531602.2531629"},{"key":"e_1_2_11_29_1","unstructured":"LamportL MerzS.Auxiliary variables in TLA+. CoRR abs\/1703.05121;2017.http:\/\/arxiv.org\/abs\/1703.05121"},{"key":"e_1_2_11_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-03466-4_2"},{"key":"e_1_2_11_31_1","unstructured":"GitHub.TLA+\u2014the brontinus release;2022.https:\/\/github.com\/tlaplus\/tlaplus\/releases\/tag\/v1.7.1"},{"key":"e_1_2_11_32_1","unstructured":"GitHub.CRDT\u2010Redis: Issue #3;2022.https:\/\/github.com\/elem-azar-unis\/CRDT-Redis\/issues\/3"},{"key":"e_1_2_11_33_1","unstructured":"GitHub.CRDT\u2010Redis: Issue #5;2022.https:\/\/github.com\/elem-azar-unis\/CRDT-Redis\/issues\/5"},{"key":"e_1_2_11_34_1","unstructured":"GitHub.CRDT\u2010Redis: Issue #6;2022.https:\/\/github.com\/elem-azar-unis\/CRDT-Redis\/issues\/6"},{"key":"e_1_2_11_35_1","unstructured":"GitHub.CRDT\u2010Redis: Issue #7;2022.https:\/\/github.com\/elem-azar-unis\/CRDT-Redis\/issues\/7"},{"key":"e_1_2_11_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2009.173"},{"key":"e_1_2_11_37_1","doi-asserted-by":"crossref","unstructured":"GarlanD.Software engineering in an uncertain world. In: Proceedings of the FSE\/SDP Workshop on Future of Software Engineering Research. FoSER '10.Association for Computing Machinery;2010:125\u2010128.","DOI":"10.1145\/1882362.1882389"},{"key":"e_1_2_11_38_1","doi-asserted-by":"crossref","unstructured":"EsfahaniN KouroshfarE MalekS.Taming uncertainty in self\u2010adaptive software. In: Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ESEC\/FSE '11.Association for Computing Machinery;2011:234\u2010244.","DOI":"10.1145\/2025113.2025147"},{"key":"e_1_2_11_39_1","doi-asserted-by":"crossref","unstructured":"ElbaumS RosenblumDS.Known unknowns: testing in the presence of uncertainty. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. FSE 2014.Association for Computing Machinery;2014:833\u2010836.","DOI":"10.1145\/2635868.2666608"},{"key":"e_1_2_11_40_1","doi-asserted-by":"publisher","DOI":"10.14778\/3397230.3397233"},{"key":"e_1_2_11_41_1","doi-asserted-by":"crossref","unstructured":"FonsecaP ZhangK WangX KrishnamurthyA.An empirical study on the correctness of formally verified distributed systems. In: Proceedings of the Twelfth European Conference on Computer Systems. EuroSys '17.Association for Computing Machinery;2017:328\u2010343.https:\/\/doi.org\/10.1145\/3064176.3064183","DOI":"10.1145\/3064176.3064183"},{"key":"e_1_2_11_42_1","unstructured":"EricksonJ.More algorithms lecture notes;2022.http:\/\/jeffe.cs.illinois.edu\/teaching\/algorithms\/"},{"key":"e_1_2_11_43_1","unstructured":"NewcombeC.Debugging designs using exhaustively testable pseudo\u2010code. In: Proceedings of the International Workshop on High Performance Transaction Systems.HPTS;2011.http:\/\/hpts.ws\/papers\/2011\/agenda.html"},{"key":"e_1_2_11_44_1","unstructured":"ZaveP.How to make chord correct (using a stable base). CoRR abs\/1502.06461;2015.http:\/\/arxiv.org\/abs\/1502.06461"},{"key":"e_1_2_11_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2017.2655056"},{"key":"e_1_2_11_46_1","unstructured":"GitHub.Dr. TLA+ Series\u2014Paxos (Andrew Helwer);2022.https:\/\/github.com\/tlaplus\/DrTLAPlus\/tree\/master\/Paxos"},{"key":"e_1_2_11_47_1","unstructured":"GitHub.Formal TLA+ specification of the raft consensus algorithm;2022.https:\/\/github.com\/ongardie\/raft.tla"},{"key":"e_1_2_11_48_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-662-43652-3_3"},{"key":"e_1_2_11_49_1","unstructured":"GitHub.High\u2010level TLA+ specifications of the five consistency levels offered by Azure Cosmos DB;2022.https:\/\/github.com\/Azure\/azure-cosmos-tla"},{"key":"e_1_2_11_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-90870-6_42"},{"key":"e_1_2_11_51_1","doi-asserted-by":"crossref","unstructured":"MusuvathiM ParkDYW ChouA EnglerDR DillDL.CMC: a pragmatic approach to model checking real code. In: Proceedings of the 5th Symposium on Operating Systems Design and Implementation. OSDI '02.USENIX Association;2002:75\u201088.","DOI":"10.1145\/1060289.1060297"},{"key":"e_1_2_11_52_1","unstructured":"YangJ ChenT WuM et al.MODIST: transparent model checking of unmodified distributed systems. In: NSDI'09.USENIX;2009:213\u2010228."},{"key":"e_1_2_11_53_1","doi-asserted-by":"crossref","unstructured":"GuoH WuM ZhouL HuG YangJ ZhangL.Practical software model checking via dynamic interface reduction. In: Proceedings of the Twenty\u2010Third ACM Symposium on Operating Systems Principles. SOSP '11.ACM;2011:265\u2010278.","DOI":"10.1145\/2043556.2043582"},{"key":"e_1_2_11_54_1","unstructured":"GunawiHS DoT JoshiP et al.FATE and DESTINI: a framework for cloud recovery testing. In: Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation. NSDI'11.USENIX Association;2011:238\u2010252."},{"key":"e_1_2_11_55_1","doi-asserted-by":"crossref","unstructured":"AlvaroP RosenJ HellersteinJM.Lineage\u2010driven fault injection. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. SIGMOD '15.Association for Computing Machinery;2015:331\u2010346.","DOI":"10.1145\/2723372.2723711"},{"key":"e_1_2_11_56_1","doi-asserted-by":"crossref","unstructured":"LuJ LiuC LiL et al.CrashTuner: detecting crash\u2010recovery bugs in cloud systems via meta\u2010info analysis. In: Proceedings of the 27th ACM Symposium on Operating Systems Principles. SOSP '19.Association for Computing Machinery;2019:114\u2010130.","DOI":"10.1145\/3341301.3359645"},{"key":"e_1_2_11_57_1","doi-asserted-by":"crossref","unstructured":"ChenH DouW WangD QinF.CoFI: consistency\u2010guided fault injection for cloud systems. In: Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering. ASE '20.Association for Computing Machinery;2020:536\u2010547.","DOI":"10.1145\/3324884.3416548"},{"key":"e_1_2_11_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3276530"},{"key":"e_1_2_11_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3360606"},{"key":"e_1_2_11_60_1","doi-asserted-by":"crossref","unstructured":"YuanX YangJ.Effective concurrency testing for distributed systems. In: Proceedings of the Twenty\u2010Fifth International Conference on Architectural Support for Programming Languages and Operating Systems.Association for Computing Machinery;2020:1141\u20101156.","DOI":"10.1145\/3373376.3378484"}],"container-title":["Journal of Software: Evolution and Process"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/smr.2555","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/smr.2555","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/smr.2555","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,11]],"date-time":"2024-04-11T07:52:26Z","timestamp":1712821946000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/smr.2555"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,15]]},"references-count":59,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,4]]}},"alternative-id":["10.1002\/smr.2555"],"URL":"https:\/\/doi.org\/10.1002\/smr.2555","archive":["Portico"],"relation":{},"ISSN":["2047-7473","2047-7481"],"issn-type":[{"value":"2047-7473","type":"print"},{"value":"2047-7481","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,15]]},"assertion":[{"value":"2022-09-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-18","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-03-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}