{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,3]],"date-time":"2024-08-03T06:31:51Z","timestamp":1722666711187},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2016,6,30]],"date-time":"2016-06-30T00:00:00Z","timestamp":1467244800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput. Syst."],"published-print":{"date-parts":[[2016,9,17]]},"abstract":"Complex data queries, because of their need for random accesses, have proven to be slow unless all the data can be accommodated in DRAM. There are many domains, such as genomics, geological data, and daily Twitter feeds, where the datasets of interest are 5TB to 20TB. For such a dataset, one would need a cluster with 100 servers, each with 128GB to 256GB of DRAM, to accommodate all the data in DRAM. On the other hand, such datasets could be stored easily in the flash memory of a rack-sized cluster. Flash storage has much better random access performance than hard disks, which makes it desirable for analytics workloads. However, currently available off-the-shelf flash storage packaged as SSDs does not make effective use of flash storage because it incurs a great amount of additional overhead during flash device management and network access. In this article, we present BlueDBM, a new system architecture that has flash-based storage with in-store processing capability and a low-latency high-throughput intercontroller network between storage devices. We show that BlueDBM outperforms a flash-based system without these features by a factor of 10 for some important applications. 
While the performance of a DRAM-centric system falls sharply even if only 5% to 10% of the references are to secondary storage, this sharp performance degradation is not an issue in BlueDBM. BlueDBM presents an attractive point in the cost\/performance tradeoff for Big Data analytics.","DOI":"10.1145\/2898996","type":"journal-article","created":{"date-parts":[[2016,7,5]],"date-time":"2016-07-05T14:08:13Z","timestamp":1467727693000},"page":"1-31","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["BlueDBM"],"prefix":"10.1145","volume":"34","author":[{"given":"Sang-Woo","family":"Jun","sequence":"first","affiliation":[{"name":"Massachusetts Institute of Technology"}]},{"given":"Ming","family":"Liu","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]},{"given":"Sungjin","family":"Lee","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]},{"given":"Jamey","family":"Hicks","sequence":"additional","affiliation":[{"name":"Quanta Research Cambridge"}]},{"given":"John","family":"Ankcorn","sequence":"additional","affiliation":[{"name":"Quanta Research Cambridge"}]},{"given":"Myron","family":"King","sequence":"additional","affiliation":[{"name":"Quanta Research Cambridge"}]},{"given":"Shuotao","family":"Xu","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]},{"family":"Arvind","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology"}]}],"member":"320","published-online":{"date-parts":[[2016,6,30]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Anurag Acharya Mustafa Uysal and Joel Saltz. 1998. Active Disks. Technical Report.
Santa Barbara CA.","DOI":"10.1145\/291069.291026"},{"key":"e_1_2_1_2_1","volume-title":"USENIX 2008 Annual Technical Conference on Annual Technical Conference (ATC\u201908)","author":"Agrawal Nitin","year":"2008"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1863122.1863123"},{"key":"e_1_2_1_4_1","unstructured":"Infiniband Trade Association. 2014 (accessed November 18 2014). Infiniband. http:\/\/www.infinibandta.org."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1979.1675381"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2010.33"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/2508148.2485962"},{"key":"e_1_2_1_8_1","volume-title":"Proceedings of the 1st Workshop on Near-Data Processing.","author":"Cho Benjamin Y.","year":"2013"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485945"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1807128.1807152"},{"key":"e_1_2_1_11_1","first-page":"14","article-title":"Toward efficient provisioning and performance tuning for Hadoop","volume":"2010","author":"Dai Jason","year":"2010","journal-title":"Proc. Apache Asia Roadshow"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465295"},{"key":"e_1_2_1_13_1","unstructured":"FusionIO. 2012 (Accessed November 22 2014). Using HBase with ioMemory. http:\/\/www.fusionio.com\/white-papers\/using-hbase-with-iomemory."},{"key":"e_1_2_1_14_1","unstructured":"FusionIO. 2014 (Accessed November 18 2014). FusionIO.
http:\/\/www.fusionio.com."},{"key":"e_1_2_1_15_1","first-page":"518","article-title":"Similarity search in high dimensions via hashing","volume":"99","author":"Gionis Aristides","year":"1999","journal-title":"VLDB"},{"key":"e_1_2_1_16_1","unstructured":"Google. 2011 (Accessed November 18 2014). Google Flu Trends. http:\/\/www.google.org\/flutrends."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/2536274.2536295"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2602204.2602212"},{"key":"e_1_2_1_19_1","unstructured":"Intel. 2014 (Accessed November 18 2014). Intel Solid-State Drive Data Center Family for PCIe. http:\/\/www.intel.com\/content\/www\/us\/en\/solid-state-drives\/intel-ssd-dc-fa mily-for-pcie.html."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2012.65"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2612174"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation (NSDI\u201914)","author":"Jeong Eun Young","year":"2014"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1837915.1837922"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2554688.2554789"},{"key":"e_1_2_1_25_1","article-title":"A case for flash memory SSD in hadoop applications","volume":"6","author":"Kang Seok-Hoon","year":"2013","journal-title":"Int. J. 
Control Autom."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSST.2013.6558444"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/290593.290602"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2684746.2689064"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540748"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2014.2329423"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/2930583.2930609"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1376616.1376723"},{"key":"e_1_2_1_33_1","volume-title":"4th International Conference on Very Large Data Bases. 280--287","author":"Leilich Hans-Otto","year":"1978"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/782814.782855"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 15th USENIX Conference on Hot Topics in Operating Systems (HOTOS\u201915)","author":"McSherry Frank"},{"key":"e_1_2_1_36_1","unstructured":"Violin Memory. 2014 (Accessed November 18 2014). Violin Memory. http:\/\/www.violin-memory.com."},{"key":"e_1_2_1_37_1","unstructured":"James Morris Jr. and Vaughan Pratt. 1970. A Linear Pattern-Matching Algorithm. TR-40 Comptr Ctr. U of California Berkeley Calif."},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687654"},{"key":"e_1_2_1_39_1","unstructured":"Oracle. 2014 (Accessed November 18 2014). Exadata Database Machine.
https:\/\/www.oracle.com\/engineered-systems\/exadata\/index.html."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/1713254.1713276"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/1499949.1500024"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2678373.2665678"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPSW.2013.238"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2597652.2597684"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.5555\/2591305.2591307"},{"key":"e_1_2_1_46_1","unstructured":"SanDisk. 2014 (Accessed November 22 2014). Sandisk ZetaScale Software. http:\/\/www.sandisk.com\/enterprise\/zetascale\/."},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201914)","author":"Seshadri Sudharsan","year":"2014"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.5555\/2093889.2093965"},{"key":"e_1_2_1_49_1","first-page":"086","article-title":"Method and system for anticipatory package shipping. (Dec. 27 2011)","volume":"8","author":"Spiegel Joel R.","year":"2011","journal-title":"US Patent"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370816.2370874"},{"key":"e_1_2_1_51_1","unstructured":"Diablo Technologies. 2014 (Accessed November 18 2014). Diablo Technologies.
http:\/\/www.diablo-technologies.com\/."},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1007\/11846802_18"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/FPL.2013.6645619"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.14778\/2732967.2732972"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541940.2541961"}],"container-title":["ACM Transactions on Computer Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2898996","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,31]],"date-time":"2022-12-31T09:29:35Z","timestamp":1672478975000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2898996"}},"subtitle":["Distributed Flash Storage for Big Data Analytics"],"short-title":[],"issued":{"date-parts":[[2016,6,30]]},"references-count":55,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2016,9,17]]}},"alternative-id":["10.1145\/2898996"],"URL":"https:\/\/doi.org\/10.1145\/2898996","relation":{},"ISSN":["0734-2071","1557-7333"],"issn-type":[{"value":"0734-2071","type":"print"},{"value":"1557-7333","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,6,30]]},"assertion":[{"value":"2016-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-03-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-06-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}