{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,4]],"date-time":"2025-04-04T20:27:24Z","timestamp":1743798444801,"version":"3.37.3"},"publisher-location":"New York, NY, USA","reference-count":36,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,4,26]],"date-time":"2022-04-26T00:00:00Z","timestamp":1650931200000},"content-version":"vor","delay-in-days":365,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF (National Science Foundation)","doi-asserted-by":"publisher","award":["CNS-1651570"],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100015599","name":"Toyota Research Institute","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100015599","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,4,26]]},"DOI":"10.1145\/3437984.3458829","type":"proceedings-article","created":{"date-parts":[[2021,4,25]],"date-time":"2021-04-25T09:56:04Z","timestamp":1619344564000},"page":"15-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["DistIR"],"prefix":"10.1145","author":[{"given":"Keshav","family":"Santhanam","sequence":"first","affiliation":[{"name":"Stanford University, USA"}]},{"given":"Siddharth","family":"Krishna","sequence":"additional","affiliation":[{"name":"Microsoft, UK"}]},{"given":"Ryota","family":"Tomioka","sequence":"additional","affiliation":[{"name":"Microsoft, UK"}]},{"given":"Andrew","family":"Fitzgibbon","sequence":"additional","affiliation":[{"name":"Microsoft, UK"}]},{"given":"Tim","family":"Harris","sequence":"additional","affiliation":[{"name":"Microsoft, UK"}]}],"member":"320","published-online":{"date-parts":[[2021,4,26]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"NVIDIA DGX Datasheet. URL: https:\/\/images.nvidia.com\/aem-dam\/Solutions\/Data-Center\/nvidia-dgx-a100-datasheet.pdf. NVIDIA DGX Datasheet. URL: https:\/\/images.nvidia.com\/aem-dam\/Solutions\/Data-Center\/nvidia-dgx-a100-datasheet.pdf."},{"key":"e_1_3_2_1_2_1","unstructured":"NVIDIA Titan V. URL: https:\/\/www.nvidia.com\/es-la\/titan\/titan-v\/. NVIDIA Titan V. URL: https:\/\/www.nvidia.com\/es-la\/titan\/titan-v\/."},{"key":"e_1_3_2_1_3_1","first-page":"265","volume-title":"TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)","author":"Abadi Mart\u00edn","year":"2016","unstructured":"Mart\u00edn Abadi , Paul Barham , Jianmin Chen , Zhifeng Chen , Andy Davis , Jeffrey Dean , Matthieu Devin , Sanjay Ghemawat , Geoffrey Irving , Michael Isard , TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) , pages 265 -- 283 , 2016 . Mart\u00edn Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. TensorFlow: A System for Large-Scale Machine Learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265--283, 2016."},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Tal Ben-Nun Johannes de Fine Licht Alexandros N Ziogas Timo Schneider and Torsten Hoefler. 
Stateful Dataflow Multigraphs: A Data-Centric Model for Performance Portability on Heterogeneous Architectures. In Proceedings of the International Conference for High Performance Computing Networking Storage and Analysis pages 1--14 2019. Tal Ben-Nun Johannes de Fine Licht Alexandros N Ziogas Timo Schneider and Torsten Hoefler. Stateful Dataflow Multigraphs: A Data-Centric Model for Performance Portability on Heterogeneous Architectures. In Proceedings of the International Conference for High Performance Computing Networking Storage and Analysis pages 1--14 2019.","DOI":"10.1145\/3295500.3356173"},{"key":"e_1_3_2_1_5_1","volume-title":"Language Models are Few-Shot Learners. arXiv preprint arXiv:2005.14165","author":"Brown Tom B","year":"2020","unstructured":"Tom B Brown , Benjamin Mann , Nick Ryder , Melanie Subbiah , Jared Kaplan , Prafulla Dhariwal , Arvind Neelakantan , Pranav Shyam , Girish Sastry , Amanda Askell , Language Models are Few-Shot Learners. arXiv preprint arXiv:2005.14165 , 2020 . Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language Models are Few-Shot Learners. arXiv preprint arXiv:2005.14165, 2020."},{"key":"e_1_3_2_1_6_1","first-page":"578","volume-title":"TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)","author":"Chen Tianqi","year":"2018","unstructured":"Tianqi Chen , Thierry Moreau , Ziheng Jiang , Lianmin Zheng , Eddie Yan , Haichen Shen , Meghan Cowan , Leyuan Wang , Yuwei Hu , Luis Ceze , TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) , pages 578 -- 594 , 2018 . Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pages 578--594, 2018."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/512950.512973"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/567752.567778"},{"key":"e_1_3_2_1_9_1","first-page":"1223","volume-title":"Advances in Neural Information Processing Systems","author":"Dean Jeffrey","year":"2012","unstructured":"Jeffrey Dean , Greg Corrado , Rajat Monga , Kai Chen , Matthieu Devin , Mark Mao , Marc'aurelio Ranzato , Andrew Senior , Paul Tucker , Ke Yang , Large Scale Distributed Deep Networks . In Advances in Neural Information Processing Systems , pages 1223 -- 1231 , 2012 . Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Marc'aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, et al. Large Scale Distributed Deep Networks. In Advances in Neural Information Processing Systems, pages 1223--1231, 2012."},{"key":"e_1_3_2_1_10_1","volume-title":"Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. arXiv preprint arXiv:2101.03961","author":"Fedus William","year":"2021","unstructured":"William Fedus , Barret Zoph , and Noam Shazeer . Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. arXiv preprint arXiv:2101.03961 , 2021 . William Fedus, Barret Zoph, and Noam Shazeer. 
,{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3368089.3417050"},{"key":"e_1_3_2_1_12_1","volume-title":"ACM Transactions on Mathematical Software (TOMS)","author":"Griewank Andreas","year":"2000","unstructured":"Andreas Griewank and Andrea Walther. Algorithm 799: Revolve: An Implementation of Checkpointing for the Reverse or Adjoint Mode of Computational Differentiation. ACM Transactions on Mathematical Software (TOMS), 26(1):19--45, 2000."},{"key":"e_1_3_2_1_13_1","volume-title":"A Language for Describing Optimization Strategies. arXiv preprint arXiv:2002.02268","author":"Hagedorn Bastian","year":"2020","unstructured":"Bastian Hagedorn, Johannes Lenfers, Thomas Koehler, Sergei Gorlatch, and Michel Steuwer. A Language for Describing Optimization Strategies. arXiv preprint arXiv:2002.02268, 2020."},{"key":"e_1_3_2_1_14_1","first-page":"103","volume-title":"Advances in Neural Information Processing Systems","author":"Huang Yanping","year":"2019","unstructured":"Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Dehao Chen, Mia Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V Le, Yonghui Wu, et al. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. In Advances in Neural Information Processing Systems, pages 103--112, 2019."},{"key":"e_1_3_2_1_15_1","first-page":"497","volume-title":"Proceedings of Machine Learning and Systems","volume":"2","author":"Jain Paras","year":"2020","unstructured":"Paras Jain, Ajay Jain, Aniruddha Nrusimha, Amir Gholami, Pieter Abbeel, Joseph Gonzalez, Kurt Keutzer, and Ion Stoica. Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization. In I. Dhillon, D. Papailiopoulos, and V. Sze, editors, Proceedings of Machine Learning and Systems, volume 2, pages 497--511, 2020. URL: https:\/\/proceedings.mlsys.org\/paper\/2020\/file\/084b6fbb10729ed4da8c3d3f5a3ae7c9-Paper.pdf."},{"key":"e_1_3_2_1_16_1","first-page":"47","volume-title":"Proceedings of the 27th ACM Symposium on Operating Systems Principles","author":"Jia Zhihao","year":"2019","unstructured":"Zhihao Jia, Oded Padon, James Thomas, Todd Warszawski, Matei Zaharia, and Alex Aiken. TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions. In Proceedings of the 27th ACM Symposium on Operating Systems Principles, pages 47--62, 2019."}
,{"key":"e_1_3_2_1_17_1","first-page":"1","volume-title":"Proceedings of Machine Learning and Systems","volume":"1","author":"Jia Zhihao","year":"2019","unstructured":"Zhihao Jia, Matei Zaharia, and Alex Aiken. Beyond Data and Model Parallelism for Deep Neural Networks. In A. Talwalkar, V. Smith, and M. Zaharia, editors, Proceedings of Machine Learning and Systems, volume 1, pages 1--13, 2019. URL: https:\/\/proceedings.mlsys.org\/paper\/2019\/file\/c74d97b01eae257e44aa9d5bade97baf-Paper.pdf."},{"key":"e_1_3_2_1_18_1","volume-title":"One Weird Trick for Parallelizing Convolutional Neural Networks. arXiv preprint arXiv:1404.5997","author":"Krizhevsky Alex","year":"2014","unstructured":"Alex Krizhevsky. One Weird Trick for Parallelizing Convolutional Neural Networks. arXiv preprint arXiv:1404.5997, 2014."},{"key":"e_1_3_2_1_19_1","volume-title":"MLIR: A Compiler Infrastructure for the End of Moore's Law. arXiv preprint arXiv:2002.11054","author":"Lattner Chris","year":"2020","unstructured":"Chris Lattner, Jacques Pienaar, Mehdi Amini, Uday Bondhugula, River Riddle, Albert Cohen, Tatiana Shpeisman, Andy Davis, Nicolas Vasilache, and Oleksandr Zinenko. MLIR: A Compiler Infrastructure for the End of Moore's Law. arXiv preprint arXiv:2002.11054, 2020."},{"key":"e_1_3_2_1_20_1","volume-title":"TensorFlow Dev Summit","author":"Leary Chris","year":"2017","unstructured":"Chris Leary and Todd Wang. XLA: TensorFlow, compiled. TensorFlow Dev Summit, 2017."},{"key":"e_1_3_2_1_21_1","volume-title":"International Conference on Learning Representations","author":"Lepikhin Dmitry","year":"2021","unstructured":"Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, and Zhifeng Chen. GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding. In International Conference on Learning Representations, 2021. URL: https:\/\/openreview.net\/forum?id=qrwe7XHTmYb."}
,{"key":"e_1_3_2_1_22_1","unstructured":"Microsoft. ONNX Runtime. URL: https:\/\/microsoft.github.io\/onnxruntime\/."},{"key":"e_1_3_2_1_23_1","first-page":"2430","volume-title":"International Conference on Machine Learning","author":"Mirhoseini Azalia","year":"2017","unstructured":"Azalia Mirhoseini, Hieu Pham, Quoc V Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, and Jeff Dean. Device Placement Optimization with Reinforcement Learning. In International Conference on Machine Learning, pages 2430--2439. PMLR, 2017."},{"key":"e_1_3_2_1_24_1","first-page":"1","volume-title":"Proceedings of the 27th ACM Symposium on Operating Systems Principles","author":"Narayanan Deepak","year":"2019","unstructured":"Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R Devanur, Gregory R Ganger, Phillip B Gibbons, and Matei Zaharia. PipeDream: Generalized Pipeline Parallelism for DNN Training. In Proceedings of the 27th ACM Symposium on Operating Systems Principles, pages 1--15, 2019."},{"key":"e_1_3_2_1_25_1","volume-title":"Memory-Efficient Pipeline-Parallel DNN Training. arXiv preprint arXiv:2006.09503","author":"Narayanan Deepak","year":"2020","unstructured":"Deepak Narayanan, Amar Phanishayee, Kaiyu Shi, Xie Chen, and Matei Zaharia. Memory-Efficient Pipeline-Parallel DNN Training. arXiv preprint arXiv:2006.09503, 2020."},{"key":"e_1_3_2_1_26_1","first-page":"8026","volume-title":"Advances in Neural Information Processing Systems","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems, pages 8026--8037, 2019."}
,{"key":"e_1_3_2_1_27_1","first-page":"1","article-title":"Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer","volume":"21","author":"Raffel Colin","year":"2020","unstructured":"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research, 21:1--67, 2020.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3150211"},{"key":"e_1_3_2_1_29_1","volume-title":"ZeRO: Memory Optimizations Toward Training Trillion Parameter Models. arXiv preprint arXiv:1910.02054","author":"Rajbhandari Samyam","year":"2019","unstructured":"Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, and Yuxiong He. ZeRO: Memory Optimizations Toward Training Trillion Parameter Models. arXiv preprint arXiv:1910.02054, 2019."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3406703"},{"key":"e_1_3_2_1_31_1","volume-title":"Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. arXiv preprint arXiv:1909.08053","author":"Shoeybi Mohammad","year":"2019","unstructured":"Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, and Bryan Catanzaro. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism. arXiv preprint arXiv:1909.08053, 2019."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/3049832.3049841"},{"key":"e_1_3_2_1_33_1","first-page":"33","article-title":"Efficient Algorithms for Device Placement of DNN Graph Operators","author":"Tarnawski Jakub M","year":"2020","unstructured":"Jakub M Tarnawski, Amar Phanishayee, Nikhil Devanur, Divya Mahajan, and Fanny Nina Paravecino. Efficient Algorithms for Device Placement of DNN Graph Operators. Advances in Neural Information Processing Systems, 33, 2020.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_34_1","volume-title":"Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads. arXiv preprint arXiv:2007.04069","author":"Wang Siyu","year":"2020","unstructured":"Siyu Wang, Yi Rong, Shiqing Fan, Zhen Zheng, LanSong Diao, Guoping Long, Jun Yang, Xiaoyong Liu, and Wei Lin. Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads. arXiv preprint arXiv:2007.04069, 2020."}
,{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3190508.3190551"},{"key":"e_1_3_2_1_36_1","first-page":"337","volume-title":"2020 USENIX Annual Technical Conference (USENIX ATC 20)","author":"Zhu Hongyu","year":"2020","unstructured":"Hongyu Zhu, Amar Phanishayee, and Gennady Pekhimenko. Daydream: Accurately Estimating the Efficacy of Optimizations for DNN Training. In 2020 USENIX Annual Technical Conference (USENIX ATC 20), pages 337--352. USENIX Association, July 2020. URL: https:\/\/www.usenix.org\/conference\/atc20\/presentation\/zhu-hongyu."}],"event":{"name":"EuroSys '21: Sixteenth European Conference on Computer Systems","sponsor":["SIGOPS ACM Special Interest Group on Operating Systems"],"location":"Online United Kingdom","acronym":"EuroSys '21"},"container-title":["Proceedings of the 1st Workshop on Machine Learning and Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3437984.3458829","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3437984.3458829","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,10]],"date-time":"2023-01-10T12:41:57Z","timestamp":1673354517000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3437984.3458829"}},"subtitle":["An Intermediate Representation for Optimizing Distributed Neural Networks"],"short-title":[],"issued":{"date-parts":[[2021,4,26]]},"references-count":36,"alternative-id":["10.1145\/3437984.3458829","10.1145\/3437984"],"URL":"https:\/\/doi.org\/10.1145\/3437984.3458829","relation":{},"subject":[],"published":{"date-parts":[[2021,4,26]]},"assertion":[{"value":"2021-04-26","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}