{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,4]],"date-time":"2025-04-04T04:23:36Z","timestamp":1743740616849},"reference-count":66,"publisher":"Association for Computing Machinery (ACM)","issue":"9","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2021,5]]},"abstract":"\n Query optimizers rely on accurate cardinality estimation (CardEst) to produce good execution plans. The core problem of CardEst is how to model the rich joint distribution of attributes in an accurate and compact manner. Despite decades of research, existing methods either over-simplify the models only using independent factorization which leads to inaccurate estimates, or over-complicate them by lossless conditional factorization without any independent assumption which results in slow probability computation. In this paper, we propose FLAT, a CardEst method that is simultaneously\n <u>f<\/u>ast<\/jats:italic>\n in probability computation,\n <u>l<\/u>ightweight<\/jats:italic>\n in model size and\n <u>a<\/u>ccurate<\/jats:italic>\n in es<u>t<\/u>imation quality. The key idea of FLAT is a novel unsupervised graphical model, called FSPN. It utilizes both independent and conditional factorization to adaptively model different levels of attributes correlations, and thus combines their advantages. FLAT supports efficient online probability computation in near linear time on the underlying FSPN model, provides effective offline model construction and enables incremental model updates. It can estimate cardinality for both single table queries and multi-table join queries. Extensive experimental study demonstrates the superiority of FLAT over existing CardEst methods: FLAT achieves 1--5 orders of magnitude better accuracy, 1--3 orders of magnitude faster probability computation speed and 1--2 orders of magnitude lower storage cost. We also integrate FLAT into Postgres to perform an end-to-end test. It improves the query execution time by 12.9% on the well-known IMDB benchmark workload, which is very close to the optimal result 14.2% using the true cardinality.\n <\/jats:p>","DOI":"10.14778\/3461535.3461539","type":"journal-article","created":{"date-parts":[[2021,10,22]],"date-time":"2021-10-22T22:22:49Z","timestamp":1634941369000},"page":"1489-1502","source":"Crossref","is-referenced-by-count":41,"title":["FLAT"],"prefix":"10.14778","volume":"14","author":[{"given":"Rong","family":"Zhu","sequence":"first","affiliation":[{"name":"Alibaba Group"}]},{"given":"Ziniu","family":"Wu","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Yuxing","family":"Han","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Kai","family":"Zeng","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Andreas","family":"Pfadler","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Zhengping","family":"Qian","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Jingren","family":"Zhou","sequence":"additional","affiliation":[{"name":"Alibaba Group"}]},{"given":"Bin","family":"Cui","sequence":"additional","affiliation":[{"name":"Peking University"}]}],"member":"320","published-online":{"date-parts":[[2021,10,22]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2742797"},{"key":"e_1_2_1_2_1","volume-title":"Stochastic Gradient Descent Tricks. Neural networks: Tricks of the trade","author":"Bottou L\u00e9on","year":"2012","unstructured":"L\u00e9on Bottou . 2012. Stochastic Gradient Descent Tricks. Neural networks: Tricks of the trade ( 2012 ), 421--436. L\u00e9on Bottou. 2012. Stochastic Gradient Descent Tricks. Neural networks: Tricks of the trade (2012), 421--436."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/376284.375686"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1968.1054142"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/0004-3702(93)90036-B"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-019-05813-2"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/376284.375685"},{"key":"e_1_2_1_8_1","volume-title":"Row Estimation Examples. https:\/\/www.postgresql.org\/docs\/current\/row-estimation-examples.html","author":"Postgresql Documentation","year":"2020","unstructured":"Postgresql Documentation 12. 2020. Chapter 70.1. Row Estimation Examples. https:\/\/www.postgresql.org\/docs\/current\/row-estimation-examples.html ( 2020 ). Postgresql Documentation 12. 2020. Chapter 70.1. Row Estimation Examples. https:\/\/www.postgresql.org\/docs\/current\/row-estimation-examples.html (2020)."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.14778\/3329772.3329780"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2006.07.013"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.5555\/3042817.3043034"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/376284.375727"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/645478.757691"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335448"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-003-0090-4"},{"key":"e_1_2_1_16_1","volume-title":"An approach based on bayesian networks for query selectivity estimation. DASFAA 2","author":"Halford Max","year":"2019","unstructured":"Max Halford , Philippe Saint-Pierre , and Franck Morvan . 2019. An approach based on bayesian networks for query selectivity estimation. DASFAA 2 ( 2019 ). Max Halford, Philippe Saint-Pierre, and Franck Morvan. 2019. An approach based on bayesian networks for query selectivity estimation. DASFAA 2 (2019)."},{"key":"e_1_2_1_17_1","unstructured":"Shohedul Hasan Saravanan Thirumuruganathan Jees Augustine Nick Koudas and Gautam Das. 2019. Multi-attribute selectivity estimation using deep learning. In SIGMOD. Shohedul Hasan Saravanan Thirumuruganathan Jees Augustine Nick Koudas and Gautam Das. 2019. Multi-attribute selectivity estimation using deep learning. In SIGMOD."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2749438"},{"key":"e_1_2_1_19_1","volume-title":"Github repository: deepdb public. https:\/\/github.com\/DataManagementLab\/deepdb-public","author":"Hilprecht Benjamin","year":"2019","unstructured":"Benjamin Hilprecht . 2019. Github repository: deepdb public. https:\/\/github.com\/DataManagementLab\/deepdb-public ( 2019 ). Benjamin Hilprecht. 2019. Github repository: deepdb public. https:\/\/github.com\/DataManagementLab\/deepdb-public (2019)."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.14778\/3384345.3384349"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/119995.115835"},{"key":"e_1_2_1_22_1","first-page":"2377","article-title":"Improving accuracy and robustness of self-tuning histograms by subspace clustering","volume":"27","author":"Khachatryan Andranik","year":"2015","unstructured":"Andranik Khachatryan , Emmanuel M\u00fcller , Christian Stier , and Klemens B\u00f6hm . 2015 . Improving accuracy and robustness of self-tuning histograms by subspace clustering . IEEE TKDE 27 , 9 (2015), 2377 -- 2389 . Andranik Khachatryan, Emmanuel M\u00fcller, Christian Stier, and Klemens B\u00f6hm. 2015. Improving accuracy and robustness of self-tuning histograms by subspace clustering. IEEE TKDE 27, 9 (2015), 2377--2389.","journal-title":"IEEE TKDE"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.14778\/3151106.3151112"},{"key":"e_1_2_1_24_1","unstructured":"Andreas Kipf Thomas Kipf Bernhard Radke Viktor Leis Peter Boncz and Alfons Kemper. 2019. Learned cardinalities: Estimating correlated joins with deep learning. In CIDR. Andreas Kipf Thomas Kipf Bernhard Radke Viktor Leis Peter Boncz and Alfons Kemper. 2019. Learned cardinalities: Estimating correlated joins with deep learning. In CIDR."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3196909"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/3477.764879"},{"key":"e_1_2_1_27_1","volume-title":"Learning to optimize join queries with deep reinforcement learning. arXiv preprint arXiv:1808.03196","author":"Krishnan Sanjay","year":"2018","unstructured":"Sanjay Krishnan , Zongheng Yang , Ken Goldberg , Joseph Hellerstein , and Ion Stoica . 2018. Learning to optimize join queries with deep reinforcement learning. arXiv preprint arXiv:1808.03196 ( 2018 ). Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joseph Hellerstein, and Ion Stoica. 2018. Learning to optimize join queries with deep reinforcement learning. arXiv preprint arXiv:1808.03196 (2018)."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.14778\/2850583.2850594"},{"key":"e_1_2_1_29_1","unstructured":"Viktor Leis Bernhard Radke Andrey Gubichev Alfons Kemper and Thomas Neumann. 2017. Cardinality Estimation Done Right: Index-Based Join Sampling. In CIDR. Viktor Leis Bernhard Radke Andrey Gubichev Alfons Kemper and Thomas Neumann. 2017. Cardinality Estimation Done Right: Index-Based Join Sampling. In CIDR."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-017-0480-7"},{"key":"e_1_2_1_31_1","unstructured":"Eric Liang Zongheng Yang Ion Stoica Pieter Abbeel Yan Duan and Peter Chen. 2020. Variable Skipping for Autoregressive Range Density Estimation. In ICML. 6040--6049. Eric Liang Zongheng Yang Ion Stoica Pieter Abbeel Yan Duan and Peter Chen. 2020. Variable Skipping for Autoregressive Range Density Estimation. In ICML. 6040--6049."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/2886444.2886453"},{"key":"e_1_2_1_33_1","volume-title":"Github repository: scikit-learn. https:\/\/github.com\/scikit-learn\/scikit-learn","author":"Liu Luch","year":"2020","unstructured":"Luch Liu . 2020. Github repository: scikit-learn. https:\/\/github.com\/scikit-learn\/scikit-learn ( 2020 ). Luch Liu. 2020. Github repository: scikit-learn. https:\/\/github.com\/scikit-learn\/scikit-learn (2020)."},{"key":"e_1_2_1_34_1","volume-title":"Sql docs: cardinality estimation (SQL Server). https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/performance\/cardinality-estimation-sql-server?view=sql-server-ver15","author":"Lopes Pedro","year":"2019","unstructured":"Pedro Lopes , Craig Guyer , and Milener Gene . 2019. Sql docs: cardinality estimation (SQL Server). https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/performance\/cardinality-estimation-sql-server?view=sql-server-ver15 ( 2019 ). Pedro Lopes, Craig Guyer, and Milener Gene. 2019. Sql docs: cardinality estimation (SQL Server). https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/performance\/cardinality-estimation-sql-server?view=sql-server-ver15 (2019)."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999611.2999612"},{"key":"e_1_2_1_36_1","volume-title":"Github repository: neurocard project. https:\/\/github.com\/neurocard\/neurocard","author":"Luan Frank","year":"2020","unstructured":"Frank Luan , Amog Kamsetty , Eric Liang , and Zongheng Yang . 2020. Github repository: neurocard project. https:\/\/github.com\/neurocard\/neurocard ( 2020 ). Frank Luan, Amog Kamsetty, Eric Liang, and Zongheng Yang. 2020. Github repository: neurocard project. https:\/\/github.com\/neurocard\/neurocard (2020)."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341302.3342080"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.14778\/3342263.3342644"},{"key":"e_1_2_1_39_1","volume-title":"On the expressive efficiency of sum product networks. arXiv:1411.7717","author":"Martens James","year":"2014","unstructured":"James Martens and Venkatesh Medabalimi . 2014. On the expressive efficiency of sum product networks. arXiv:1411.7717 ( 2014 ). James Martens and Venkatesh Medabalimi. 2014. On the expressive efficiency of sum product networks. arXiv:1411.7717 (2014)."},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1970.10481147"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3318464.3380579"},{"key":"e_1_2_1_42_1","volume-title":"snowmobile, and boat registrations. https:\/\/catalog.data.gov\/dataset\/vehicle-snowmobile-and-boat-registrations","author":"State of New York. 2020. Vehicle","year":"2020","unstructured":"State of New York. 2020. Vehicle , snowmobile, and boat registrations. https:\/\/catalog.data.gov\/dataset\/vehicle-snowmobile-and-boat-registrations ( 2020 ). State of New York. 2020. Vehicle, snowmobile, and boat registrations. https:\/\/catalog.data.gov\/dataset\/vehicle-snowmobile-and-boat-registrations (2020)."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3064013"},{"key":"e_1_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Matthew Perron Zeyuan Shang Tim Kraska and Michael Stonebraker. 2019. How I learned to stop worrying and love re-optimization. In ICDE. 1758--1761. Matthew Perron Zeyuan Shang Tim Kraska and Michael Stonebraker. 2019. How I learned to stop worrying and love re-optimization. In ICDE. 1758--1761.","DOI":"10.1109\/ICDE.2019.00191"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2011.6130310"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.5555\/645923.673638"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.5555\/3009657.3009736"},{"key":"e_1_2_1_48_1","volume-title":"https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/innodb-statistics-estimation.html","author":"Reference Manual SQL","year":"2020","unstructured":"My SQL 8.0 Reference Manual . 2020. Chapter 15.8.10.2 Configuring Non-Persistent Optimizer Statistics Parameters . https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/innodb-statistics-estimation.html ( 2020 ). MySQL 8.0 Reference Manual. 2020. Chapter 15.8.10.2 Configuring Non-Persistent Optimizer Statistics Parameters. https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/innodb-statistics-estimation.html (2020)."},{"key":"e_1_2_1_49_1","volume-title":"Gas sensor array temperature modulation Data Set. https:\/\/archive.ics.uci.edu\/ml\/datasets\/Gas+sensor+array+temperature+modulation","author":"Repository UCI ML","year":"2020","unstructured":"UCI ML Repository . 2020. Gas sensor array temperature modulation Data Set. https:\/\/archive.ics.uci.edu\/ml\/datasets\/Gas+sensor+array+temperature+modulation ( 2020 ). UCI ML Repository. 2020. Gas sensor array temperature modulation Data Set. https:\/\/archive.ics.uci.edu\/ml\/datasets\/Gas+sensor+array+temperature+modulation (2020)."},{"key":"e_1_2_1_50_1","volume-title":"A survey on Bayesian network structure learning from data. Progress in Artificial Intelligence","author":"Scanagatta Mauro","year":"2019","unstructured":"Mauro Scanagatta , Antonio Salmer\u00f3n , and Fabio Stella . 2019. A survey on Bayesian network structure learning from data. Progress in Artificial Intelligence ( 2019 ), 1--15. Mauro Scanagatta, Antonio Salmer\u00f3n, and Fabio Stella. 2019. A survey on Bayesian network structure learning from data. Progress in Artificial Intelligence (2019), 1--15."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/582095.582099"},{"key":"e_1_2_1_52_1","volume-title":"Statistics for optimizing queries: InnoDB persistent statistics. https:\/\/mariadb.com\/kb\/en\/innodb-persistent-statistics\/","author":"Server Documentation DB","year":"2020","unstructured":"Maria DB Server Documentation . 2020. Statistics for optimizing queries: InnoDB persistent statistics. https:\/\/mariadb.com\/kb\/en\/innodb-persistent-statistics\/ ( 2020 ). MariaDB Server Documentation. 2020. Statistics for optimizing queries: InnoDB persistent statistics. https:\/\/mariadb.com\/kb\/en\/innodb-persistent-statistics\/ (2020)."},{"key":"e_1_2_1_53_1","volume-title":"Presto: Sql on everything. In ICDE. 1802--1813.","author":"Sethi Raghav","year":"2019","unstructured":"Raghav Sethi , Martin Traverso , Dain Sundstrom , David Phillips , Wenlei Xie , Yutian Sun , Nezih Yegitbasi , Haozhun Jin , Eric Hwang , Nileema Shingte , 2019 . Presto: Sql on everything. In ICDE. 1802--1813. Raghav Sethi, Martin Traverso, Dain Sundstrom, David Phillips, Wenlei Xie, Yutian Sun, Nezih Yegitbasi, Haozhun Jin, Eric Hwang, Nileema Shingte, et al. 2019. Presto: Sql on everything. In ICDE. 1802--1813."},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2006.84"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.5555\/645927.672349"},{"key":"e_1_2_1_56_1","doi-asserted-by":"crossref","unstructured":"Saravanan Thirumuruganathan Shohedul Hasan Nick Koudas and Gautam Das. 2020. Approximate query processing for data exploration using deep generative models. In ICDE. 1309--1320. Saravanan Thirumuruganathan Shohedul Hasan Nick Koudas and Gautam Das. 2020. Approximate query processing for data exploration using deep generative models. In ICDE. 1309--1320.","DOI":"10.1109\/ICDE48307.2020.00117"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.14778\/3402707.3402724"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3064029"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.5555\/961322.961374"},{"key":"e_1_2_1_60_1","unstructured":"Xiaoying Wang Changbo Qu Weiyuan Wu Jiannan Wang and Qingqing Zhou. 2020. Are we ready for learned cardinality estimation? arXiv:2012.06743 [cs.DB] Xiaoying Wang Changbo Qu Weiyuan Wu Jiannan Wang and Qingqing Zhou. 2020. Are we ready for learned cardinality estimation? arXiv:2012.06743 [cs.DB]"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.14778\/3291264.3291267"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.14778\/3421424.3421432"},{"key":"e_1_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.14778\/3368289.3368294"},{"key":"e_1_2_1_64_1","volume-title":"Github repository: naru project. https:\/\/github.com\/naru-project\/naru","author":"Yang Zongheng","year":"2019","unstructured":"Zongheng Yang and Chenggang Wu. 2019. Github repository: naru project. https:\/\/github.com\/naru-project\/naru ( 2019 ). Zongheng Yang and Chenggang Wu. 2019. Github repository: naru project. https:\/\/github.com\/naru-project\/naru (2019)."},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3183739"},{"key":"e_1_2_1_66_1","volume-title":"FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation [Technical Report]. arXiv preprint arXiv:2011.09022","author":"Zhu Rong","year":"2020","unstructured":"Rong Zhu , Ziniu Wu , Yuxing Han , Kai Zeng , Andreas Pfadler , Zhengping Qian , Jingren Zhou , and Bin Cui . 2020 . FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation [Technical Report]. arXiv preprint arXiv:2011.09022 (2020). Rong Zhu, Ziniu Wu, Yuxing Han, Kai Zeng, Andreas Pfadler, Zhengping Qian, Jingren Zhou, and Bin Cui. 2020. FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation [Technical Report]. arXiv preprint arXiv:2011.09022 (2020)."}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3461535.3461539","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T09:46:01Z","timestamp":1672220761000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3461535.3461539"}},"subtitle":["fast, lightweight and accurate method for cardinality estimation"],"short-title":[],"issued":{"date-parts":[[2021,5]]},"references-count":66,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2021,5]]}},"alternative-id":["10.14778\/3461535.3461539"],"URL":"https:\/\/doi.org\/10.14778\/3461535.3461539","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2021,5]]}}}