Abstract
Learned index is a novel index structure and changed the way we treat the traditional field of DBMS index. It views index as models and uses a learning-based approach to fit the distribution of stored data. The models input the key and output the predicted location of the target keys. To achieve higher query throughput, we propose WELGOR. We train the linear regression model with priority of the keys. To improve the mapping ability of the model, we use a hybrid model which adds the design of a simple linear model to better indexing keys. Besides, we also optimize the space allocation for gap design in node while achieving comparable throughput. Experiments show that WELGOR achieves 23% to 93% improvement in throughput compared with state-of-art methods.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Amazon sales rank data for print and kindle books.: (2019). https://www.kaggle.com/ucffool/ amazon-sales-rank-data-for-print-and-kindle-books
AWS, A.: Openstreetmap on AWS (2021). https://registry.opendata.aws/osm
benchmark, R.: A repository to test PGMs and RMIs on different platforms using a much simpler benchmark harness than SOSD (2020). https://github.com/RyanMarcus/rmi_pgm
cpp btree: The C++ B-Tree library implemented by Google (2011). https://code.google.com/archive/p/cpp-btree
Chen, Z., Hua, Y., Ding, B., Zuo, P.: Lock-free concurrent level hashing for persistent memory. In: 2020 USENIX Annual Technical Conference (USENIX ATC 20), pp. 799–812 (2020)
Cho, E., Myers, S.A., Leskovec, J.: Friendship and mobility: user movement in location-based social networks. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1082–1090 (2011)
Ding, J., et al.: Alex: an updatable adaptive learned index. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 969–984 (2020)
Ding, J., Nathan, V., Alizadeh, M., Kraska, T.: Tsunami: a learned multi-dimensional index for correlated data and skewed workloads. Proc. VLDB Endowment 14(2), 74–86 (2020)
Ferragina, P., Vinciguerra, G.: The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds. Proc. VLDB Endowment 13(8), 1162–1175 (2020)
Galakatos, A., Markovitch, M., Binnig, C., Fonseca, R., Kraska, T.: Fiting-tree: a data-aware index structure. In: Proceedings of the 2019 International Conference on Management of Data, pp. 1189–1206 (2019)
Ge, J., et al.: SALI: a scalable adaptive learned index framework based on probability models. Proc. ACM Manag. Data 1(4), 1–25 (2023)
Kipf, A., et al.: RadixSpline: a single-pass learned index. In: Proceedings of the Third International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, pp. 1–5 (2020)
Kraska, T., Beutel, A., Chi, E.H., Dean, J., Polyzotis, N.: The case for learned index structures. In: Proceedings of the 2018 International Conference on Management of Data, pp. 489–504 (2018)
Li, P., Hua, Y., Jia, J., Zuo, P.: FINEdex: a fine-grained learned index scheme for scalable and concurrent memory systems. Proc. VLDB Endowment 15(2), 321–334 (2021)
Li, P., Lu, H., Zhu, R., Ding, B., Yang, L., Pan, G.: DILI: A distribution-driven learned index. arXiv preprint arXiv:2304.08817 (2023)
Lu, B., Ding, J., Lo, E., Minhas, U.F., Wang, T.: APEX: a high-performance learned index on persistent memory. Proc. VLDB Endowment 15(3), 597–610 (2021)
Mao, Y., Kohler, E., Morris, R.T.: Cache craftiness for fast multicore key-value storage. In: Proceedings of the 7th ACM European Conference on Computer Systems, pp. 183–196 (2012)
Nathan, V., Ding, J., Alizadeh, M., Kraska, T.: Learning multi-dimensional indexes. In: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, pp. 985–1000 (2020)
Pandey, V., Kipf, A., Neumann, T., Kemper, A.: How good are modern spatial analytics systems? Proc. VLDB Endowment 11(11), 1661–1673 (2018)
S2 Geometry.: (2019). https://s2geometry.io/
Wu, J., Zhang, Y., Chen, S., Chen, Y., Wang, J., Xing, C.: Updatable learned index with precise positions. Proc. VLDB Endow. 14(8), 1276–1288 (2021). https://doi.org/10.14778/3457390.3457393
Wu, S., Cui, Y., Yu, J., Sun, X., Kuo, T., Xue, C.J.: NFL: robust learned index via distribution transformation. Proc. VLDB Endow. 15(10), 2188–2200 (2022). https://doi.org/10.14778/3547305.3547322
Acknowledgment
This work is supported by the Research Foundation of Science and Technology Plan Project of Guangzhou City (2023B01J0001, 2024B01W0004), the National Natural Science Foundation of China 62102463, and the Natural Science Foundation of Guangdong Province of China No. 2022A1515011135.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sun, H., Zheng, L., Yin, J. (2025). Weighted Linear Regression with Optimized Gap for Learned Index. In: Barhamgi, M., Wang, H., Wang, X. (eds) Web Information Systems Engineering – WISE 2024. WISE 2024. Lecture Notes in Computer Science, vol 15439. Springer, Singapore. https://doi.org/10.1007/978-981-96-0573-6_2
Download citation
DOI: https://doi.org/10.1007/978-981-96-0573-6_2
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0572-9
Online ISBN: 978-981-96-0573-6
eBook Packages: Computer ScienceComputer Science (R0)