Weighted Linear Regression with Optimized Gap for Learned Index

Sun, Hongtao; Zheng, Libin; Yin, Jian

doi:10.1007/978-981-96-0573-6_2

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15439)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

The learned index is a novel index structure that has changed how we approach the traditional field of DBMS indexing. It treats an index as a model and uses a learning-based approach to fit the distribution of the stored data. The model takes a key as input and outputs the predicted location of that key. To achieve higher query throughput, we propose WELGOR. We train the linear regression model taking the priority of the keys into account. To improve the model's mapping ability, we use a hybrid model that adds a simple linear model to index keys better. In addition, we optimize the space allocated to gaps within each node while achieving comparable throughput. Experiments show that WELGOR improves throughput by 23% to 93% compared with state-of-the-art methods.
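To make the idea in the abstract concrete, the following is a minimal sketch of a single-node learned index whose linear model is fitted by weighted least squares. It is not WELGOR itself: the abstract does not say how key priorities are derived or how gaps are allocated, so the `weights` list, the function names, and the bounded binary search used to correct the prediction are all assumptions made for illustration.

```python
from bisect import bisect_left


def fit_weighted_linear(keys, weights):
    """Closed-form weighted least squares: position ~ slope * key + intercept."""
    total = sum(weights)
    mean_k = sum(w * k for w, k in zip(weights, keys)) / total
    mean_p = sum(w * p for p, w in enumerate(weights)) / total
    cov = sum(w * (k - mean_k) * (p - mean_p)
              for p, (w, k) in enumerate(zip(weights, keys)))
    var = sum(w * (k - mean_k) ** 2 for w, k in zip(weights, keys))
    slope = cov / var if var else 0.0
    return slope, mean_p - slope * mean_k


def predict(key, slope, intercept):
    """Model output: the predicted array position of a key."""
    return round(slope * key + intercept)


def max_error(keys, slope, intercept):
    """Largest distance between predicted and true position over stored keys."""
    return max(abs(predict(k, slope, intercept) - p) for p, k in enumerate(keys))


def lookup(keys, key, slope, intercept, err):
    """Predict a position, then binary-search only the window [guess - err, guess + err]."""
    n = len(keys)
    guess = predict(key, slope, intercept)
    lo = min(max(guess - err, 0), n)
    hi = max(min(guess + err + 1, n), lo)
    i = bisect_left(keys, key, lo, hi)
    return i if i < n and keys[i] == key else None


# Sorted, unique keys; the weights are hypothetical priorities (assumption):
# the first three keys are treated as more important than the rest.
keys = [2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
weights = [5, 5, 5, 1, 1, 1, 1, 1, 1, 1]
slope, intercept = fit_weighted_linear(keys, weights)
err = max_error(keys, slope, intercept)
```

Under these assumptions, `lookup(keys, k, slope, intercept, err)` returns the position of every stored key, because the search window is sized from the worst prediction error observed at build time. Keys with larger weights pull the fitted line toward their true positions, which is one plausible reading of training "with priority of the keys".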

Acknowledgment

This work is supported by the Research Foundation of Science and Technology Plan Project of Guangzhou City (2023B01J0001, 2024B01W0004), the National Natural Science Foundation of China (No. 62102463), and the Natural Science Foundation of Guangdong Province of China (No. 2022A1515011135).

Author information

Authors and Affiliations

Sun Yat-sen University, Guangzhou, China
Hongtao Sun, Libin Zheng & Jian Yin

Corresponding author

Correspondence to Libin Zheng.

Editor information

Editors and Affiliations

Qatar University, Doha, Qatar
Mahmoud Barhamgi
Victoria University, Melbourne, VIC, Australia
Hua Wang
Tianjin University, Tianjin, China
Xin Wang

About this paper

Cite this paper

Sun, H., Zheng, L., Yin, J. (2025). Weighted Linear Regression with Optimized Gap for Learned Index. In: Barhamgi, M., Wang, H., Wang, X. (eds) Web Information Systems Engineering – WISE 2024. WISE 2024. Lecture Notes in Computer Science, vol 15439. Springer, Singapore. https://doi.org/10.1007/978-981-96-0573-6_2

DOI: https://doi.org/10.1007/978-981-96-0573-6_2
Published: 27 November 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0572-9
Online ISBN: 978-981-96-0573-6
eBook Packages: Computer Science, Computer Science (R0)
