Abstract
Many real-world optimization scenarios involve expensive evaluations with unknown and heterogeneous costs. Cost-aware Bayesian optimization is a prominent approach to such problems. To approach the global optimum cost-efficiently within a limited budget, the design of cost-aware acquisition functions (AFs) becomes a crucial step. However, the traditional manual design paradigm typically requires extensive domain knowledge and a labor-intensive trial-and-error process. This paper introduces EvolCAF, a novel framework that integrates large language models (LLMs) with evolutionary computation (EC) to automatically design cost-aware AFs. By applying crossover and mutation in the algorithmic space, EvolCAF offers a new design paradigm that greatly reduces reliance on domain expertise and model training. The designed cost-aware AF makes full use of the available information from historical data, surrogate models, and budget details. It introduces ideas not previously explored in the literature on acquisition function design, and it admits clear interpretations that provide insight into its behavior and decision-making process. Compared with the well-known EIpu and EI-cool methods designed by human experts, our approach shows remarkable efficiency and generalization across various tasks, including 12 synthetic problems and 3 real-world hyperparameter tuning test sets.
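For context, the two expert-designed baselines named above have simple closed forms: EIpu divides the expected improvement (EI) by the predicted evaluation cost, and EI-cool anneals the cost exponent from 1 toward 0 as the budget is spent, so the cost penalty fades late in the run. The sketch below is a minimal illustration, not the paper's implementation; it assumes a minimization setting, a surrogate that supplies a posterior mean and standard deviation at each candidate, and the commonly cited annealing schedule for EI-cool. The function and argument names are ours, not the authors'.

import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_f):
    """Closed-form EI for minimization, given the surrogate's posterior mean/std."""
    sigma = np.maximum(sigma, 1e-12)  # guard against zero predictive std
    z = (best_f - mu) / sigma
    return (best_f - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def ei_per_unit_cost(mu, sigma, best_f, cost):
    """EIpu: expected improvement divided by the (predicted) evaluation cost."""
    return expected_improvement(mu, sigma, best_f) / np.maximum(cost, 1e-12)

def ei_cool(mu, sigma, best_f, cost, budget_used, budget_init, budget_total):
    """EI-cool: the cost exponent alpha is annealed from 1 (start) to 0 (budget
    exhausted), so cost matters less as the remaining budget shrinks.
    The schedule below follows the commonly cited form (Lee et al., 2020)."""
    alpha = (budget_total - budget_used) / max(budget_total - budget_init, 1e-12)
    alpha = float(np.clip(alpha, 0.0, 1.0))
    return expected_improvement(mu, sigma, best_f) / np.power(np.maximum(cost, 1e-12), alpha)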
Acknowledgments
The work described in this paper was supported by the Research Grants Council of the Hong Kong Special Administrative Region, China [GRF Project No. CityU-11215723], the Natural Science Foundation of China (Project No: 62276223), and the Key Basic Research Foundation of Shenzhen, China (JCYJ20220818100005011).
Ethics declarations
Disclosure of Interests
The authors declare that they have no known competing interests that could have appeared to influence the work reported in this paper.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yao, Y., Liu, F., Cheng, J., Zhang, Q. (2024). Evolve Cost-Aware Acquisition Functions Using Large Language Models. In: Affenzeller, M., et al. Parallel Problem Solving from Nature – PPSN XVIII. PPSN 2024. Lecture Notes in Computer Science, vol 15149. Springer, Cham. https://doi.org/10.1007/978-3-031-70068-2_23
DOI: https://doi.org/10.1007/978-3-031-70068-2_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70067-5
Online ISBN: 978-3-031-70068-2
eBook Packages: Computer Science, Computer Science (R0)