Abstract
Deep generative models excel at replicating the mechanisms that generate a specific set of sequential data. However, learning the underlying constraints preventing the generation of forbidden sequences poses a challenge. Recently, RL-Tuner, a reinforcement learning framework designed for the ad hoc fine-tuning of a neural model to adhere to given constraints, was enhanced to learn from the output of two constraint programming models. The first model computes a score representing the number of constraint violations from the currently generated token while the second model provides the marginal probability of that token being generated if no additional violation is allowed. In this paper, we significantly enhance the latter framework in three ways. First, we propose a simplified architecture that requires only a single constraint programming model. Second, we evaluate constraint violations in a more accurate and consistent manner. Third, we propose a reward signal based on belief propagation on this new model that further improves performance. Our experiments, conducted on the same learning task of music generation, demonstrate that our approach surpasses the previous framework both in terms of convergence speed during training and in post-training accuracy. Additionally, our approach exhibits superior generalization to longer sequences than those used during training.
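The two signals attributed above to the earlier framework, a violation count for the current token and a marginal probability under "no additional violation", can be illustrated on a toy problem. The sketch below is only an assumption-laden illustration, not the authors' models: the alphabet, the sequence length, the "no two equal adjacent tokens" constraint, and the function names are all hypothetical, and brute-force enumeration stands in for a constraint programming solver.

```python
from itertools import product

TOKENS = (0, 1, 2)  # hypothetical toy alphabet
LENGTH = 4          # hypothetical total sequence length


def violations(seq):
    """Count adjacent pairs of equal tokens (the toy constraint)."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a == b)


def new_violations(prefix, token):
    """Violations introduced by appending `token` to `prefix`."""
    return violations(prefix + (token,)) - violations(prefix)


def marginals(prefix):
    """Marginal of each next token over completions that add no violation.

    Every completion of the prefix is enumerated; a real system would
    query a solver instead of enumerating (assumption of this sketch).
    """
    counts = {t: 0 for t in TOKENS}
    remaining = LENGTH - len(prefix)
    if remaining <= 0:
        return {t: 0.0 for t in TOKENS}
    base = violations(prefix)
    for completion in product(TOKENS, repeat=remaining):
        if violations(prefix + completion) == base:
            counts[completion[0]] += 1
    total = sum(counts.values())
    return {t: (c / total if total else 0.0) for t, c in counts.items()}
```

For the prefix `(0,)`, appending `0` introduces one violation while appending `1` introduces none, and the marginals place all mass on tokens `1` and `2`. How such quantities are combined into a reward is specific to each framework and is not reproduced here.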
Acknowledgements
Financial support for this research was provided by NSERC Discovery Grant 05705/2023.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yin, C., Cappart, Q., Pesant, G. (2024). An Improved Neuro-Symbolic Architecture to Fine-Tune Generative AI Systems. In: Dilkina, B. (eds) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2024. Lecture Notes in Computer Science, vol 14743. Springer, Cham. https://doi.org/10.1007/978-3-031-60599-4_19
DOI: https://doi.org/10.1007/978-3-031-60599-4_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-60601-4
Online ISBN: 978-3-031-60599-4
eBook Packages: Computer Science, Computer Science (R0)