An Improved Neuro-Symbolic Architecture to Fine-Tune Generative AI Systems

  • Conference paper
  • First Online:
Integration of Constraint Programming, Artificial Intelligence, and Operations Research (CPAIOR 2024)

Abstract

Deep generative models excel at replicating the mechanisms that generate a specific set of sequential data. However, learning the underlying constraints preventing the generation of forbidden sequences poses a challenge. Recently, RL-Tuner, a reinforcement learning framework designed for the ad hoc fine-tuning of a neural model to adhere to given constraints, was enhanced to learn from the output of two constraint programming models. The first model computes a score representing the number of constraint violations from the currently generated token while the second model provides the marginal probability of that token being generated if no additional violation is allowed. In this paper, we significantly enhance the latter framework in three ways. First, we propose a simplified architecture that requires only a single constraint programming model. Second, we evaluate constraint violations in a more accurate and consistent manner. Third, we propose a reward signal based on belief propagation on this new model that further improves performance. Our experiments, conducted on the same learning task of music generation, demonstrate that our approach surpasses the previous framework both in terms of convergence speed during training and in post-training accuracy. Additionally, our approach exhibits superior generalization to longer sequences than those used during training.
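
As a toy illustration of the two quantities the abstract describes (a violation count, and the marginal probability of the next token when no additional violation is allowed), the sketch below computes both by brute-force enumeration. Everything in it is an assumption made for illustration: the "no adjacent repeats" constraint, the function names `violations` and `token_marginals`, and the uniform weighting of completions are hypothetical and are not taken from the paper, which relies on constraint programming models and belief propagation rather than enumeration.

```python
from itertools import product


def violations(seq):
    """Count adjacent repeated tokens (a hypothetical stand-in constraint)."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a == b)


def token_marginals(prefix, length, alphabet):
    """Exact marginal of each candidate next token, taken uniformly over all
    completions of `prefix` to `length` tokens that add no new violation.

    Brute force, exponential in the number of remaining positions; only
    meant to make the definition concrete on tiny instances.
    """
    prefix = list(prefix)
    if len(prefix) >= length:
        raise ValueError("prefix must be shorter than the target length")

    base = violations(prefix)            # violations already committed
    remaining = length - len(prefix) - 1  # positions after the next token
    counts = {tok: 0 for tok in alphabet}

    for tok in alphabet:
        for tail in product(alphabet, repeat=remaining):
            # Keep only completions that introduce no additional violation.
            if violations(prefix + [tok] + list(tail)) == base:
                counts[tok] += 1

    total = sum(counts.values())
    return {tok: (c / total if total else 0.0) for tok, c in counts.items()}


if __name__ == "__main__":
    print(violations([0, 0, 1]))
    print(token_marginals([0], 3, [0, 1, 2]))
```

For the prefix `[0]`, a target length of 3 and the alphabet `[0, 1, 2]`, token `0` receives marginal 0 because choosing it would add a violation, while tokens `1` and `2` each receive 0.5. A reward derived from such marginals would favour tokens that leave many violation-free continuations, which is the intuition behind using marginals as a training signal.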

Notes

  1. https://github.com/ChYinn/RL_Tuner_CPBP

Acknowledgements

Financial support for this research was provided by NSERC Discovery Grant 05705/2023.

Corresponding author

Correspondence to Gilles Pesant.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yin, C., Cappart, Q., Pesant, G. (2024). An Improved Neuro-Symbolic Architecture to Fine-Tune Generative AI Systems. In: Dilkina, B. (eds) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2024. Lecture Notes in Computer Science, vol 14743. Springer, Cham. https://doi.org/10.1007/978-3-031-60599-4_19

  • DOI: https://doi.org/10.1007/978-3-031-60599-4_19

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-60601-4

  • Online ISBN: 978-3-031-60599-4

  • eBook Packages: Computer Science, Computer Science (R0)
