Abstract
Supervised fine-tuning of large language models (LMs) does not always yield good text-generation performance in terms of quality and diversity. This is because such models maximize the likelihood of the correct next word given the contexts encountered during training, rather than evaluating the overall structure of the generated texts. To improve the trade-off between quality and diversity, fine-tuning methods based on adversarial imitation learning (AIL) have been proposed; these methods leverage the evaluations of a discriminator and require no manually designed metrics. However, previously proposed AIL methods cannot control the shape of the reward function, and they constrain LM updates to a fixed range independent of text quality, as in proximal policy optimization. This study proposes AMDAIL, which combines an AIL method with an approximation of mixture distributions and synergizes with LMs for text generation. AMDAIL has two features: (1) it controls the distribution of the bounded reward values by varying the shape of the bounded reward function, and (2) it applies a variable constraint that promotes updates according to the discriminator's confidence, used as a measure of text quality. The proposed method behaves stably during training and improves the trade-off between quality and diversity at inference.
The source code is available at https://github.com/zabu-nishiki/amdail.
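The two mechanisms described in the abstract can be illustrated with a minimal sketch. Below, a discriminator confidence in (0, 1) is mapped through a shaped bounded reward, and the same confidence widens or narrows a PPO-style clip range so that low-quality samples permit larger policy updates. All names, the shaping exponent `alpha`, and the clip-range bounds are hypothetical illustrations of the idea, not the paper's actual formulation; see the linked repository for the authors' implementation.

```python
def bounded_reward(d_conf, alpha=2.0):
    """Map discriminator confidence d_conf in (0, 1) to a bounded
    reward in (0, 1). The exponent alpha controls the shape
    (curvature) of the reward function; alpha=1 is the identity."""
    p = d_conf ** alpha
    q = (1.0 - d_conf) ** alpha
    return p / (p + q)

def clip_range(d_conf, eps_min=0.05, eps_max=0.3):
    """Variable PPO-style constraint: low-confidence (low-quality)
    samples get a wider clip range, promoting larger updates."""
    return eps_min + (eps_max - eps_min) * (1.0 - d_conf)

def clipped_objective(ratio, advantage, d_conf):
    """PPO-style clipped surrogate objective for one sample,
    using the confidence-dependent clip range above."""
    eps = clip_range(d_conf)
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps)
    return min(ratio * advantage, clipped * advantage)
```

With `alpha=2.0`, confidences above 0.5 are pushed toward 1 and those below 0.5 toward 0, sharpening the reward distribution; a high-confidence sample (e.g., `d_conf=1.0`) is constrained to the tight clip range `eps_min`, while an uncertain one is allowed a range up to `eps_max`.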
Ethics declarations
Ethical Statement
In this study, we used pretrained LMs for text generation. Pretrained LMs, e.g., GPTs, may encode personal data in their weight parameters. Thus, LMs should be tested in advance, and personal information should be removed from both datasets and models. In this study, we used public datasets and models to avoid the misuse of personal information, and we confirmed that the models did not generate any texts containing personal information. Although we only intend to improve the quality and diversity of LMs for text generation, this study could nevertheless be misused for deception involving fake documents, news, etc. Hence, we intend to monitor the use of this study so that its negative influences are observed and constrained.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nishikino, K., Kobayashi, K. (2023). Adversarial Imitation Learning with Controllable Rewards for Text Generation. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds.) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol. 14169. Springer, Cham. https://doi.org/10.1007/978-3-031-43412-9_8
DOI: https://doi.org/10.1007/978-3-031-43412-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43411-2
Online ISBN: 978-3-031-43412-9