Nathan Grinsztajn
2020 – today
- 2024
- [c10] Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Bill Wu, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
  Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. EMNLP 2024: 21353-21370
- [c9] Clément Bonnet, Daniel Luo, Donal Byrne, Shikha Surana, Sasha Abramowitz, Paul Duckworth, Vincent Coyette, Laurence Illing Midgley, Elshadai Tegegn, Tristan Kalloniatis, Omayma Mahjoub, Matthew Macfarlane, Andries P. Smit, Nathan Grinsztajn, Raphaël Boige, Cemlyn N. Waters, Mohamed A. Mimouni, Ulrich A. Mbou Sob, Ruan de Kock, Siddarth Singh, Daniel Furelos-Blanco, Victor Le, Arnu Pretorius, Alexandre Laterre:
  Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX. ICLR 2024
- [c8] Andries P. Smit, Nathan Grinsztajn, Paul Duckworth, Thomas D. Barrett, Arnu Pretorius:
  Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs. ICML 2024
- [i14] Félix Chalumeau, Refiloe Shabe, Noah de Nicola, Arnu Pretorius, Thomas D. Barrett, Nathan Grinsztajn:
  Memory-Enhanced Neural Solvers for Efficient Adaptation in Combinatorial Optimization. CoRR abs/2406.16424 (2024)
- [i13] Yannis Flet-Berliac, Nathan Grinsztajn, Florian Strub, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Mohammad Gheshlaghi Azar, Olivier Pietquin, Matthieu Geist:
  Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion. CoRR abs/2406.19185 (2024)
- [i12] Nathan Grinsztajn, Yannis Flet-Berliac, Mohammad Gheshlaghi Azar, Florian Strub, Bill Wu, Eugene Choi, Chris Cremer, Arash Ahmadian, Yash Chandak, Olivier Pietquin, Matthieu Geist:
  Averaging log-likelihoods in direct alignment. CoRR abs/2406.19188 (2024)
- [i11] John Dang, Shivalika Singh, Daniel D'souza, Arash Ahmadian, Alejandro Salamanca, Madeline Smith, Aidan Peppin, Sungjin Hong, Manoj Govindassamy, Terrence Zhao, Sandra Kublik, Meor Amer, Viraat Aryabumi, Jon Ander Campos, Yi Chern Tan, Tom Kocmi, Florian Strub, Nathan Grinsztajn, Yannis Flet-Berliac, Acyr Locatelli, Hangyu Lin, Dwarak Talupuru, Bharat Venkitesh, David Cairuz, Bowen Yang, Tim Chung, Wei-Yin Ko, Sylvie Shang Shi, Amir Shukayev, Sammie Bae, Aleksandra Piktus, Roman Castagné, Felipe Cruz-Salinas, Eddie Kim, Lucas Crawhall-Stein, Adrien Morisot, Sudip Roy, Phil Blunsom, Ivan Zhang, Aidan N. Gomez, Nick Frosst, Marzieh Fadaee, Beyza Ermis, Ahmet Üstün, Sara Hooker:
  Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier. CoRR abs/2412.04261 (2024)
- 2023
- [b1] Nathan Grinsztajn:
  Reinforcement learning for combinatorial optimization: leveraging uncertainty, structure and priors. (Apprentissage par renforcement pour l'optimisation combinatoire : exploiter l'incertitude, les structures et les connaissances a priori). University of Lille, France, 2023
- [c7] Félix Chalumeau, Shikha Surana, Clément Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Tom Barrett:
  Combinatorial Optimization with Policy Adaptation using Latent Space Search. NeurIPS 2023
- [c6] Nathan Grinsztajn, Daniel Furelos-Blanco, Shikha Surana, Clément Bonnet, Tom Barrett:
  Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization. NeurIPS 2023
- [i10] Clément Bonnet, Daniel Luo, Donal Byrne, Shikha Surana, Vincent Coyette, Paul Duckworth, Laurence I. Midgley, Tristan Kalloniatis, Sasha Abramowitz, Cemlyn N. Waters, Andries P. Smit, Nathan Grinsztajn, Ulrich A. Mbou Sob, Omayma Mahjoub, Elshadai Tegegn, Mohamed A. Mimouni, Raphaël Boige, Ruan de Kock, Daniel Furelos-Blanco, Victor Le, Arnu Pretorius, Alexandre Laterre:
  Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX. CoRR abs/2306.09884 (2023)
- [i9] Félix Chalumeau, Shikha Surana, Clément Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Thomas D. Barrett:
  Combinatorial Optimization with Policy Adaptation using Latent Space Search. CoRR abs/2311.13569 (2023)
- [i8] Andries P. Smit, Paul Duckworth, Nathan Grinsztajn, Kale-ab Tessera, Thomas D. Barrett, Arnu Pretorius:
  Are we going MAD? Benchmarking Multi-Agent Debate between Language Models for Medical Q&A. CoRR abs/2311.17371 (2023)
- 2022
- [c5] Manh Hung Nguyen, Lisheng Sun-Hosoya, Nathan Grinsztajn, Isabelle Guyon:
  Meta-learning from Learning Curves: Challenge Design and Baseline Results. IJCNN 2022: 1-8
- [i7] Manh Hung Nguyen, Lisheng Sun, Nathan Grinsztajn, Isabelle Guyon:
  Meta-learning from Learning Curves Challenge: Lessons learned from the First Round and Design of the Second Round. CoRR abs/2208.02821 (2022)
- [i6] Nathan Grinsztajn, Daniel Furelos-Blanco, Thomas D. Barrett:
  Population-Based Reinforcement Learning for Combinatorial Optimization. CoRR abs/2210.03475 (2022)
- 2021
- [c4] Nathan Grinsztajn, Olivier Beaumont, Emmanuel Jeannot, Philippe Preux:
  READYS: A Reinforcement Learning Based Strategy for Heterogeneous Dynamic Scheduling. CLUSTER 2021: 70-81
- [c3] Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
  There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. NeurIPS 2021: 1898-1911
- [c2] Manh Hung Nguyen, Isabelle Guyon, Lisheng Sun-Hosoya, Nathan Grinsztajn:
  MetaREVEAL: RL-based Meta-learning from Learning Curves. IAL@PKDD/ECML 2021: 1-20
- [i5] Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist:
  There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning. CoRR abs/2106.04480 (2021)
- [i4] Nathan Grinsztajn, Louis Leconte, Philippe Preux, Edouard Oyallon:
  Interferometric Graph Transform for Community Labeling. CoRR abs/2106.05875 (2021)
- [i3] Nathan Grinsztajn, Philippe Preux, Edouard Oyallon:
  Low-Rank Projections of GCNs Laplacian. CoRR abs/2106.07360 (2021)
- [i2] Toby Johnstone, Nathan Grinsztajn, Johan Ferret, Philippe Preux:
  More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences. CoRR abs/2110.10632 (2021)
- 2020
- [c1] Nathan Grinsztajn, Olivier Beaumont, Emmanuel Jeannot, Philippe Preux:
  Geometric deep reinforcement learning for dynamic DAG scheduling. SSCI 2020: 258-265
- [i1] Nathan Grinsztajn, Olivier Beaumont, Emmanuel Jeannot, Philippe Preux:
  Geometric Deep Reinforcement Learning for Dynamic DAG Scheduling. CoRR abs/2011.04333 (2020)
last updated on 2025-01-18 02:08 CET by the dblp team
all metadata released as open data under CC0 1.0 license