Winner Takes It All: Training Performant RL Populations for Combinatorial Optimization Paper • 2210.03475 • Published Oct 7, 2022
One Step at a Time: Pros and Cons of Multi-Step Meta-Gradient Reinforcement Learning Paper • 2111.00206 • Published Oct 30, 2021
Combinatorial Optimization with Policy Adaptation using Latent Space Search Paper • 2311.13569 • Published Nov 13, 2023
Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function Paper • 2211.10550 • Published Nov 19, 2022
Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX Paper • 2306.09884 • Published Jun 16, 2023