arxiv:2304.11153

Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single

Published on Apr 21, 2023

Authors:

Abstract

We propose an evolution strategies-based algorithm for estimating gradients in unrolled computation graphs, called ES-Single. Like the recently proposed Persistent Evolution Strategies (PES), ES-Single is unbiased, and overcomes chaos arising from recursive function applications by smoothing the meta-loss landscape. ES-Single samples a single perturbation per particle, which is kept fixed over the course of an inner problem (i.e., perturbations are not re-sampled for each partial unroll). Compared to PES, ES-Single is simpler to implement and has lower variance: the variance of ES-Single is constant with respect to the number of truncated unrolls, removing a key barrier to applying ES to long inner problems using short truncations. We show that ES-Single is unbiased for quadratic inner problems, and demonstrate empirically that its variance can be substantially lower than that of PES. ES-Single consistently outperforms PES on a variety of tasks, including a synthetic benchmark task, hyperparameter optimization, training recurrent neural networks, and training learned optimizers.
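The fixed-perturbation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy inner problem (gradient descent on f(x) = 0.5 x^2 with the learning rate `theta` as the meta-parameter), the use of antithetic pairs, the estimator scaling, and all names (`inner_step`, `inner_loss`, `es_single_gradients`) are assumptions chosen for illustration. The point it shows is that each particle draws its perturbation once and reuses it for every truncated unroll of the same inner problem.

```python
import numpy as np

def inner_step(x, theta):
    # Hypothetical inner problem: one gradient-descent step on f(x) = 0.5 * x^2
    # with learning rate theta (the meta-parameter being tuned).
    return x - theta * x

def inner_loss(x):
    return 0.5 * x ** 2

def es_single_gradients(theta, sigma=0.01, n_pairs=64, T=20, K=5, x0=1.0, seed=0):
    """Return one gradient estimate per truncated unroll of length K.

    Assumption: antithetic pairs (theta + eps, theta - eps) and the standard
    ES scaling 1 / (2 * sigma^2); the paper may use a different form.
    """
    rng = np.random.default_rng(seed)
    # Sampled ONCE per particle and kept fixed for the whole inner problem.
    eps = sigma * rng.standard_normal(n_pairs)
    x_pos = np.full(n_pairs, x0)  # inner states of the +eps particles
    x_neg = np.full(n_pairs, x0)  # inner states of the -eps particles
    grads = []
    for _ in range(T // K):       # loop over truncated unrolls
        loss_pos = np.zeros(n_pairs)
        loss_neg = np.zeros(n_pairs)
        for _ in range(K):        # K inner steps per truncation
            x_pos = inner_step(x_pos, theta + eps)
            x_neg = inner_step(x_neg, theta - eps)
            loss_pos += inner_loss(x_pos)
            loss_neg += inner_loss(x_neg)
        # No re-sampling here: the same eps weights every truncation's losses.
        grads.append(np.mean(eps * (loss_pos - loss_neg)) / (2 * sigma ** 2))
    return np.array(grads)

if __name__ == "__main__":
    print(es_single_gradients(theta=0.1))
```

In this toy setting a larger learning rate in (0, 1) lowers the inner loss, so each per-truncation estimate should come out negative. A PES-style variant would instead draw a fresh perturbation at every truncation and accumulate the perturbations per particle, which is the extra bookkeeping the abstract says ES-Single avoids.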
