arxiv:2212.06663

Quantum Policy Gradient Algorithm with Optimized Action Decoding

Published on Dec 13, 2022

Authors:

Abstract

Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose a specific action decoding procedure for a quantum policy gradient approach. We introduce a novel quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.
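As a rough illustration of what "classical post-processing required for action selection" can mean, the sketch below turns measured bitstring counts from a circuit into action probabilities by assigning each bitstring to one action. Everything in it is an assumption made for illustration rather than the paper's actual procedure: the function names (`local_decoder`, `parity_decoder`, `policy_from_counts`), the example counts, and the choice of a first-qubit rule versus a parity rule as stand-ins for decodings that depend on one qubit versus all qubits.

```python
from typing import Callable, Dict, List


def local_decoder(bitstring: str) -> int:
    """Hypothetical 'local' rule: the action depends on the first qubit only."""
    return int(bitstring[0])


def parity_decoder(bitstring: str) -> int:
    """Hypothetical 'global' rule: the action depends on the parity of all qubits."""
    return bitstring.count("1") % 2


def policy_from_counts(
    counts: Dict[str, int],
    decoder: Callable[[str], int],
    n_actions: int,
) -> List[float]:
    """Sum the empirical probability of every bitstring mapped to each action."""
    total = sum(counts.values())
    probs = [0.0] * n_actions
    for bitstring, count in counts.items():
        probs[decoder(bitstring)] += count / total
    return probs


# Made-up measurement counts for a 2-qubit circuit (100 shots), not real data.
example_counts = {"00": 40, "01": 10, "10": 30, "11": 20}

local_policy = policy_from_counts(example_counts, local_decoder, n_actions=2)
parity_policy = policy_from_counts(example_counts, parity_decoder, n_actions=2)
```

The same counts yield different action distributions under the two rules, which is why the choice of decoding function can matter for training. Both rules cost only a pass over the observed bitstrings, in line with the abstract's remark that the classical overhead is negligible.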
