arxiv:2307.16348

Rating-based Reinforcement Learning

Published on Jul 30, 2023

Authors:

Devin White

Abstract

This paper develops a novel rating-based reinforcement learning approach that uses human ratings as the source of human guidance. Unlike the existing preference-based and ranking-based reinforcement learning paradigms, which rely on relative human preferences over sample pairs, the proposed approach relies on human evaluation of individual trajectories, with no relative comparison between sample pairs. The approach builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies, using both synthetic ratings and real human ratings, to evaluate the effectiveness and benefits of the new approach.
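The contrast between pairwise preferences and per-trajectory ratings can be illustrated with a minimal sketch. This is not the paper's prediction model or its loss function, neither of which is specified in the abstract. As stand-ins, it uses the Bradley-Terry pairwise loss common in preference-based reinforcement learning and a plain softmax cross-entropy over rating classes. The function names, the three-class rating scale, and the example numbers are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def preference_loss(return_a, return_b, a_preferred):
    """Bradley-Terry style loss (stand-in for preference-based RL).

    Needs a PAIR of trajectories and a label saying which one was preferred.
    """
    p_a = 1.0 / (1.0 + math.exp(return_b - return_a))
    return -math.log(p_a if a_preferred else 1.0 - p_a)


def rating_loss(class_logits, rating):
    """Plain multi-class cross-entropy (assumption, NOT the paper's loss).

    Needs only ONE trajectory and the integer rating class a human gave it.
    """
    probs = softmax(class_logits)
    return -math.log(probs[rating])


# Hypothetical data: predicted returns for two trajectories, and predicted
# logits over three rating classes (0 = bad, 1 = neutral, 2 = good).
pair_loss = preference_loss(return_a=1.0, return_b=1.0, a_preferred=True)
single_loss = rating_loss(class_logits=[0.0, 0.0, 0.0], rating=1)
```

The point of the sketch is the shape of the supervision signal: `preference_loss` cannot be evaluated without a second trajectory to compare against, whereas `rating_loss` consumes one trajectory and one human-assigned class at a time.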
