arxiv:2205.15701

Provable General Function Class Representation Learning in Multitask Bandits and MDPs

Published on May 31, 2022

Abstract

While multitask representation learning has become a popular approach in reinforcement learning (RL) to boost the sample efficiency, the theoretical understanding of why and how it works is still limited. Most previous analytical works could only assume that the representation function is already known to the agent or from linear function class, since analyzing general function class representation encounters non-trivial technical obstacles such as generalization guarantee, formulation of confidence bound in abstract function space, etc. However, linear-case analysis heavily relies on the particularity of linear function class, while real-world practice usually adopts general non-linear representation functions like neural networks. This significantly reduces its applicability. In this work, we extend the analysis to general function class representations. Specifically, we consider an agent playing M contextual bandits (or MDPs) concurrently and extracting a shared representation function φ from a specific function class Φ using our proposed Generalized Functional Upper Confidence Bound algorithm (GFUCB). We theoretically validate the benefit of multitask representation learning within general function class for bandits and linear MDP for the first time. Lastly, we conduct experiments to demonstrate the effectiveness of our algorithm with neural net representation.
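To make the setting in the abstract concrete, the following is a minimal simulation sketch of M contextual bandits that share one nonlinear representation. It is not GFUCB and is not taken from the paper. It assumes, for illustration only, that each task's expected reward is linear in the shared representation through a task-specific head vector. The function names, the `tanh` feature map, and all dimensions are hypothetical.

```python
import numpy as np


def make_shared_representation(context_dim, rep_dim, rng):
    """Build a hypothetical nonlinear map phi: R^context_dim -> R^rep_dim.

    This stands in for one member of the function class Phi; the tanh
    form is an illustrative assumption, not the paper's choice.
    """
    W = rng.normal(size=(context_dim, rep_dim)) / np.sqrt(context_dim)

    def phi(x):
        # Works on a single context (d,) or a batch (..., d).
        return np.tanh(x @ W)

    return phi


def expected_rewards(phi, heads, contexts):
    """Mean reward of every arm in every task.

    contexts: (M, K, d) array, K candidate arms per task.
    heads:    (M, k) array, one assumed linear head per task.
    Returns an (M, K) array.
    """
    features = phi(contexts)  # (M, K, k), same phi for all tasks
    return np.einsum("mkr,mr->mk", features, heads)


def simulate_round(phi, heads, contexts, actions, noise_std, rng):
    """Play one arm in each of the M tasks concurrently; return noisy rewards."""
    num_tasks = heads.shape[0]
    rewards = np.empty(num_tasks)
    for i in range(num_tasks):
        feature = phi(contexts[i, actions[i]])
        rewards[i] = feature @ heads[i] + noise_std * rng.normal()
    return rewards


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, K, d, k = 5, 4, 8, 3  # tasks, arms, context dim, representation dim
    phi = make_shared_representation(d, k, rng)
    heads = rng.normal(size=(M, k))
    contexts = rng.normal(size=(M, K, d))
    greedy = expected_rewards(phi, heads, contexts).argmax(axis=1)
    print(simulate_round(phi, heads, contexts, greedy, 0.1, rng))
```

In this sketch only the heads differ across tasks, so every round yields M observations that are all informative about the same φ. That shared structure is what a multitask learner would try to exploit; how GFUCB does so is described in the paper itself.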
