Negatively Correlated Ensemble RL

Verified environment

Python 3.9.6
JPype 1.3.0
dtw 1.4.0
scipy 1.7.2
torch 1.8.2+cu111
numpy 1.20.3
gym 0.21.0
Pillow 10.0.0
matplotlib 3.6.3
pandas 1.3.2
scikit-learn 1.0.1
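
To compare a local installation against the versions listed above, a small check along the following lines may help. It is only a sketch: the distribution names used as keys (for example JPype1 for JPype and scikit-learn for sklearn) are assumed PyPI names and may need adjusting, and the helper names installed_version and find_mismatches are illustrative, not part of this repository.

```python
from importlib import metadata

# Versions copied from the list above. The keys are ASSUMED PyPI distribution
# names and may differ from what is installed on your machine.
EXPECTED = {
    "JPype1": "1.3.0",
    "dtw": "1.4.0",
    "scipy": "1.7.2",
    "torch": "1.8.2+cu111",
    "numpy": "1.20.3",
    "gym": "0.21.0",
    "Pillow": "10.0.0",
    "matplotlib": "3.6.3",
    "pandas": "1.3.2",
    "scikit-learn": "1.0.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {name: (expected, found)} for every package whose version differs."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean the code will fail; the list only records the environment in which it was verified.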

How to use

All training runs are launched through train.py with an option and its arguments. For example, running python train.py ncesac --lbd 0.3 --m 5 trains NCERL with hyperparameters $\lambda = 0.3, m = 5$. The plotting script is plots.py.

python train.py gan: to train a decoder which maps a continuous action to a game level segment.
python train.py sac: to train a standard SAC as the policy for online game level generation
python train.py asyncsac: to train a SAC with an asynchronous evaluation environment as the policy for online game level generation
python train.py ncesac: to train an NCERL based on SAC as the policy for online game level generation
python train.py egsac: to train an episodic generative SAC (see paper The fun facets of Mario: Multifaceted experience-driven PCG via reinforcement learning) as the policy for online game level generation
python train.py pmoe: to train a PMOE (see paper Probabilistic Mixture-of-Experts for Efficient Deep Reinforcement Learning) as the policy for online game level generation
python train.py sunrise: to train a SUNRISE (see paper SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning) as the policy for online game level generation
python train.py dvd: to train a DvD-SAC (see paper Effective Diversity in Population Based Reinforcement Learning) as the policy for online game level generation

For the training arguments of each option, see its help message: python train.py [option] --help
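
When several NCERL configurations need to be trained, the command shown earlier can be generated programmatically. The sketch below uses only the ncesac option and the --lbd and --m arguments that appear in this README; the helper names build_commands and run_all and the example value grid are illustrative assumptions, not part of the repository.

```python
import itertools
import subprocess
import sys


def build_commands(lambdas, ensemble_sizes, script="train.py"):
    """Build one `train.py ncesac` command per (lambda, m) combination."""
    commands = []
    for lbd, m in itertools.product(lambdas, ensemble_sizes):
        commands.append(
            [sys.executable, script, "ncesac", "--lbd", str(lbd), "--m", str(m)]
        )
    return commands


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, run them one after another."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sweep as soon as one training run fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Example grid only; these values are not recommended settings.
    run_all(build_commands([0.1, 0.3, 0.5], [2, 5]), dry_run=True)
```

With dry_run=True the script only prints the commands, which makes it easy to inspect the sweep before starting any training.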