qgallouedec/ppo-InvertedPendulum-v2-288745441

Reinforcement Learning

stable-baselines3

InvertedPendulum-v2

deep-reinforcement-learning

InvertedPendulum-v4


qgallouedec (HF staff) committed on Apr 17

Commit 55a7b63 (1 parent: d922331)

Upload README.md with huggingface_hub

Files changed (1)

README.md +1 -58


@@ -5,6 +5,7 @@ tags:
 - deep-reinforcement-learning
 - reinforcement-learning
 - stable-baselines3
+- InvertedPendulum-v4
 model-index:
 - name: PPO
   results:
@@ -20,61 +21,3 @@ model-index:
       name: mean_reward
       verified: false
 ---
-# **PPO** Agent playing **InvertedPendulum-v2**
-This is a trained model of a **PPO** agent playing **InvertedPendulum-v2**
-using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3)
-and the [RL Zoo](https://github.com/DLR-RM/rl-baselines3-zoo).
-The RL Zoo is a training framework for Stable Baselines3
-reinforcement learning agents,
-with hyperparameter optimization and pre-trained agents included.
-## Usage (with SB3 RL Zoo)
-RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo<br/>
-SB3: https://github.com/DLR-RM/stable-baselines3<br/>
-SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
-Install the RL Zoo (with SB3 and SB3-Contrib):
-```bash
-pip install rl_zoo3
-```
-```
-# Download model and save it into the logs/ folder
-python -m rl_zoo3.load_from_hub --algo ppo --env InvertedPendulum-v2 -orga qgallouedec -f logs/
-python -m rl_zoo3.enjoy --algo ppo --env InvertedPendulum-v2  -f logs/
-```
-If you installed the RL Zoo3 via pip (`pip install rl_zoo3`), from anywhere you can do:
-```
-python -m rl_zoo3.load_from_hub --algo ppo --env InvertedPendulum-v2 -orga qgallouedec -f logs/
-python -m rl_zoo3.enjoy --algo ppo --env InvertedPendulum-v2  -f logs/
-```
-## Training (with the RL Zoo)
-```
-python -m rl_zoo3.train --algo ppo --env InvertedPendulum-v2 -f logs/
-# Upload the model and generate video (when possible)
-python -m rl_zoo3.push_to_hub --algo ppo --env InvertedPendulum-v2 -f logs/ -orga qgallouedec
-```
-## Hyperparameters
-```python
-OrderedDict([('batch_size', 64),
-             ('clip_range', 0.4),
-             ('ent_coef', 1.37976e-07),
-             ('gae_lambda', 0.9),
-             ('gamma', 0.999),
-             ('learning_rate', 0.000222425),
-             ('max_grad_norm', 0.3),
-             ('n_envs', 1),
-             ('n_epochs', 5),
-             ('n_steps', 32),
-             ('n_timesteps', 1000000.0),
-             ('normalize', True),
-             ('policy', 'MlpPolicy'),
-             ('vf_coef', 0.19816),
-             ('normalize_kwargs', {'norm_obs': True, 'norm_reward': False})])
-```