zjowowen committed on
Commit b2dde2e • 1 Parent(s): f2007bf

Update README.md

Files changed (1)
  1. README.md +12 -12
README.md CHANGED
@@ -34,11 +34,11 @@ If you want to contact us & join us, you can ✉️ to our team : <opendilab@p
  | Algo.\Env. | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [LunarLanderContinuous](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [BipedalWalker](https://di-engine-docs.readthedocs.io/en/latest/13_envs/bipedalwalker.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [SpaceInvaders](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Qbert](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Hopper](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Halfcheetah](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Walker2d](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) |
  | :-------------: | :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
  | [PPO](https://arxiv.org/pdf/1707.06347.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLanderContinuous-v2-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v3-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-PPO) |
- | [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-DQN) | 🔓 | 🔓 | 🔓 | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-DQN) | [✅](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-DQN) | [✅](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-DQN) | 🔓 | 🔓 | 🔓 |
- | [C51](https://arxiv.org/pdf/1707.06887.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-C51) | 🔓 | 🔓 | 🔓 | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-C51) | [✅](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-C51) | [✅](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-C51) | 🔓 | 🔓 | 🔓 |
- | [DDPG](https://arxiv.org/pdf/1509.02971.pdf) | 🔓 | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-DDPG) | [✅](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-DDPG) | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-DDPG) | 🔓 | 🔓 | 🔓 | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v3-DDPG) | [✅](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-DDPG) | [✅](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-DDPG) |
- | [TD3](https://arxiv.org/pdf/1802.09477.pdf) | 🔓 | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-TD3) | [✅](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-TD3) | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-TD3) | 🔓 | 🔓 | 🔓 |[✅](https://huggingface.co/OpenDILabCommunity/Hopper-v3-TD3) | [✅](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-TD3) | [✅](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-TD3) |
- | [SAC](https://arxiv.org/pdf/1801.01290.pdf) | 🔓 | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-SAC) | [✅](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-SAC) | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-SAC) | 🔓 | 🔓 | 🔓 | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v3-SAC) | [✅](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-SAC) | [✅](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-SAC) |
  | [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-IMPALA) | | | | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-IMPALA) | [✅](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-IMPALA) | | | | |
  
  </details>
@@ -49,13 +49,13 @@ If you want to contact us & join us, you can ✉️ to our team : <opendilab@p
  
  | Algo.\Env. | [CartPole](https://di-engine-docs.readthedocs.io/en/latest/13_envs/cartpole.html) | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [LunarLanderContinuous](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Breakout](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [MsPacman](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [TicTacToe]() | []() | []() |
  | :-------------: | :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
- | [AlphaZero](https://www.science.org/doi/10.1126/science.aar6404) | | | | | | | | [✅](https://huggingface.co/OpenDILabCommunity/TicTacToe-play-with-bot-AlphaZero) | | |
- | [Sampled AlphaZero](https://www.science.org/doi/10.1126/science.aar6404) | | | | | | | | | | |
- | [Muzero](https://arxiv.org/abs/1911.08265) | [✅](https://huggingface.co/OpenDILabCommunity/CartPole-v0-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-MuZero) | | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/BreakoutNoFrameskip-v4-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/MsPacmanNoFrameskip-v4-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/TicTacToe-play-with-bot-MuZero) | | |
- | [EfficientZero](https://arxiv.org/abs/2111.00210) | [✅](https://huggingface.co/OpenDILabCommunity/CartPole-v0-EfficientZero) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-EfficientZero) | | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-EfficientZero) | | | [✅](https://huggingface.co/OpenDILabCommunity/MsPacmanNoFrameskip-v4-EfficientZero) | | | |
- | [Gumbel MuZero](https://openreview.net/pdf?id=bERaNdoegnO&) | [✅](https://huggingface.co/OpenDILabCommunity/CartPole-v0-GumbelMuZero) | | | | | | | | | |
- | [Sampled EfficientZero](https://arxiv.org/abs/2104.06303) | [✅](https://huggingface.co/OpenDILabCommunity/CartPole-v0-SampledEfficientZero) | | | | | | | | | |
- | [Stochastic MuZero](https://openreview.net/pdf?id=X6D9bAHhBQ1) | | | | | | | | | | |
  
  </details>
  
 
  | Algo.\Env. | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [LunarLanderContinuous](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [BipedalWalker](https://di-engine-docs.readthedocs.io/en/latest/13_envs/bipedalwalker.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [SpaceInvaders](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Qbert](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Hopper](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Halfcheetah](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Walker2d](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) |
  | :-------------: | :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
  | [PPO](https://arxiv.org/pdf/1707.06347.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLanderContinuous-v2-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v3-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-PPO) | [✅](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-PPO) |
+ | [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-DQN) | 🔒 | 🔒 | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-DQN) | [✅](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-DQN) | [✅](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-DQN) | 🔒 | 🔒 | 🔒 |
+ | [C51](https://arxiv.org/pdf/1707.06887.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-C51) | 🔒 | 🔒 | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-C51) | [✅](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-C51) | [✅](https://huggingface.co/OpenDILabCommunity/QbertNoFrameskip-v4-C51) | 🔒 | 🔒 | 🔒 |
+ | [DDPG](https://arxiv.org/pdf/1509.02971.pdf) | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-DDPG) | [✅](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-DDPG) | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-DDPG) | 🔒 | 🔒 | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v3-DDPG) | [✅](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-DDPG) | [✅](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-DDPG) |
+ | [TD3](https://arxiv.org/pdf/1802.09477.pdf) | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-TD3) | [✅](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-TD3) | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-TD3) | 🔒 | 🔒 | 🔒 |[✅](https://huggingface.co/OpenDILabCommunity/Hopper-v3-TD3) | [✅](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-TD3) | [✅](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-TD3) |
+ | [SAC](https://arxiv.org/pdf/1801.01290.pdf) | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-SAC) | [✅](https://huggingface.co/OpenDILabCommunity/BipedalWalker-v3-SAC) | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-SAC) | 🔒 | 🔒 | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v3-SAC) | [✅](https://huggingface.co/OpenDILabCommunity/HalfCheetah-v3-SAC) | [✅](https://huggingface.co/OpenDILabCommunity/Walker2d-v3-SAC) |
  | [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/Lunarlander-v2-IMPALA) | | | | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-IMPALA) | [✅](https://huggingface.co/OpenDILabCommunity/SpaceInvadersNoFrameskip-v4-IMPALA) | | | | |
  
  </details>
 
  
  | Algo.\Env. | [CartPole](https://di-engine-docs.readthedocs.io/en/latest/13_envs/cartpole.html) | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [LunarLanderContinuous](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Breakout](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [MsPacman](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [TicTacToe]() | []() | []() |
  | :-------------: | :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
+ | [AlphaZero](https://www.science.org/doi/10.1126/science.aar6404) | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/TicTacToe-play-with-bot-AlphaZero) | | |
+ | [Sampled AlphaZero](https://www.science.org/doi/10.1126/science.aar6404) | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | | | |
+ | [MuZero](https://arxiv.org/abs/1911.08265) | [✅](https://huggingface.co/OpenDILabCommunity/CartPole-v0-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-MuZero) | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/PongNoFrameskip-v4-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/BreakoutNoFrameskip-v4-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/MsPacmanNoFrameskip-v4-MuZero) | [✅](https://huggingface.co/OpenDILabCommunity/TicTacToe-play-with-bot-MuZero) | | |
+ | [EfficientZero](https://arxiv.org/abs/2111.00210) | [✅](https://huggingface.co/OpenDILabCommunity/CartPole-v0-EfficientZero) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-EfficientZero) | 🔒 | [✅](https://huggingface.co/OpenDILabCommunity/Pendulum-v1-EfficientZero) | | | [✅](https://huggingface.co/OpenDILabCommunity/MsPacmanNoFrameskip-v4-EfficientZero) | | | |
+ | [Gumbel MuZero](https://openreview.net/pdf?id=bERaNdoegnO&) | [✅](https://huggingface.co/OpenDILabCommunity/CartPole-v0-GumbelMuZero) | | 🔒 | | | 🔒 | 🔒 | | | |
+ | [Sampled EfficientZero](https://arxiv.org/abs/2104.06303) | [✅](https://huggingface.co/OpenDILabCommunity/CartPole-v0-SampledEfficientZero) | | | | | | | 🔒 | | |
+ | [Stochastic MuZero](https://openreview.net/pdf?id=X6D9bAHhBQ1) | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 | 🔒 |
  
  </details>