sugarfreez committed on
Commit df2b142 • 1 Parent(s): 3c336c3

style(nyz): add status emoji and env link

Files changed (1)
  1. README.md +12 -15
README.md CHANGED
@@ -25,22 +25,19 @@ If you want to contact us & join us, you can ✉️ to our team : <opendilab@p
 
 
  # Overview of Model Zoo
-
- <sup>(1): "-" means that this algorithm doesn't support this environment.</sup>
- <sup>(2): "W" means that the corresponding model is in the upload waitinglist.</sup>
-
+ <sup>(1): "🔓" means that this algorithm doesn't support this environment.</sup>
+ <sup>(2): "🎮" means that the corresponding model is in the upload waitinglist.</sup>
  ### Deep Reinforcement Learning
-
- | Algo.\Env. | LunarLander | BipedalWalker | Pendulum | Atari (Pong) | Atari (SpaceInvaders) | Atari (Qbert) | MuJoCo (Hopper) | MuJoCo (Halfcheetah) | MuJoCo (Walker2d) |
- | ------------- | ------------- | ------------------------ | ------------ | -------------- | ------------ | ------------------ | --------- | --------- | --------- |
- | [PPO](https://arxiv.org/pdf/1707.06347.pdf) | [√](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-ppo) | | | | | | | | |
- | [PG](https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf) | | | | | | | | | |
- | [A2C](https://arxiv.org/pdf/1602.01783.pdf) | | | | | | | | | |
- | [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) | | | | | | | | | |
- | [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) | | | | | | | - | - | - |
- | [DDPG](https://arxiv.org/pdf/1509.02971.pdf) | | | | - | - | - | | | |
- | [TD3](https://arxiv.org/pdf/1802.09477.pdf) | | | | - | - | - | | | |
- | [SAC](https://arxiv.org/pdf/1801.01290.pdf) | | | | - | - | - | | | |
+ | Algo.\Env. | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [BipedalWalker](https://di-engine-docs.readthedocs.io/en/latest/13_envs/bipedalwalker.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [SpaceInvaders](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Qbert](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Hopper](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Halfcheetah](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Walker2d](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) |
+ | :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
+ | [PPO](https://arxiv.org/pdf/1707.06347.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-ppo) | | | | | | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-PPO) | | |
+ | [PG](https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf) | 🎮 | | | | | |🎮 | | |
+ | [A2C](https://arxiv.org/pdf/1602.01783.pdf) | 🎮 | | | | | | 🎮 | | |
+ | [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) |🎮 | | | | | | 🎮 | | |
+ | [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) | 🎮 | | | | | | 🔓 | 🔓 | 🔓 |
+ | [DDPG](https://arxiv.org/pdf/1509.02971.pdf) | 🎮 | | | 🔓 | 🔓 | 🔓 | 🎮 | | |
+ | [TD3](https://arxiv.org/pdf/1802.09477.pdf) | 🎮 | | | 🔓 | 🔓 | 🔓 |[✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-TD3) | | |
+ | [SAC](https://arxiv.org/pdf/1801.01290.pdf) |🎮 | | | 🔓 | 🔓 | 🔓 | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-SAC) | | |
 
 
  ### Multi-Agent Reinforcement Learning