CoreyMorris / ppo-Pixelcopter-PLE-v0
Tags: Reinforcement Learning, stable-baselines3, Pixelcopter-PLE-v0, deep-reinforcement-learning, Eval Results
Branch: main
ppo-Pixelcopter-PLE-v0 / Pixelcopter-PLE-v0_4
1 contributor, History: 1 commit

Commit 28a0b97 by CoreyMorris, almost 2 years ago: SB3 PPO, vectorized across 16 environments, trained for ~9,000,000 timesteps; mean_reward=163 +/- 103. Training for an additional 50,000,000 timesteps resulted in a worse evaluation reward.
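The commit message reports the evaluation result as mean_reward=163 +/- 103. In stable-baselines3, this summary is conventionally the mean and standard deviation of the total reward across evaluation episodes. A minimal standard-library sketch of how such a summary is computed (the episode reward values below are hypothetical, not from this model's actual evaluation):

```python
import statistics

def summarize_eval(episode_rewards):
    """Summarize evaluation episodes as mean +/- standard deviation,
    the format used in the commit message (mean_reward=163 +/- 103)."""
    mean = statistics.mean(episode_rewards)
    # Population standard deviation over the evaluation episodes.
    std = statistics.pstdev(episode_rewards)
    return mean, std

# Hypothetical per-episode returns from an evaluation run.
rewards = [60, 120, 163, 250, 300]
mean, std = summarize_eval(rewards)
print(f"mean_reward={mean:.0f} +/- {std:.0f}")  # mean_reward=179 +/- 87
```

A large spread relative to the mean (here 103 vs. 163) indicates high episode-to-episode variance, which is common for Pixelcopter-style survival tasks where a single crash ends the episode early.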
Files (all flagged Safe by the Hub's scan; each last modified in commit 28a0b97, almost 2 years ago):

_stable_baselines3_version    8 Bytes
data                          13.3 kB
policy.optimizer.pth          86 kB      LFS   pickle imports: torch._utils._rebuild_tensor_v2, collections.OrderedDict, torch.FloatStorage
policy.pth                    42.4 kB    LFS   pickle imports: torch._utils._rebuild_tensor_v2, collections.OrderedDict, torch.FloatStorage
pytorch_variables.pth         431 Bytes  LFS   no problematic pickle imports detected
system_info.txt               198 Bytes
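The "pickle imports" noted in the listing are the globals a pickle file would import when loaded, which the Hub surfaces because unpickling can execute arbitrary code. A simplified standard-library sketch of how such a scan can be done with `pickletools`, without ever unpickling the file (this is a heuristic for illustration, not the Hub's actual scanner, and it assumes module/name strings immediately precede each STACK_GLOBAL opcode):

```python
import pickletools

def pickle_imports(data: bytes):
    """List the module.name globals a pickle would import on load,
    found by walking opcodes rather than unpickling."""
    found = set()
    last_strings = []  # recently seen string arguments (module/name candidates)
    for op, arg, _pos in pickletools.genops(data):
        if isinstance(arg, str):
            last_strings.append(arg)
        if op.name == "GLOBAL":
            # Protocol <= 3: argument is "module name" in one string.
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif op.name == "STACK_GLOBAL":
            # Protocol >= 4: module and name were pushed as the two
            # most recent string arguments.
            module, name = last_strings[-2], last_strings[-1]
            found.add(f"{module}.{name}")
    return sorted(found)

# A pickle of an OrderedDict references collections.OrderedDict,
# one of the imports detected in policy.pth above.
import pickle
from collections import OrderedDict

print(pickle_imports(pickle.dumps(OrderedDict(a=1))))
```

The imports detected here (`torch._utils._rebuild_tensor_v2`, `collections.OrderedDict`, `torch.FloatStorage`) are the ones PyTorch routinely emits when serializing tensors and state dicts, which is why the Hub still marks these files Safe.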