[2024-09-22 05:59:43,579][04746] Saving configuration to /content/train_dir/default_experiment/config.json...
[2024-09-22 05:59:43,581][04746] Rollout worker 0 uses device cpu
[2024-09-22 05:59:43,582][04746] Rollout worker 1 uses device cpu
[2024-09-22 05:59:43,584][04746] Rollout worker 2 uses device cpu
[2024-09-22 05:59:43,586][04746] Rollout worker 3 uses device cpu
[2024-09-22 05:59:43,587][04746] Rollout worker 4 uses device cpu
[2024-09-22 05:59:43,589][04746] Rollout worker 5 uses device cpu
[2024-09-22 05:59:43,590][04746] Rollout worker 6 uses device cpu
[2024-09-22 05:59:43,592][04746] Rollout worker 7 uses device cpu
[2024-09-22 05:59:43,714][04746] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 05:59:43,716][04746] InferenceWorker_p0-w0: min num requests: 2
[2024-09-22 05:59:43,750][04746] Starting all processes...
[2024-09-22 05:59:43,753][04746] Starting process learner_proc0
[2024-09-22 05:59:44,501][04746] Starting all processes...
[2024-09-22 05:59:44,507][04746] Starting process inference_proc0-0
[2024-09-22 05:59:44,508][04746] Starting process rollout_proc0
[2024-09-22 05:59:44,508][04746] Starting process rollout_proc1
[2024-09-22 05:59:44,509][04746] Starting process rollout_proc2
[2024-09-22 05:59:44,510][04746] Starting process rollout_proc3
[2024-09-22 05:59:44,510][04746] Starting process rollout_proc4
[2024-09-22 05:59:44,511][04746] Starting process rollout_proc5
[2024-09-22 05:59:44,520][04746] Starting process rollout_proc6
[2024-09-22 05:59:44,525][04746] Starting process rollout_proc7
[2024-09-22 05:59:48,487][06918] Worker 7 uses CPU cores [7]
[2024-09-22 05:59:48,681][06893] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 05:59:48,681][06893] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2024-09-22 05:59:48,700][06893] Num visible devices: 1
[2024-09-22 05:59:48,736][06910] Worker 3 uses CPU cores [3]
[2024-09-22 05:59:48,761][06893] Starting seed is not provided
[2024-09-22 05:59:48,761][06893] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 05:59:48,762][06893] Initializing actor-critic model on device cuda:0
[2024-09-22 05:59:48,762][06893] RunningMeanStd input shape: (3, 72, 128)
[2024-09-22 05:59:48,765][06893] RunningMeanStd input shape: (1,)
[2024-09-22 05:59:48,801][06893] ConvEncoder: input_channels=3
[2024-09-22 05:59:48,897][06906] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 05:59:48,898][06906] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2024-09-22 05:59:48,921][06906] Num visible devices: 1
[2024-09-22 05:59:48,976][06909] Worker 2 uses CPU cores [2]
[2024-09-22 05:59:49,076][06908] Worker 1 uses CPU cores [1]
[2024-09-22 05:59:49,172][06893] Conv encoder output size: 512
[2024-09-22 05:59:49,172][06893] Policy head output size: 512
[2024-09-22 05:59:49,177][06913] Worker 4 uses CPU cores [4]
[2024-09-22 05:59:49,196][06907] Worker 0 uses CPU cores [0]
[2024-09-22 05:59:49,230][06912] Worker 6 uses CPU cores [6]
[2024-09-22 05:59:49,240][06893] Created Actor Critic model with architecture:
[2024-09-22 05:59:49,240][06893] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
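The printout above, together with the `RunningMeanStd input shape: (3, 72, 128)` and `Conv encoder output size: 512` lines, pins down the encoder's spatial arithmetic. A minimal stdlib sketch of that arithmetic — the kernel/stride/channel values (8×8/4, 4×4/2, 3×3/2; channels 32/64/128) are assumptions based on Sample Factory's default VizDoom ConvEncoder, not read from this log:

```python
def conv_out(size: int, kernel: int, stride: int) -> int:
    """Output length of a valid (no-padding) convolution along one axis."""
    return (size - kernel) // stride + 1

# Assumed conv_head: three Conv2d+ELU stages, as in the module repr above.
h, w = 72, 128
for kernel, stride in [(8, 4), (4, 2), (3, 2)]:
    h, w = conv_out(h, kernel, stride), conv_out(w, kernel, stride)

flat = 128 * h * w  # flattened conv_head output (128 channels assumed)
print(h, w, flat)   # 3 6 2304

# mlp_layers then maps the flattened features to 512, matching the
# "Conv encoder output size: 512" line.
mlp_out = 512
```

Under these assumptions the flattened conv output is 2304 features, which the single `Linear` in `mlp_layers` projects to the 512-dim embedding consumed by the GRU core.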
|
[2024-09-22 05:59:49,362][06911] Worker 5 uses CPU cores [5]
[2024-09-22 05:59:49,667][06893] Using optimizer <class 'torch.optim.adam.Adam'>
[2024-09-22 05:59:50,419][06893] No checkpoints found
[2024-09-22 05:59:50,419][06893] Did not load from checkpoint, starting from scratch!
[2024-09-22 05:59:50,419][06893] Initialized policy 0 weights for model version 0
[2024-09-22 05:59:50,423][06893] LearnerWorker_p0 finished initialization!
[2024-09-22 05:59:50,424][06893] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 05:59:50,602][06906] RunningMeanStd input shape: (3, 72, 128)
[2024-09-22 05:59:50,603][06906] RunningMeanStd input shape: (1,)
[2024-09-22 05:59:50,616][06906] ConvEncoder: input_channels=3
[2024-09-22 05:59:50,731][06906] Conv encoder output size: 512
[2024-09-22 05:59:50,731][06906] Policy head output size: 512
[2024-09-22 05:59:50,788][04746] Inference worker 0-0 is ready!
[2024-09-22 05:59:50,789][04746] All inference workers are ready! Signal rollout workers to start!
[2024-09-22 05:59:50,846][06911] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 05:59:50,847][06909] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 05:59:50,846][06908] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 05:59:50,848][06910] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 05:59:50,848][06918] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 05:59:50,847][06913] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 05:59:50,848][06907] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 05:59:50,848][06912] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 05:59:51,197][06910] Decorrelating experience for 0 frames...
[2024-09-22 05:59:51,197][06911] Decorrelating experience for 0 frames...
[2024-09-22 05:59:51,197][06909] Decorrelating experience for 0 frames...
[2024-09-22 05:59:51,324][06913] Decorrelating experience for 0 frames...
[2024-09-22 05:59:51,326][06907] Decorrelating experience for 0 frames...
[2024-09-22 05:59:51,457][06910] Decorrelating experience for 32 frames...
[2024-09-22 05:59:51,462][06908] Decorrelating experience for 0 frames...
[2024-09-22 05:59:51,639][06907] Decorrelating experience for 32 frames...
[2024-09-22 05:59:51,755][06909] Decorrelating experience for 32 frames...
[2024-09-22 05:59:51,781][06912] Decorrelating experience for 0 frames...
[2024-09-22 05:59:51,819][06910] Decorrelating experience for 64 frames...
[2024-09-22 05:59:51,842][06913] Decorrelating experience for 32 frames...
[2024-09-22 05:59:51,852][06908] Decorrelating experience for 32 frames...
[2024-09-22 05:59:51,915][06911] Decorrelating experience for 32 frames...
[2024-09-22 05:59:52,106][06918] Decorrelating experience for 0 frames...
[2024-09-22 05:59:52,182][06907] Decorrelating experience for 64 frames...
[2024-09-22 05:59:52,251][06912] Decorrelating experience for 32 frames...
[2024-09-22 05:59:52,259][06908] Decorrelating experience for 64 frames...
[2024-09-22 05:59:52,265][06909] Decorrelating experience for 64 frames...
[2024-09-22 05:59:52,361][06918] Decorrelating experience for 32 frames...
[2024-09-22 05:59:52,477][06913] Decorrelating experience for 64 frames...
[2024-09-22 05:59:52,547][06908] Decorrelating experience for 96 frames...
[2024-09-22 05:59:52,636][06907] Decorrelating experience for 96 frames...
[2024-09-22 05:59:52,736][06909] Decorrelating experience for 96 frames...
[2024-09-22 05:59:52,786][06912] Decorrelating experience for 64 frames...
[2024-09-22 05:59:52,797][06910] Decorrelating experience for 96 frames...
[2024-09-22 05:59:52,998][06918] Decorrelating experience for 64 frames...
[2024-09-22 05:59:53,031][04746] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-09-22 05:59:53,072][06911] Decorrelating experience for 64 frames...
[2024-09-22 05:59:53,097][06912] Decorrelating experience for 96 frames...
[2024-09-22 05:59:53,296][06913] Decorrelating experience for 96 frames...
[2024-09-22 05:59:53,393][06918] Decorrelating experience for 96 frames...
[2024-09-22 05:59:53,421][06911] Decorrelating experience for 96 frames...
[2024-09-22 05:59:55,184][06893] Signal inference workers to stop experience collection...
[2024-09-22 05:59:55,192][06906] InferenceWorker_p0-w0: stopping experience collection
[2024-09-22 05:59:58,031][04746] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 460.4. Samples: 2302. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-09-22 05:59:58,034][04746] Avg episode reward: [(0, '1.889')]
[2024-09-22 05:59:58,912][06893] Signal inference workers to resume experience collection...
[2024-09-22 05:59:58,913][06906] InferenceWorker_p0-w0: resuming experience collection
[2024-09-22 06:00:01,784][06906] Updated weights for policy 0, policy_version 10 (0.0163)
[2024-09-22 06:00:03,033][04746] Fps is (10 sec: 5324.0, 60 sec: 5324.0, 300 sec: 5324.0). Total num frames: 53248. Throughput: 0: 1365.4. Samples: 13656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-09-22 06:00:03,035][04746] Avg episode reward: [(0, '4.332')]
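Each `Fps is (10 sec: …, 60 sec: …, 300 sec: …)` line reports environment-frame throughput over three sliding windows. A hedged sketch of the arithmetic behind one window (the real logger divides by the actual elapsed wall time, which is why the logged value 5324.0 differs slightly from the nominal 10-second figure):

```python
def window_fps(frames_now: int, frames_then: int, seconds: float) -> float:
    """Frames collected over a window, divided by the window length."""
    return (frames_now - frames_then) / seconds

# First non-empty window above: 53248 total frames after roughly 10 s.
print(window_fps(53248, 0, 10.0))  # 5324.8, close to the logged 5324.0
```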
|
[2024-09-22 06:00:03,705][04746] Heartbeat connected on Batcher_0
[2024-09-22 06:00:03,709][04746] Heartbeat connected on LearnerWorker_p0
[2024-09-22 06:00:03,721][04746] Heartbeat connected on InferenceWorker_p0-w0
[2024-09-22 06:00:03,726][04746] Heartbeat connected on RolloutWorker_w0
[2024-09-22 06:00:03,729][04746] Heartbeat connected on RolloutWorker_w1
[2024-09-22 06:00:03,733][04746] Heartbeat connected on RolloutWorker_w2
[2024-09-22 06:00:03,737][04746] Heartbeat connected on RolloutWorker_w3
[2024-09-22 06:00:03,738][04746] Heartbeat connected on RolloutWorker_w4
[2024-09-22 06:00:03,748][04746] Heartbeat connected on RolloutWorker_w6
[2024-09-22 06:00:03,756][04746] Heartbeat connected on RolloutWorker_w5
[2024-09-22 06:00:03,758][04746] Heartbeat connected on RolloutWorker_w7
[2024-09-22 06:00:05,230][06906] Updated weights for policy 0, policy_version 20 (0.0017)
[2024-09-22 06:00:08,031][04746] Fps is (10 sec: 11468.9, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 114688. Throughput: 0: 1535.2. Samples: 23028. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:00:08,034][04746] Avg episode reward: [(0, '4.432')]
[2024-09-22 06:00:08,084][06893] Saving new best policy, reward=4.432!
[2024-09-22 06:00:08,391][06906] Updated weights for policy 0, policy_version 30 (0.0014)
[2024-09-22 06:00:11,361][06906] Updated weights for policy 0, policy_version 40 (0.0015)
[2024-09-22 06:00:13,031][04746] Fps is (10 sec: 13109.1, 60 sec: 9216.0, 300 sec: 9216.0). Total num frames: 184320. Throughput: 0: 2153.9. Samples: 43078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-09-22 06:00:13,033][04746] Avg episode reward: [(0, '4.490')]
[2024-09-22 06:00:13,036][06893] Saving new best policy, reward=4.490!
[2024-09-22 06:00:14,385][06906] Updated weights for policy 0, policy_version 50 (0.0016)
[2024-09-22 06:00:17,617][06906] Updated weights for policy 0, policy_version 60 (0.0014)
[2024-09-22 06:00:18,034][04746] Fps is (10 sec: 13513.4, 60 sec: 9993.3, 300 sec: 9993.3). Total num frames: 249856. Throughput: 0: 2506.8. Samples: 62676. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-09-22 06:00:18,036][04746] Avg episode reward: [(0, '4.243')]
[2024-09-22 06:00:20,589][06906] Updated weights for policy 0, policy_version 70 (0.0014)
[2024-09-22 06:00:23,031][04746] Fps is (10 sec: 13516.8, 60 sec: 10649.6, 300 sec: 10649.6). Total num frames: 319488. Throughput: 0: 2431.7. Samples: 72952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:00:23,034][04746] Avg episode reward: [(0, '4.546')]
[2024-09-22 06:00:23,037][06893] Saving new best policy, reward=4.546!
[2024-09-22 06:00:23,601][06906] Updated weights for policy 0, policy_version 80 (0.0014)
[2024-09-22 06:00:26,497][06906] Updated weights for policy 0, policy_version 90 (0.0015)
[2024-09-22 06:00:28,031][04746] Fps is (10 sec: 13929.9, 60 sec: 11117.7, 300 sec: 11117.7). Total num frames: 389120. Throughput: 0: 2685.5. Samples: 93994. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:00:28,034][04746] Avg episode reward: [(0, '4.462')]
[2024-09-22 06:00:29,428][06906] Updated weights for policy 0, policy_version 100 (0.0016)
[2024-09-22 06:00:32,699][06906] Updated weights for policy 0, policy_version 110 (0.0015)
[2024-09-22 06:00:33,031][04746] Fps is (10 sec: 13516.7, 60 sec: 11366.4, 300 sec: 11366.4). Total num frames: 454656. Throughput: 0: 2842.0. Samples: 113682. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-09-22 06:00:33,033][04746] Avg episode reward: [(0, '4.527')]
[2024-09-22 06:00:35,664][06906] Updated weights for policy 0, policy_version 120 (0.0017)
[2024-09-22 06:00:38,031][04746] Fps is (10 sec: 13107.0, 60 sec: 11559.8, 300 sec: 11559.8). Total num frames: 520192. Throughput: 0: 2755.9. Samples: 124018. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:00:38,033][04746] Avg episode reward: [(0, '4.583')]
[2024-09-22 06:00:38,060][06893] Saving new best policy, reward=4.583!
[2024-09-22 06:00:38,640][06906] Updated weights for policy 0, policy_version 130 (0.0014)
[2024-09-22 06:00:41,589][06906] Updated weights for policy 0, policy_version 140 (0.0016)
[2024-09-22 06:00:43,031][04746] Fps is (10 sec: 13517.0, 60 sec: 11796.5, 300 sec: 11796.5). Total num frames: 589824. Throughput: 0: 3167.9. Samples: 144858. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:00:43,034][04746] Avg episode reward: [(0, '4.505')]
[2024-09-22 06:00:44,648][06906] Updated weights for policy 0, policy_version 150 (0.0015)
[2024-09-22 06:00:47,953][06906] Updated weights for policy 0, policy_version 160 (0.0016)
[2024-09-22 06:00:48,031][04746] Fps is (10 sec: 13516.7, 60 sec: 11915.6, 300 sec: 11915.6). Total num frames: 655360. Throughput: 0: 3341.2. Samples: 164004. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:00:48,034][04746] Avg episode reward: [(0, '4.547')]
[2024-09-22 06:00:50,901][06906] Updated weights for policy 0, policy_version 170 (0.0015)
[2024-09-22 06:00:53,031][04746] Fps is (10 sec: 13516.8, 60 sec: 12083.2, 300 sec: 12083.2). Total num frames: 724992. Throughput: 0: 3367.3. Samples: 174558. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:00:53,034][04746] Avg episode reward: [(0, '4.534')]
[2024-09-22 06:00:53,858][06906] Updated weights for policy 0, policy_version 180 (0.0018)
[2024-09-22 06:00:56,747][06906] Updated weights for policy 0, policy_version 190 (0.0019)
[2024-09-22 06:00:58,031][04746] Fps is (10 sec: 13926.4, 60 sec: 13243.7, 300 sec: 12225.0). Total num frames: 794624. Throughput: 0: 3385.9. Samples: 195444. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:00:58,033][04746] Avg episode reward: [(0, '4.582')]
[2024-09-22 06:00:59,870][06906] Updated weights for policy 0, policy_version 200 (0.0017)
[2024-09-22 06:01:03,031][04746] Fps is (10 sec: 13107.3, 60 sec: 13380.6, 300 sec: 12229.5). Total num frames: 856064. Throughput: 0: 3377.6. Samples: 214660. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-09-22 06:01:03,033][04746] Avg episode reward: [(0, '4.789')]
[2024-09-22 06:01:03,056][06893] Saving new best policy, reward=4.789!
[2024-09-22 06:01:03,063][06906] Updated weights for policy 0, policy_version 210 (0.0017)
[2024-09-22 06:01:06,026][06906] Updated weights for policy 0, policy_version 220 (0.0015)
[2024-09-22 06:01:08,031][04746] Fps is (10 sec: 13107.3, 60 sec: 13516.8, 300 sec: 12342.6). Total num frames: 925696. Throughput: 0: 3381.0. Samples: 225098. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:01:08,034][04746] Avg episode reward: [(0, '4.693')]
[2024-09-22 06:01:09,098][06906] Updated weights for policy 0, policy_version 230 (0.0015)
[2024-09-22 06:01:12,029][06906] Updated weights for policy 0, policy_version 240 (0.0017)
[2024-09-22 06:01:13,031][04746] Fps is (10 sec: 13926.3, 60 sec: 13516.8, 300 sec: 12441.6). Total num frames: 995328. Throughput: 0: 3367.8. Samples: 245544. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2024-09-22 06:01:13,035][04746] Avg episode reward: [(0, '4.842')]
[2024-09-22 06:01:13,039][06893] Saving new best policy, reward=4.842!
[2024-09-22 06:01:15,348][06906] Updated weights for policy 0, policy_version 250 (0.0014)
[2024-09-22 06:01:18,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13449.1, 300 sec: 12432.6). Total num frames: 1056768. Throughput: 0: 3350.1. Samples: 264436. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:01:18,033][04746] Avg episode reward: [(0, '5.062')]
[2024-09-22 06:01:18,043][06893] Saving new best policy, reward=5.062!
[2024-09-22 06:01:18,583][06906] Updated weights for policy 0, policy_version 260 (0.0015)
[2024-09-22 06:01:21,557][06906] Updated weights for policy 0, policy_version 270 (0.0016)
[2024-09-22 06:01:23,031][04746] Fps is (10 sec: 13107.4, 60 sec: 13448.6, 300 sec: 12515.6). Total num frames: 1126400. Throughput: 0: 3346.1. Samples: 274590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:01:23,034][04746] Avg episode reward: [(0, '5.321')]
[2024-09-22 06:01:23,036][06893] Saving new best policy, reward=5.321!
[2024-09-22 06:01:24,489][06906] Updated weights for policy 0, policy_version 280 (0.0018)
[2024-09-22 06:01:27,668][06906] Updated weights for policy 0, policy_version 290 (0.0014)
[2024-09-22 06:01:28,031][04746] Fps is (10 sec: 13516.9, 60 sec: 13380.2, 300 sec: 12546.7). Total num frames: 1191936. Throughput: 0: 3335.5. Samples: 294956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:01:28,033][04746] Avg episode reward: [(0, '5.955')]
[2024-09-22 06:01:28,042][06893] Saving new best policy, reward=5.955!
[2024-09-22 06:01:30,955][06906] Updated weights for policy 0, policy_version 300 (0.0017)
[2024-09-22 06:01:33,031][04746] Fps is (10 sec: 12697.5, 60 sec: 13312.0, 300 sec: 12533.8). Total num frames: 1253376. Throughput: 0: 3336.7. Samples: 314156. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:01:33,033][04746] Avg episode reward: [(0, '6.551')]
[2024-09-22 06:01:33,055][06893] Saving new best policy, reward=6.551!
[2024-09-22 06:01:33,984][06906] Updated weights for policy 0, policy_version 310 (0.0016)
[2024-09-22 06:01:36,885][06906] Updated weights for policy 0, policy_version 320 (0.0014)
[2024-09-22 06:01:38,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13380.3, 300 sec: 12600.1). Total num frames: 1323008. Throughput: 0: 3331.8. Samples: 324488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-09-22 06:01:38,035][04746] Avg episode reward: [(0, '7.361')]
[2024-09-22 06:01:38,093][06893] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000324_1327104.pth...
[2024-09-22 06:01:38,180][06893] Saving new best policy, reward=7.361!
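The checkpoint filename above encodes the policy version and the total environment frames. In this run the two are related by frames = policy_version × 4096, which suggests each training batch advances the frame count by 4096 env steps — an inference from the logged numbers, not something the log states. A small sketch of that naming convention:

```python
BATCH = 4096  # inferred from this log: frames / policy_version == 4096

def checkpoint_name(version: int) -> str:
    """Reproduce the checkpoint filename pattern seen in the log."""
    return f"checkpoint_{version:09d}_{version * BATCH}.pth"

print(checkpoint_name(324))  # checkpoint_000000324_1327104.pth
print(checkpoint_name(710))  # checkpoint_000000710_2908160.pth
```

Both outputs match the two checkpoint paths that appear later in this log, which is what supports the inferred batch size.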
|
[2024-09-22 06:01:39,935][06906] Updated weights for policy 0, policy_version 330 (0.0016)
[2024-09-22 06:01:43,031][04746] Fps is (10 sec: 13516.6, 60 sec: 13312.0, 300 sec: 12623.1). Total num frames: 1388544. Throughput: 0: 3315.4. Samples: 344638. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:01:43,033][04746] Avg episode reward: [(0, '6.732')]
[2024-09-22 06:01:43,143][06906] Updated weights for policy 0, policy_version 340 (0.0019)
[2024-09-22 06:01:46,376][06906] Updated weights for policy 0, policy_version 350 (0.0014)
[2024-09-22 06:01:48,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13312.0, 300 sec: 12644.2). Total num frames: 1454080. Throughput: 0: 3315.5. Samples: 363858. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-09-22 06:01:48,035][04746] Avg episode reward: [(0, '7.363')]
[2024-09-22 06:01:48,043][06893] Saving new best policy, reward=7.363!
[2024-09-22 06:01:49,345][06906] Updated weights for policy 0, policy_version 360 (0.0016)
[2024-09-22 06:01:52,347][06906] Updated weights for policy 0, policy_version 370 (0.0018)
[2024-09-22 06:01:53,031][04746] Fps is (10 sec: 13516.9, 60 sec: 13312.0, 300 sec: 12697.6). Total num frames: 1523712. Throughput: 0: 3316.4. Samples: 374334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:01:53,034][04746] Avg episode reward: [(0, '8.228')]
[2024-09-22 06:01:53,038][06893] Saving new best policy, reward=8.228!
[2024-09-22 06:01:55,338][06906] Updated weights for policy 0, policy_version 380 (0.0015)
[2024-09-22 06:01:58,031][04746] Fps is (10 sec: 13516.7, 60 sec: 13243.7, 300 sec: 12714.0). Total num frames: 1589248. Throughput: 0: 3304.1. Samples: 394228. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-09-22 06:01:58,035][04746] Avg episode reward: [(0, '10.239')]
[2024-09-22 06:01:58,041][06893] Saving new best policy, reward=10.239!
[2024-09-22 06:01:58,675][06906] Updated weights for policy 0, policy_version 390 (0.0017)
[2024-09-22 06:02:01,808][06906] Updated weights for policy 0, policy_version 400 (0.0015)
[2024-09-22 06:02:03,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13312.0, 300 sec: 12729.1). Total num frames: 1654784. Throughput: 0: 3315.7. Samples: 413642. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:03,034][04746] Avg episode reward: [(0, '10.943')]
[2024-09-22 06:02:03,037][06893] Saving new best policy, reward=10.943!
[2024-09-22 06:02:04,782][06906] Updated weights for policy 0, policy_version 410 (0.0014)
[2024-09-22 06:02:07,832][06906] Updated weights for policy 0, policy_version 420 (0.0015)
[2024-09-22 06:02:08,031][04746] Fps is (10 sec: 13107.3, 60 sec: 13243.7, 300 sec: 12743.1). Total num frames: 1720320. Throughput: 0: 3319.8. Samples: 423980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:08,033][04746] Avg episode reward: [(0, '11.171')]
[2024-09-22 06:02:08,041][06893] Saving new best policy, reward=11.171!
[2024-09-22 06:02:10,963][06906] Updated weights for policy 0, policy_version 430 (0.0018)
[2024-09-22 06:02:13,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13175.5, 300 sec: 12756.1). Total num frames: 1785856. Throughput: 0: 3302.7. Samples: 443576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:13,033][04746] Avg episode reward: [(0, '9.843')]
[2024-09-22 06:02:14,230][06906] Updated weights for policy 0, policy_version 440 (0.0015)
[2024-09-22 06:02:17,315][06906] Updated weights for policy 0, policy_version 450 (0.0015)
[2024-09-22 06:02:18,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 12768.2). Total num frames: 1851392. Throughput: 0: 3307.7. Samples: 463004. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-09-22 06:02:18,033][04746] Avg episode reward: [(0, '11.811')]
[2024-09-22 06:02:18,040][06893] Saving new best policy, reward=11.811!
[2024-09-22 06:02:20,387][06906] Updated weights for policy 0, policy_version 460 (0.0019)
[2024-09-22 06:02:23,031][04746] Fps is (10 sec: 13516.7, 60 sec: 13243.7, 300 sec: 12806.8). Total num frames: 1921024. Throughput: 0: 3302.8. Samples: 473116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:23,033][04746] Avg episode reward: [(0, '15.738')]
[2024-09-22 06:02:23,037][06893] Saving new best policy, reward=15.738!
[2024-09-22 06:02:23,336][06906] Updated weights for policy 0, policy_version 470 (0.0016)
[2024-09-22 06:02:26,473][06906] Updated weights for policy 0, policy_version 480 (0.0017)
[2024-09-22 06:02:28,031][04746] Fps is (10 sec: 13107.1, 60 sec: 13175.5, 300 sec: 12790.1). Total num frames: 1982464. Throughput: 0: 3297.9. Samples: 493042. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:28,033][04746] Avg episode reward: [(0, '17.847')]
[2024-09-22 06:02:28,042][06893] Saving new best policy, reward=17.847!
[2024-09-22 06:02:29,836][06906] Updated weights for policy 0, policy_version 490 (0.0016)
[2024-09-22 06:02:32,843][06906] Updated weights for policy 0, policy_version 500 (0.0017)
[2024-09-22 06:02:33,031][04746] Fps is (10 sec: 12697.7, 60 sec: 13243.7, 300 sec: 12800.0). Total num frames: 2048000. Throughput: 0: 3304.2. Samples: 512546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:33,033][04746] Avg episode reward: [(0, '18.660')]
[2024-09-22 06:02:33,036][06893] Saving new best policy, reward=18.660!
[2024-09-22 06:02:35,862][06906] Updated weights for policy 0, policy_version 510 (0.0014)
[2024-09-22 06:02:38,031][04746] Fps is (10 sec: 13516.8, 60 sec: 13243.7, 300 sec: 12834.1). Total num frames: 2117632. Throughput: 0: 3298.8. Samples: 522778. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-09-22 06:02:38,034][04746] Avg episode reward: [(0, '17.260')]
[2024-09-22 06:02:38,822][06906] Updated weights for policy 0, policy_version 520 (0.0015)
[2024-09-22 06:02:42,162][06906] Updated weights for policy 0, policy_version 530 (0.0020)
[2024-09-22 06:02:43,031][04746] Fps is (10 sec: 13107.1, 60 sec: 13175.5, 300 sec: 12818.1). Total num frames: 2179072. Throughput: 0: 3289.2. Samples: 542240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:43,033][04746] Avg episode reward: [(0, '18.519')]
[2024-09-22 06:02:45,374][06906] Updated weights for policy 0, policy_version 540 (0.0014)
[2024-09-22 06:02:48,031][04746] Fps is (10 sec: 13107.3, 60 sec: 13243.7, 300 sec: 12849.7). Total num frames: 2248704. Throughput: 0: 3299.2. Samples: 562106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:48,034][04746] Avg episode reward: [(0, '17.896')]
[2024-09-22 06:02:48,298][06906] Updated weights for policy 0, policy_version 550 (0.0015)
[2024-09-22 06:02:51,351][06906] Updated weights for policy 0, policy_version 560 (0.0017)
[2024-09-22 06:02:53,031][04746] Fps is (10 sec: 13516.9, 60 sec: 13175.5, 300 sec: 12856.9). Total num frames: 2314240. Throughput: 0: 3293.2. Samples: 572176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-09-22 06:02:53,033][04746] Avg episode reward: [(0, '19.252')]
[2024-09-22 06:02:53,037][06893] Saving new best policy, reward=19.252!
[2024-09-22 06:02:54,461][06906] Updated weights for policy 0, policy_version 570 (0.0020)
[2024-09-22 06:02:57,854][06906] Updated weights for policy 0, policy_version 580 (0.0016)
[2024-09-22 06:02:58,031][04746] Fps is (10 sec: 12697.6, 60 sec: 13107.2, 300 sec: 12841.5). Total num frames: 2375680. Throughput: 0: 3285.8. Samples: 591436. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:02:58,033][04746] Avg episode reward: [(0, '19.334')]
[2024-09-22 06:02:58,043][06893] Saving new best policy, reward=19.334!
[2024-09-22 06:03:00,813][06906] Updated weights for policy 0, policy_version 590 (0.0015)
[2024-09-22 06:03:03,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13175.5, 300 sec: 12870.1). Total num frames: 2445312. Throughput: 0: 3301.7. Samples: 611580. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:03:03,033][04746] Avg episode reward: [(0, '19.455')]
[2024-09-22 06:03:03,035][06893] Saving new best policy, reward=19.455!
[2024-09-22 06:03:03,857][06906] Updated weights for policy 0, policy_version 600 (0.0014)
[2024-09-22 06:03:06,865][06906] Updated weights for policy 0, policy_version 610 (0.0014)
[2024-09-22 06:03:08,031][04746] Fps is (10 sec: 13516.7, 60 sec: 13175.5, 300 sec: 12876.1). Total num frames: 2510848. Throughput: 0: 3301.4. Samples: 621680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:03:08,034][04746] Avg episode reward: [(0, '21.272')]
[2024-09-22 06:03:08,042][06893] Saving new best policy, reward=21.272!
[2024-09-22 06:03:10,156][06906] Updated weights for policy 0, policy_version 620 (0.0016)
[2024-09-22 06:03:13,031][04746] Fps is (10 sec: 12697.5, 60 sec: 13107.2, 300 sec: 12861.4). Total num frames: 2572288. Throughput: 0: 3277.0. Samples: 640508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-09-22 06:03:13,033][04746] Avg episode reward: [(0, '21.166')]
[2024-09-22 06:03:13,485][06906] Updated weights for policy 0, policy_version 630 (0.0015)
[2024-09-22 06:03:16,486][06906] Updated weights for policy 0, policy_version 640 (0.0015)
[2024-09-22 06:03:18,031][04746] Fps is (10 sec: 13107.1, 60 sec: 13175.4, 300 sec: 12887.4). Total num frames: 2641920. Throughput: 0: 3288.1. Samples: 660510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:03:18,034][04746] Avg episode reward: [(0, '20.795')]
[2024-09-22 06:03:19,482][06906] Updated weights for policy 0, policy_version 650 (0.0017)
[2024-09-22 06:03:22,441][06906] Updated weights for policy 0, policy_version 660 (0.0017)
[2024-09-22 06:03:23,031][04746] Fps is (10 sec: 13516.8, 60 sec: 13107.2, 300 sec: 12892.6). Total num frames: 2707456. Throughput: 0: 3287.4. Samples: 670712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:03:23,034][04746] Avg episode reward: [(0, '19.731')]
[2024-09-22 06:03:25,766][06906] Updated weights for policy 0, policy_version 670 (0.0017)
[2024-09-22 06:03:28,031][04746] Fps is (10 sec: 13107.3, 60 sec: 13175.5, 300 sec: 12897.6). Total num frames: 2772992. Throughput: 0: 3283.5. Samples: 689998. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:03:28,033][04746] Avg episode reward: [(0, '21.732')]
[2024-09-22 06:03:28,043][06893] Saving new best policy, reward=21.732!
[2024-09-22 06:03:28,876][06906] Updated weights for policy 0, policy_version 680 (0.0020)
[2024-09-22 06:03:31,786][06906] Updated weights for policy 0, policy_version 690 (0.0016)
[2024-09-22 06:03:33,031][04746] Fps is (10 sec: 13516.8, 60 sec: 13243.7, 300 sec: 12921.0). Total num frames: 2842624. Throughput: 0: 3303.7. Samples: 710774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-09-22 06:03:33,033][04746] Avg episode reward: [(0, '21.135')]
[2024-09-22 06:03:34,764][06906] Updated weights for policy 0, policy_version 700 (0.0015)
[2024-09-22 06:03:37,840][06906] Updated weights for policy 0, policy_version 710 (0.0014)
[2024-09-22 06:03:38,031][04746] Fps is (10 sec: 13516.6, 60 sec: 13175.4, 300 sec: 12925.1). Total num frames: 2908160. Throughput: 0: 3307.5. Samples: 721014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-09-22 06:03:38,034][04746] Avg episode reward: [(0, '19.791')]
[2024-09-22 06:03:38,044][06893] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000710_2908160.pth...
[2024-09-22 06:03:41,191][06906] Updated weights for policy 0, policy_version 720 (0.0017)
[2024-09-22 06:03:43,031][04746] Fps is (10 sec: 13106.9, 60 sec: 13243.7, 300 sec: 12929.1). Total num frames: 2973696. Throughput: 0: 3304.0. Samples: 740116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-09-22 06:03:43,034][04746] Avg episode reward: [(0, '21.972')]
[2024-09-22 06:03:43,037][06893] Saving new best policy, reward=21.972!
[2024-09-22 06:03:44,083][06906] Updated weights for policy 0, policy_version 730 (0.0015)
[2024-09-22 06:03:46,998][06906] Updated weights for policy 0, policy_version 740 (0.0014)
[2024-09-22 06:03:48,031][04746] Fps is (10 sec: 13517.1, 60 sec: 13243.7, 300 sec: 12950.3). Total num frames: 3043328. Throughput: 0: 3329.4. Samples: 761404. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-09-22 06:03:48,033][04746] Avg episode reward: [(0, '26.497')]
[2024-09-22 06:03:48,044][06893] Saving new best policy, reward=26.497!
[2024-09-22 06:03:49,881][06906] Updated weights for policy 0, policy_version 750 (0.0015)
[2024-09-22 06:03:52,913][06906] Updated weights for policy 0, policy_version 760 (0.0019)
[2024-09-22 06:03:53,031][04746] Fps is (10 sec: 13926.7, 60 sec: 13312.0, 300 sec: 12970.7). Total num frames: 3112960. Throughput: 0: 3336.3. Samples: 771812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-09-22 06:03:53,033][04746] Avg episode reward: [(0, '27.469')]
[2024-09-22 06:03:53,035][06893] Saving new best policy, reward=27.469!
[2024-09-22 06:03:56,124][06906] Updated weights for policy 0, policy_version 770 (0.0021)
[2024-09-22 06:03:58,031][04746] Fps is (10 sec: 13516.8, 60 sec: 13380.3, 300 sec: 12973.5). Total num frames: 3178496. Throughput: 0: 3348.5. Samples: 791188. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-09-22 06:03:58,034][04746] Avg episode reward: [(0, '24.919')]
[2024-09-22 06:03:59,087][06906] Updated weights for policy 0, policy_version 780 (0.0015)
|
[2024-09-22 06:04:02,101][06906] Updated weights for policy 0, policy_version 790 (0.0017) |
|
[2024-09-22 06:04:03,031][04746] Fps is (10 sec: 13516.8, 60 sec: 13380.3, 300 sec: 12992.5). Total num frames: 3248128. Throughput: 0: 3368.8. Samples: 812106. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:04:03,034][04746] Avg episode reward: [(0, '24.168')] |
|
[2024-09-22 06:04:05,093][06906] Updated weights for policy 0, policy_version 800 (0.0017) |
|
[2024-09-22 06:04:08,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13312.0, 300 sec: 12978.7). Total num frames: 3309568. Throughput: 0: 3366.3. Samples: 822196. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:04:08,033][04746] Avg episode reward: [(0, '23.725')] |
|
[2024-09-22 06:04:08,410][06906] Updated weights for policy 0, policy_version 810 (0.0014) |
|
[2024-09-22 06:04:11,477][06906] Updated weights for policy 0, policy_version 820 (0.0015) |
|
[2024-09-22 06:04:13,031][04746] Fps is (10 sec: 13107.2, 60 sec: 13448.5, 300 sec: 12996.9). Total num frames: 3379200. Throughput: 0: 3366.3. Samples: 841482. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-09-22 06:04:13,034][04746] Avg episode reward: [(0, '23.472')] |
|
[2024-09-22 06:04:14,400][06906] Updated weights for policy 0, policy_version 830 (0.0014) |
|
[2024-09-22 06:04:17,234][06906] Updated weights for policy 0, policy_version 840 (0.0015) |
|
[2024-09-22 06:04:18,031][04746] Fps is (10 sec: 13926.3, 60 sec: 13448.5, 300 sec: 13014.5). Total num frames: 3448832. Throughput: 0: 3378.0. Samples: 862784. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:04:18,033][04746] Avg episode reward: [(0, '21.957')] |
|
[2024-09-22 06:04:20,179][06906] Updated weights for policy 0, policy_version 850 (0.0016) |
|
[2024-09-22 06:04:23,031][04746] Fps is (10 sec: 13516.9, 60 sec: 13448.5, 300 sec: 13016.2). Total num frames: 3514368. Throughput: 0: 3377.9. Samples: 873020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:04:23,034][04746] Avg episode reward: [(0, '22.854')] |
|
[2024-09-22 06:04:23,411][06906] Updated weights for policy 0, policy_version 860 (0.0016) |
|
[2024-09-22 06:04:26,393][06906] Updated weights for policy 0, policy_version 870 (0.0016) |
|
[2024-09-22 06:04:28,031][04746] Fps is (10 sec: 13516.5, 60 sec: 13516.7, 300 sec: 13032.7). Total num frames: 3584000. Throughput: 0: 3400.0. Samples: 893116. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:04:28,034][04746] Avg episode reward: [(0, '24.346')] |
|
[2024-09-22 06:04:29,222][06906] Updated weights for policy 0, policy_version 880 (0.0015) |
|
[2024-09-22 06:04:32,062][06906] Updated weights for policy 0, policy_version 890 (0.0016) |
|
[2024-09-22 06:04:33,031][04746] Fps is (10 sec: 14336.0, 60 sec: 13585.1, 300 sec: 13063.3). Total num frames: 3657728. Throughput: 0: 3406.0. Samples: 914676. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-09-22 06:04:33,034][04746] Avg episode reward: [(0, '25.572')] |
|
[2024-09-22 06:04:34,926][06906] Updated weights for policy 0, policy_version 900 (0.0015) |
|
[2024-09-22 06:04:38,031][04746] Fps is (10 sec: 13926.8, 60 sec: 13585.1, 300 sec: 13064.1). Total num frames: 3723264. Throughput: 0: 3406.2. Samples: 925090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:04:38,033][04746] Avg episode reward: [(0, '25.821')] |
|
[2024-09-22 06:04:38,157][06906] Updated weights for policy 0, policy_version 910 (0.0017) |
|
[2024-09-22 06:04:41,098][06906] Updated weights for policy 0, policy_version 920 (0.0017) |
|
[2024-09-22 06:04:43,031][04746] Fps is (10 sec: 13516.8, 60 sec: 13653.4, 300 sec: 13079.0). Total num frames: 3792896. Throughput: 0: 3424.8. Samples: 945304. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-09-22 06:04:43,033][04746] Avg episode reward: [(0, '25.851')] |
|
[2024-09-22 06:04:43,917][06906] Updated weights for policy 0, policy_version 930 (0.0015) |
|
[2024-09-22 06:04:46,694][06906] Updated weights for policy 0, policy_version 940 (0.0014) |
|
[2024-09-22 06:04:48,031][04746] Fps is (10 sec: 14335.9, 60 sec: 13721.6, 300 sec: 13107.2). Total num frames: 3866624. Throughput: 0: 3445.4. Samples: 967150. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:04:48,035][04746] Avg episode reward: [(0, '27.156')] |
|
[2024-09-22 06:04:49,569][06906] Updated weights for policy 0, policy_version 950 (0.0014) |
|
[2024-09-22 06:04:52,744][06906] Updated weights for policy 0, policy_version 960 (0.0016) |
|
[2024-09-22 06:04:53,031][04746] Fps is (10 sec: 13926.3, 60 sec: 13653.3, 300 sec: 13329.4). Total num frames: 3932160. Throughput: 0: 3449.2. Samples: 977410. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:04:53,034][04746] Avg episode reward: [(0, '28.118')] |
|
[2024-09-22 06:04:53,073][06893] Saving new best policy, reward=28.118! |
|
[2024-09-22 06:04:55,653][06906] Updated weights for policy 0, policy_version 970 (0.0015) |
|
[2024-09-22 06:04:58,031][04746] Fps is (10 sec: 13926.5, 60 sec: 13789.8, 300 sec: 13398.8). Total num frames: 4005888. Throughput: 0: 3480.9. Samples: 998122. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:04:58,033][04746] Avg episode reward: [(0, '26.885')] |
|
[2024-09-22 06:04:58,467][06906] Updated weights for policy 0, policy_version 980 (0.0014) |
|
[2024-09-22 06:05:01,257][06906] Updated weights for policy 0, policy_version 990 (0.0014) |
|
[2024-09-22 06:05:03,031][04746] Fps is (10 sec: 14745.6, 60 sec: 13858.1, 300 sec: 13440.4). Total num frames: 4079616. Throughput: 0: 3496.9. Samples: 1020144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:05:03,034][04746] Avg episode reward: [(0, '26.385')] |
|
[2024-09-22 06:05:04,146][06906] Updated weights for policy 0, policy_version 1000 (0.0014) |
|
[2024-09-22 06:05:07,210][06906] Updated weights for policy 0, policy_version 1010 (0.0014) |
|
[2024-09-22 06:05:08,031][04746] Fps is (10 sec: 13926.4, 60 sec: 13926.4, 300 sec: 13426.5). Total num frames: 4145152. Throughput: 0: 3498.6. Samples: 1030456. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:05:08,033][04746] Avg episode reward: [(0, '23.512')] |
|
[2024-09-22 06:05:10,028][06906] Updated weights for policy 0, policy_version 1020 (0.0014) |
|
[2024-09-22 06:05:12,812][06906] Updated weights for policy 0, policy_version 1030 (0.0016) |
|
[2024-09-22 06:05:13,031][04746] Fps is (10 sec: 13926.5, 60 sec: 13994.7, 300 sec: 13454.4). Total num frames: 4218880. Throughput: 0: 3524.5. Samples: 1051718. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) |
|
[2024-09-22 06:05:13,034][04746] Avg episode reward: [(0, '25.543')] |
|
[2024-09-22 06:05:15,610][06906] Updated weights for policy 0, policy_version 1040 (0.0018) |
|
[2024-09-22 06:05:18,031][04746] Fps is (10 sec: 14745.3, 60 sec: 14062.9, 300 sec: 13468.2). Total num frames: 4292608. Throughput: 0: 3532.4. Samples: 1073634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:05:18,033][04746] Avg episode reward: [(0, '26.777')] |
|
[2024-09-22 06:05:18,500][06906] Updated weights for policy 0, policy_version 1050 (0.0016) |
|
[2024-09-22 06:05:21,655][06906] Updated weights for policy 0, policy_version 1060 (0.0021) |
|
[2024-09-22 06:05:23,031][04746] Fps is (10 sec: 13926.3, 60 sec: 14062.9, 300 sec: 13454.3). Total num frames: 4358144. Throughput: 0: 3519.3. Samples: 1083460. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:05:23,034][04746] Avg episode reward: [(0, '25.521')] |
|
[2024-09-22 06:05:24,450][06906] Updated weights for policy 0, policy_version 1070 (0.0014) |
|
[2024-09-22 06:05:27,263][06906] Updated weights for policy 0, policy_version 1080 (0.0014) |
|
[2024-09-22 06:05:28,031][04746] Fps is (10 sec: 13926.6, 60 sec: 14131.2, 300 sec: 13482.1). Total num frames: 4431872. Throughput: 0: 3545.8. Samples: 1104864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:05:28,033][04746] Avg episode reward: [(0, '25.581')] |
|
[2024-09-22 06:05:30,185][06906] Updated weights for policy 0, policy_version 1090 (0.0014) |
|
[2024-09-22 06:05:33,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14062.9, 300 sec: 13496.0). Total num frames: 4501504. Throughput: 0: 3531.2. Samples: 1126052. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2024-09-22 06:05:33,034][04746] Avg episode reward: [(0, '24.798')] |
|
[2024-09-22 06:05:33,088][06906] Updated weights for policy 0, policy_version 1100 (0.0019) |
|
[2024-09-22 06:05:36,247][06906] Updated weights for policy 0, policy_version 1110 (0.0015) |
|
[2024-09-22 06:05:38,031][04746] Fps is (10 sec: 13926.5, 60 sec: 14131.2, 300 sec: 13496.0). Total num frames: 4571136. Throughput: 0: 3521.8. Samples: 1135890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:05:38,034][04746] Avg episode reward: [(0, '22.580')] |
|
[2024-09-22 06:05:38,043][06893] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001116_4571136.pth... |
|
[2024-09-22 06:05:38,124][06893] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000324_1327104.pth |
|
[2024-09-22 06:05:39,129][06906] Updated weights for policy 0, policy_version 1120 (0.0015) |
|
[2024-09-22 06:05:41,966][06906] Updated weights for policy 0, policy_version 1130 (0.0016) |
|
[2024-09-22 06:05:43,031][04746] Fps is (10 sec: 13926.3, 60 sec: 14131.2, 300 sec: 13509.9). Total num frames: 4640768. Throughput: 0: 3534.8. Samples: 1157188. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:05:43,035][04746] Avg episode reward: [(0, '25.251')] |
|
[2024-09-22 06:05:44,780][06906] Updated weights for policy 0, policy_version 1140 (0.0016) |
|
[2024-09-22 06:05:47,685][06906] Updated weights for policy 0, policy_version 1150 (0.0016) |
|
[2024-09-22 06:05:48,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14131.2, 300 sec: 13523.7). Total num frames: 4714496. Throughput: 0: 3521.1. Samples: 1178592. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:05:48,034][04746] Avg episode reward: [(0, '25.028')] |
|
[2024-09-22 06:05:50,830][06906] Updated weights for policy 0, policy_version 1160 (0.0018) |
|
[2024-09-22 06:05:53,031][04746] Fps is (10 sec: 13926.4, 60 sec: 14131.2, 300 sec: 13509.9). Total num frames: 4780032. Throughput: 0: 3511.0. Samples: 1188452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:05:53,033][04746] Avg episode reward: [(0, '25.848')] |
|
[2024-09-22 06:05:53,683][06906] Updated weights for policy 0, policy_version 1170 (0.0016) |
|
[2024-09-22 06:05:56,521][06906] Updated weights for policy 0, policy_version 1180 (0.0015) |
|
[2024-09-22 06:05:58,031][04746] Fps is (10 sec: 13926.3, 60 sec: 14131.2, 300 sec: 13551.5). Total num frames: 4853760. Throughput: 0: 3520.9. Samples: 1210160. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:05:58,033][04746] Avg episode reward: [(0, '27.013')] |
|
[2024-09-22 06:05:59,297][06906] Updated weights for policy 0, policy_version 1190 (0.0014) |
|
[2024-09-22 06:06:02,256][06906] Updated weights for policy 0, policy_version 1200 (0.0014) |
|
[2024-09-22 06:06:03,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14062.9, 300 sec: 13551.5). Total num frames: 4923392. Throughput: 0: 3504.4. Samples: 1231332. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:06:03,034][04746] Avg episode reward: [(0, '26.650')] |
|
[2024-09-22 06:06:05,409][06906] Updated weights for policy 0, policy_version 1210 (0.0015) |
|
[2024-09-22 06:06:08,031][04746] Fps is (10 sec: 13926.3, 60 sec: 14131.2, 300 sec: 13551.5). Total num frames: 4993024. Throughput: 0: 3504.7. Samples: 1241172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-09-22 06:06:08,034][04746] Avg episode reward: [(0, '28.227')] |
|
[2024-09-22 06:06:08,044][06893] Saving new best policy, reward=28.227! |
|
[2024-09-22 06:06:08,226][06906] Updated weights for policy 0, policy_version 1220 (0.0015) |
|
[2024-09-22 06:06:11,012][06906] Updated weights for policy 0, policy_version 1230 (0.0017) |
|
[2024-09-22 06:06:13,031][04746] Fps is (10 sec: 14336.1, 60 sec: 14131.2, 300 sec: 13593.2). Total num frames: 5066752. Throughput: 0: 3515.6. Samples: 1263066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:06:13,033][04746] Avg episode reward: [(0, '27.215')] |
|
[2024-09-22 06:06:13,844][06906] Updated weights for policy 0, policy_version 1240 (0.0016) |
|
[2024-09-22 06:06:16,820][06906] Updated weights for policy 0, policy_version 1250 (0.0014) |
|
[2024-09-22 06:06:18,031][04746] Fps is (10 sec: 13926.6, 60 sec: 13994.7, 300 sec: 13579.3). Total num frames: 5132288. Throughput: 0: 3506.6. Samples: 1283850. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:06:18,034][04746] Avg episode reward: [(0, '26.215')] |
|
[2024-09-22 06:06:19,919][06906] Updated weights for policy 0, policy_version 1260 (0.0016) |
|
[2024-09-22 06:06:22,675][06906] Updated weights for policy 0, policy_version 1270 (0.0015) |
|
[2024-09-22 06:06:23,031][04746] Fps is (10 sec: 13926.4, 60 sec: 14131.2, 300 sec: 13607.1). Total num frames: 5206016. Throughput: 0: 3515.7. Samples: 1294098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-09-22 06:06:23,033][04746] Avg episode reward: [(0, '27.755')] |
|
[2024-09-22 06:06:25,477][06906] Updated weights for policy 0, policy_version 1280 (0.0017) |
|
[2024-09-22 06:06:28,031][04746] Fps is (10 sec: 14745.7, 60 sec: 14131.2, 300 sec: 13648.7). Total num frames: 5279744. Throughput: 0: 3536.1. Samples: 1316314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:06:28,033][04746] Avg episode reward: [(0, '28.448')] |
|
[2024-09-22 06:06:28,042][06893] Saving new best policy, reward=28.448! |
|
[2024-09-22 06:06:28,227][06906] Updated weights for policy 0, policy_version 1290 (0.0016) |
|
[2024-09-22 06:06:31,196][06906] Updated weights for policy 0, policy_version 1300 (0.0014) |
|
[2024-09-22 06:06:33,031][04746] Fps is (10 sec: 13926.3, 60 sec: 14062.9, 300 sec: 13634.8). Total num frames: 5345280. Throughput: 0: 3520.6. Samples: 1337020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:06:33,033][04746] Avg episode reward: [(0, '29.035')] |
|
[2024-09-22 06:06:33,088][06893] Saving new best policy, reward=29.035! |
|
[2024-09-22 06:06:34,328][06906] Updated weights for policy 0, policy_version 1310 (0.0016) |
|
[2024-09-22 06:06:37,134][06906] Updated weights for policy 0, policy_version 1320 (0.0014) |
|
[2024-09-22 06:06:38,031][04746] Fps is (10 sec: 13926.4, 60 sec: 14131.2, 300 sec: 13662.6). Total num frames: 5419008. Throughput: 0: 3534.7. Samples: 1347514. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:06:38,033][04746] Avg episode reward: [(0, '25.313')] |
|
[2024-09-22 06:06:39,945][06906] Updated weights for policy 0, policy_version 1330 (0.0014) |
|
[2024-09-22 06:06:42,784][06906] Updated weights for policy 0, policy_version 1340 (0.0013) |
|
[2024-09-22 06:06:43,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14131.2, 300 sec: 13676.5). Total num frames: 5488640. Throughput: 0: 3534.7. Samples: 1369222. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:06:43,034][04746] Avg episode reward: [(0, '29.276')] |
|
[2024-09-22 06:06:43,069][06893] Saving new best policy, reward=29.276! |
|
[2024-09-22 06:06:45,725][06906] Updated weights for policy 0, policy_version 1350 (0.0014) |
|
[2024-09-22 06:06:48,031][04746] Fps is (10 sec: 13926.3, 60 sec: 14062.9, 300 sec: 13676.5). Total num frames: 5558272. Throughput: 0: 3523.2. Samples: 1389876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:06:48,035][04746] Avg episode reward: [(0, '28.923')] |
|
[2024-09-22 06:06:48,756][06906] Updated weights for policy 0, policy_version 1360 (0.0018) |
|
[2024-09-22 06:06:51,574][06906] Updated weights for policy 0, policy_version 1370 (0.0015) |
|
[2024-09-22 06:06:53,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14199.5, 300 sec: 13704.2). Total num frames: 5632000. Throughput: 0: 3545.9. Samples: 1400736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:06:53,034][04746] Avg episode reward: [(0, '28.842')] |
|
[2024-09-22 06:06:54,417][06906] Updated weights for policy 0, policy_version 1380 (0.0015) |
|
[2024-09-22 06:06:57,212][06906] Updated weights for policy 0, policy_version 1390 (0.0013) |
|
[2024-09-22 06:06:58,031][04746] Fps is (10 sec: 14745.6, 60 sec: 14199.5, 300 sec: 13732.0). Total num frames: 5705728. Throughput: 0: 3544.0. Samples: 1422546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:06:58,034][04746] Avg episode reward: [(0, '29.792')] |
|
[2024-09-22 06:06:58,044][06893] Saving new best policy, reward=29.792! |
|
[2024-09-22 06:07:00,170][06906] Updated weights for policy 0, policy_version 1400 (0.0016) |
|
[2024-09-22 06:07:03,031][04746] Fps is (10 sec: 13926.0, 60 sec: 14131.1, 300 sec: 13732.0). Total num frames: 5771264. Throughput: 0: 3535.4. Samples: 1442942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:07:03,034][04746] Avg episode reward: [(0, '26.261')] |
|
[2024-09-22 06:07:03,245][06906] Updated weights for policy 0, policy_version 1410 (0.0015) |
|
[2024-09-22 06:07:06,026][06906] Updated weights for policy 0, policy_version 1420 (0.0014) |
|
[2024-09-22 06:07:08,031][04746] Fps is (10 sec: 13926.4, 60 sec: 14199.5, 300 sec: 13759.8). Total num frames: 5844992. Throughput: 0: 3553.7. Samples: 1454014. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:07:08,033][04746] Avg episode reward: [(0, '23.809')] |
|
[2024-09-22 06:07:08,746][06906] Updated weights for policy 0, policy_version 1430 (0.0013) |
|
[2024-09-22 06:07:11,538][06906] Updated weights for policy 0, policy_version 1440 (0.0016) |
|
[2024-09-22 06:07:13,031][04746] Fps is (10 sec: 14746.0, 60 sec: 14199.5, 300 sec: 13787.6). Total num frames: 5918720. Throughput: 0: 3553.0. Samples: 1476198. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:07:13,034][04746] Avg episode reward: [(0, '24.877')] |
|
[2024-09-22 06:07:14,501][06906] Updated weights for policy 0, policy_version 1450 (0.0014) |
|
[2024-09-22 06:07:17,502][06906] Updated weights for policy 0, policy_version 1460 (0.0018) |
|
[2024-09-22 06:07:18,031][04746] Fps is (10 sec: 13926.4, 60 sec: 14199.5, 300 sec: 13773.7). Total num frames: 5984256. Throughput: 0: 3550.3. Samples: 1496784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:07:18,033][04746] Avg episode reward: [(0, '23.963')] |
|
[2024-09-22 06:07:20,270][06906] Updated weights for policy 0, policy_version 1470 (0.0014) |
|
[2024-09-22 06:07:23,031][04746] Fps is (10 sec: 13926.5, 60 sec: 14199.5, 300 sec: 13815.3). Total num frames: 6057984. Throughput: 0: 3567.2. Samples: 1508040. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2024-09-22 06:07:23,035][04746] Avg episode reward: [(0, '24.729')] |
|
[2024-09-22 06:07:23,054][06906] Updated weights for policy 0, policy_version 1480 (0.0016) |
|
[2024-09-22 06:07:25,760][06906] Updated weights for policy 0, policy_version 1490 (0.0014) |
|
[2024-09-22 06:07:28,031][04746] Fps is (10 sec: 14745.5, 60 sec: 14199.4, 300 sec: 13843.1). Total num frames: 6131712. Throughput: 0: 3574.1. Samples: 1530058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:07:28,033][04746] Avg episode reward: [(0, '27.785')] |
|
[2024-09-22 06:07:28,824][06906] Updated weights for policy 0, policy_version 1500 (0.0018) |
|
[2024-09-22 06:07:31,944][06906] Updated weights for policy 0, policy_version 1510 (0.0018) |
|
[2024-09-22 06:07:33,031][04746] Fps is (10 sec: 13926.3, 60 sec: 14199.5, 300 sec: 13829.2). Total num frames: 6197248. Throughput: 0: 3557.4. Samples: 1549960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:07:33,034][04746] Avg episode reward: [(0, '29.769')] |
|
[2024-09-22 06:07:34,749][06906] Updated weights for policy 0, policy_version 1520 (0.0016) |
|
[2024-09-22 06:07:37,616][06906] Updated weights for policy 0, policy_version 1530 (0.0014) |
|
[2024-09-22 06:07:38,031][04746] Fps is (10 sec: 13926.6, 60 sec: 14199.5, 300 sec: 13870.9). Total num frames: 6270976. Throughput: 0: 3557.6. Samples: 1560828. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-09-22 06:07:38,033][04746] Avg episode reward: [(0, '26.364')] |
|
[2024-09-22 06:07:38,042][06893] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001531_6270976.pth... |
|
[2024-09-22 06:07:38,123][06893] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000710_2908160.pth |
|
[2024-09-22 06:07:40,498][06906] Updated weights for policy 0, policy_version 1540 (0.0015) |
|
[2024-09-22 06:07:43,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14199.5, 300 sec: 13870.9). Total num frames: 6340608. Throughput: 0: 3542.5. Samples: 1581960. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:07:43,034][04746] Avg episode reward: [(0, '24.922')] |
|
[2024-09-22 06:07:43,629][06906] Updated weights for policy 0, policy_version 1550 (0.0015) |
|
[2024-09-22 06:07:46,667][06906] Updated weights for policy 0, policy_version 1560 (0.0015) |
|
[2024-09-22 06:07:48,031][04746] Fps is (10 sec: 13516.7, 60 sec: 14131.2, 300 sec: 13870.9). Total num frames: 6406144. Throughput: 0: 3544.8. Samples: 1602458. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:07:48,034][04746] Avg episode reward: [(0, '27.267')] |
|
[2024-09-22 06:07:49,407][06906] Updated weights for policy 0, policy_version 1570 (0.0014) |
|
[2024-09-22 06:07:52,150][06906] Updated weights for policy 0, policy_version 1580 (0.0016) |
|
[2024-09-22 06:07:53,031][04746] Fps is (10 sec: 14336.1, 60 sec: 14199.5, 300 sec: 13926.4). Total num frames: 6483968. Throughput: 0: 3545.2. Samples: 1613550. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:07:53,034][04746] Avg episode reward: [(0, '26.000')] |
|
[2024-09-22 06:07:54,976][06906] Updated weights for policy 0, policy_version 1590 (0.0015) |
|
[2024-09-22 06:07:57,903][06906] Updated weights for policy 0, policy_version 1600 (0.0019) |
|
[2024-09-22 06:07:58,031][04746] Fps is (10 sec: 14745.6, 60 sec: 14131.2, 300 sec: 13926.4). Total num frames: 6553600. Throughput: 0: 3533.6. Samples: 1635212. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:07:58,034][04746] Avg episode reward: [(0, '28.912')] |
|
[2024-09-22 06:08:00,923][06906] Updated weights for policy 0, policy_version 1610 (0.0015) |
|
[2024-09-22 06:08:03,031][04746] Fps is (10 sec: 13926.3, 60 sec: 14199.5, 300 sec: 13940.3). Total num frames: 6623232. Throughput: 0: 3544.4. Samples: 1656284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:08:03,033][04746] Avg episode reward: [(0, '31.208')] |
|
[2024-09-22 06:08:03,037][06893] Saving new best policy, reward=31.208! |
|
[2024-09-22 06:08:03,740][06906] Updated weights for policy 0, policy_version 1620 (0.0013) |
|
[2024-09-22 06:08:06,459][06906] Updated weights for policy 0, policy_version 1630 (0.0014) |
|
[2024-09-22 06:08:08,031][04746] Fps is (10 sec: 14335.9, 60 sec: 14199.5, 300 sec: 13981.9). Total num frames: 6696960. Throughput: 0: 3540.1. Samples: 1667344. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:08:08,034][04746] Avg episode reward: [(0, '29.081')] |
|
[2024-09-22 06:08:09,240][06906] Updated weights for policy 0, policy_version 1640 (0.0014) |
|
[2024-09-22 06:08:12,326][06906] Updated weights for policy 0, policy_version 1650 (0.0016) |
|
[2024-09-22 06:08:13,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14131.2, 300 sec: 13981.9). Total num frames: 6766592. Throughput: 0: 3523.4. Samples: 1688610. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) |
|
[2024-09-22 06:08:13,034][04746] Avg episode reward: [(0, '28.647')] |
|
[2024-09-22 06:08:15,293][06906] Updated weights for policy 0, policy_version 1660 (0.0015) |
|
[2024-09-22 06:08:18,031][04746] Fps is (10 sec: 13926.4, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 6836224. Throughput: 0: 3552.0. Samples: 1709802. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:08:18,034][04746] Avg episode reward: [(0, '27.663')] |
|
[2024-09-22 06:08:18,055][06906] Updated weights for policy 0, policy_version 1670 (0.0014) |
|
[2024-09-22 06:08:20,785][06906] Updated weights for policy 0, policy_version 1680 (0.0014) |
|
[2024-09-22 06:08:23,031][04746] Fps is (10 sec: 14745.7, 60 sec: 14267.7, 300 sec: 14037.5). Total num frames: 6914048. Throughput: 0: 3558.1. Samples: 1720942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:08:23,034][04746] Avg episode reward: [(0, '27.779')] |
|
[2024-09-22 06:08:23,555][06906] Updated weights for policy 0, policy_version 1690 (0.0014) |
|
[2024-09-22 06:08:26,565][06906] Updated weights for policy 0, policy_version 1700 (0.0014) |
|
[2024-09-22 06:08:28,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14131.2, 300 sec: 14023.6). Total num frames: 6979584. Throughput: 0: 3560.0. Samples: 1742160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:08:28,034][04746] Avg episode reward: [(0, '26.484')] |
|
[2024-09-22 06:08:29,575][06906] Updated weights for policy 0, policy_version 1710 (0.0014) |
|
[2024-09-22 06:08:32,341][06906] Updated weights for policy 0, policy_version 1720 (0.0014) |
|
[2024-09-22 06:08:33,031][04746] Fps is (10 sec: 13926.5, 60 sec: 14267.8, 300 sec: 14051.4). Total num frames: 7053312. Throughput: 0: 3584.4. Samples: 1763758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:08:33,034][04746] Avg episode reward: [(0, '28.002')] |
|
[2024-09-22 06:08:35,066][06906] Updated weights for policy 0, policy_version 1730 (0.0014) |
|
[2024-09-22 06:08:37,861][06906] Updated weights for policy 0, policy_version 1740 (0.0017) |
|
[2024-09-22 06:08:38,031][04746] Fps is (10 sec: 14745.6, 60 sec: 14267.7, 300 sec: 14079.1). Total num frames: 7127040. Throughput: 0: 3585.8. Samples: 1774912. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:08:38,035][04746] Avg episode reward: [(0, '25.099')] |
|
[2024-09-22 06:08:40,840][06906] Updated weights for policy 0, policy_version 1750 (0.0015) |
|
[2024-09-22 06:08:43,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14267.8, 300 sec: 14079.1). Total num frames: 7196672. Throughput: 0: 3568.1. Samples: 1795776. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:08:43,034][04746] Avg episode reward: [(0, '24.603')] |
|
[2024-09-22 06:08:43,786][06906] Updated weights for policy 0, policy_version 1760 (0.0015) |
|
[2024-09-22 06:08:46,470][06906] Updated weights for policy 0, policy_version 1770 (0.0014) |
|
[2024-09-22 06:08:48,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14404.3, 300 sec: 14093.0). Total num frames: 7270400. Throughput: 0: 3595.8. Samples: 1818094. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:08:48,033][04746] Avg episode reward: [(0, '28.280')] |
|
[2024-09-22 06:08:49,259][06906] Updated weights for policy 0, policy_version 1780 (0.0018) |
|
[2024-09-22 06:08:51,996][06906] Updated weights for policy 0, policy_version 1790 (0.0015) |
|
[2024-09-22 06:08:53,031][04746] Fps is (10 sec: 14745.4, 60 sec: 14336.0, 300 sec: 14120.8). Total num frames: 7344128. Throughput: 0: 3596.8. Samples: 1829200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:08:53,034][04746] Avg episode reward: [(0, '30.855')] |
|
[2024-09-22 06:08:54,933][06906] Updated weights for policy 0, policy_version 1800 (0.0017) |
|
[2024-09-22 06:08:57,835][06906] Updated weights for policy 0, policy_version 1810 (0.0015) |
|
[2024-09-22 06:08:58,031][04746] Fps is (10 sec: 14336.0, 60 sec: 14336.0, 300 sec: 14120.8). Total num frames: 7413760. Throughput: 0: 3586.4. Samples: 1849996. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:08:58,034][04746] Avg episode reward: [(0, '27.938')] |
|
[2024-09-22 06:09:00,571][06906] Updated weights for policy 0, policy_version 1820 (0.0014) |
|
[2024-09-22 06:09:03,031][04746] Fps is (10 sec: 14745.8, 60 sec: 14472.6, 300 sec: 14176.3). Total num frames: 7491584. Throughput: 0: 3621.7. Samples: 1872776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:09:03,033][04746] Avg episode reward: [(0, '27.574')] |
|
[2024-09-22 06:09:03,308][06906] Updated weights for policy 0, policy_version 1830 (0.0015) |
|
[2024-09-22 06:09:05,943][06906] Updated weights for policy 0, policy_version 1840 (0.0016) |
|
[2024-09-22 06:09:08,031][04746] Fps is (10 sec: 15155.2, 60 sec: 14472.5, 300 sec: 14190.2). Total num frames: 7565312. Throughput: 0: 3629.4. Samples: 1884264. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:09:08,034][04746] Avg episode reward: [(0, '30.259')] |
|
[2024-09-22 06:09:08,883][06906] Updated weights for policy 0, policy_version 1850 (0.0015) |
|
[2024-09-22 06:09:11,811][06906] Updated weights for policy 0, policy_version 1860 (0.0016) |
|
[2024-09-22 06:09:13,031][04746] Fps is (10 sec: 14335.9, 60 sec: 14472.6, 300 sec: 14190.2). Total num frames: 7634944. Throughput: 0: 3622.0. Samples: 1905148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:09:13,034][04746] Avg episode reward: [(0, '26.642')] |
|
[2024-09-22 06:09:14,570][06906] Updated weights for policy 0, policy_version 1870 (0.0015) |
|
[2024-09-22 06:09:17,253][06906] Updated weights for policy 0, policy_version 1880 (0.0014) |
|
[2024-09-22 06:09:18,031][04746] Fps is (10 sec: 14335.9, 60 sec: 14540.8, 300 sec: 14218.0). Total num frames: 7708672. Throughput: 0: 3644.3. Samples: 1927752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:09:18,033][04746] Avg episode reward: [(0, '25.555')] |
|
[2024-09-22 06:09:20,033][06906] Updated weights for policy 0, policy_version 1890 (0.0016) |
|
[2024-09-22 06:09:23,025][06906] Updated weights for policy 0, policy_version 1900 (0.0014) |
|
[2024-09-22 06:09:23,038][04746] Fps is (10 sec: 14734.9, 60 sec: 14470.8, 300 sec: 14231.5). Total num frames: 7782400. Throughput: 0: 3644.5. Samples: 1938942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:09:23,043][04746] Avg episode reward: [(0, '28.122')] |
|
[2024-09-22 06:09:26,050][06906] Updated weights for policy 0, policy_version 1910 (0.0014) |
|
[2024-09-22 06:09:28,031][04746] Fps is (10 sec: 14336.3, 60 sec: 14540.8, 300 sec: 14218.0). Total num frames: 7852032. Throughput: 0: 3632.6. Samples: 1959244. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:09:28,033][04746] Avg episode reward: [(0, '28.462')] |
|
[2024-09-22 06:09:28,842][06906] Updated weights for policy 0, policy_version 1920 (0.0013) |
|
[2024-09-22 06:09:31,567][06906] Updated weights for policy 0, policy_version 1930 (0.0017) |
|
[2024-09-22 06:09:33,031][04746] Fps is (10 sec: 14346.3, 60 sec: 14540.8, 300 sec: 14245.7). Total num frames: 7925760. Throughput: 0: 3632.2. Samples: 1981544. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:09:33,033][04746] Avg episode reward: [(0, '33.180')] |
|
[2024-09-22 06:09:33,035][06893] Saving new best policy, reward=33.180! |
|
[2024-09-22 06:09:34,368][06906] Updated weights for policy 0, policy_version 1940 (0.0014) |
|
[2024-09-22 06:09:37,321][06906] Updated weights for policy 0, policy_version 1950 (0.0014) |
|
[2024-09-22 06:09:38,031][04746] Fps is (10 sec: 14335.8, 60 sec: 14472.5, 300 sec: 14245.7). Total num frames: 7995392. Throughput: 0: 3627.2. Samples: 1992422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-09-22 06:09:38,033][04746] Avg episode reward: [(0, '31.870')] |
|
[2024-09-22 06:09:38,044][06893] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001952_7995392.pth... |
|
[2024-09-22 06:09:38,118][06893] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001116_4571136.pth |
|
[2024-09-22 06:09:40,305][06906] Updated weights for policy 0, policy_version 1960 (0.0015) |
|
[2024-09-22 06:09:43,031][04746] Fps is (10 sec: 13926.4, 60 sec: 14472.5, 300 sec: 14231.9). Total num frames: 8065024. Throughput: 0: 3618.8. Samples: 2012844. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:09:43,034][04746] Avg episode reward: [(0, '28.848')] |
|
[2024-09-22 06:09:43,249][06906] Updated weights for policy 0, policy_version 1970 (0.0015) |
|
[2024-09-22 06:09:46,191][06906] Updated weights for policy 0, policy_version 1980 (0.0014) |
|
[2024-09-22 06:09:48,031][04746] Fps is (10 sec: 13926.4, 60 sec: 14404.3, 300 sec: 14245.7). Total num frames: 8134656. Throughput: 0: 3579.4. Samples: 2033850. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:09:48,033][04746] Avg episode reward: [(0, '30.720')] |
|
[2024-09-22 06:09:49,107][06906] Updated weights for policy 0, policy_version 1990 (0.0016) |
|
[2024-09-22 06:09:52,217][06906] Updated weights for policy 0, policy_version 2000 (0.0017) |
|
[2024-09-22 06:09:53,031][04746] Fps is (10 sec: 13516.9, 60 sec: 14267.8, 300 sec: 14218.0). Total num frames: 8200192. Throughput: 0: 3547.8. Samples: 2043916. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-09-22 06:09:53,033][04746] Avg episode reward: [(0, '31.790')] |
|
[2024-09-22 06:09:55,358][06906] Updated weights for policy 0, policy_version 2010 (0.0014) |
|
[2024-09-22 06:09:58,031][04746] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14190.2). Total num frames: 8265728. Throughput: 0: 3515.7. Samples: 2063356. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-09-22 06:09:58,033][04746] Avg episode reward: [(0, '30.821')] |
|
[2024-09-22 06:09:58,682][06906] Updated weights for policy 0, policy_version 2020 (0.0017) |
|
[2024-09-22 06:10:02,065][06906] Updated weights for policy 0, policy_version 2030 (0.0017) |
|
[2024-09-22 06:10:03,031][04746] Fps is (10 sec: 12287.9, 60 sec: 13858.1, 300 sec: 14162.4). Total num frames: 8323072. Throughput: 0: 3419.3. Samples: 2081622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:10:03,034][04746] Avg episode reward: [(0, '27.214')] |
|
[2024-09-22 06:10:05,664][06906] Updated weights for policy 0, policy_version 2040 (0.0017) |
|
[2024-09-22 06:10:08,031][04746] Fps is (10 sec: 11468.7, 60 sec: 13585.1, 300 sec: 14106.9). Total num frames: 8380416. Throughput: 0: 3358.8. Samples: 2090066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:10:08,033][04746] Avg episode reward: [(0, '28.579')] |
|
[2024-09-22 06:10:09,296][06906] Updated weights for policy 0, policy_version 2050 (0.0017) |
|
[2024-09-22 06:10:12,735][06906] Updated weights for policy 0, policy_version 2060 (0.0017) |
|
[2024-09-22 06:10:13,031][04746] Fps is (10 sec: 11468.7, 60 sec: 13380.2, 300 sec: 14051.4). Total num frames: 8437760. Throughput: 0: 3288.1. Samples: 2107210. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:10:13,036][04746] Avg episode reward: [(0, '30.279')] |
|
[2024-09-22 06:10:16,120][06906] Updated weights for policy 0, policy_version 2070 (0.0017) |
|
[2024-09-22 06:10:18,031][04746] Fps is (10 sec: 11878.6, 60 sec: 13175.5, 300 sec: 14037.5). Total num frames: 8499200. Throughput: 0: 3198.1. Samples: 2125458. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:10:18,034][04746] Avg episode reward: [(0, '29.975')] |
|
[2024-09-22 06:10:19,656][06906] Updated weights for policy 0, policy_version 2080 (0.0018) |
|
[2024-09-22 06:10:23,031][04746] Fps is (10 sec: 11878.4, 60 sec: 12903.9, 300 sec: 13981.9). Total num frames: 8556544. Throughput: 0: 3141.6. Samples: 2133794. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:10:23,033][04746] Avg episode reward: [(0, '27.525')] |
|
[2024-09-22 06:10:23,348][06906] Updated weights for policy 0, policy_version 2090 (0.0018) |
|
[2024-09-22 06:10:26,726][06906] Updated weights for policy 0, policy_version 2100 (0.0016) |
|
[2024-09-22 06:10:28,031][04746] Fps is (10 sec: 11468.8, 60 sec: 12697.6, 300 sec: 13940.3). Total num frames: 8613888. Throughput: 0: 3077.3. Samples: 2151320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:10:28,033][04746] Avg episode reward: [(0, '24.633')] |
|
[2024-09-22 06:10:30,089][06906] Updated weights for policy 0, policy_version 2110 (0.0015) |
|
[2024-09-22 06:10:33,031][04746] Fps is (10 sec: 11878.4, 60 sec: 12492.8, 300 sec: 13912.5). Total num frames: 8675328. Throughput: 0: 3015.4. Samples: 2169544. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-09-22 06:10:33,034][04746] Avg episode reward: [(0, '27.902')] |
|
[2024-09-22 06:10:33,512][06906] Updated weights for policy 0, policy_version 2120 (0.0014) |
|
[2024-09-22 06:10:37,238][06906] Updated weights for policy 0, policy_version 2130 (0.0017) |
|
[2024-09-22 06:10:38,031][04746] Fps is (10 sec: 11878.2, 60 sec: 12288.0, 300 sec: 13870.9). Total num frames: 8732672. Throughput: 0: 2976.3. Samples: 2177850. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:10:38,034][04746] Avg episode reward: [(0, '28.677')] |
|
[2024-09-22 06:10:40,661][06906] Updated weights for policy 0, policy_version 2140 (0.0015) |
|
[2024-09-22 06:10:43,031][04746] Fps is (10 sec: 11468.8, 60 sec: 12083.2, 300 sec: 13815.3). Total num frames: 8790016. Throughput: 0: 2932.5. Samples: 2195320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:10:43,034][04746] Avg episode reward: [(0, '27.919')] |
|
[2024-09-22 06:10:44,061][06906] Updated weights for policy 0, policy_version 2150 (0.0016) |
|
[2024-09-22 06:10:47,398][06906] Updated weights for policy 0, policy_version 2160 (0.0020) |
|
[2024-09-22 06:10:48,031][04746] Fps is (10 sec: 11878.5, 60 sec: 11946.7, 300 sec: 13801.4). Total num frames: 8851456. Throughput: 0: 2936.0. Samples: 2213744. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:10:48,033][04746] Avg episode reward: [(0, '29.776')] |
|
[2024-09-22 06:10:50,642][06906] Updated weights for policy 0, policy_version 2170 (0.0017) |
|
[2024-09-22 06:10:53,031][04746] Fps is (10 sec: 12697.3, 60 sec: 11946.6, 300 sec: 13773.7). Total num frames: 8916992. Throughput: 0: 2957.7. Samples: 2223164. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:10:53,035][04746] Avg episode reward: [(0, '30.713')] |
|
[2024-09-22 06:10:53,743][06906] Updated weights for policy 0, policy_version 2180 (0.0017) |
|
[2024-09-22 06:10:56,608][06906] Updated weights for policy 0, policy_version 2190 (0.0014) |
|
[2024-09-22 06:10:58,031][04746] Fps is (10 sec: 13516.7, 60 sec: 12014.9, 300 sec: 13773.7). Total num frames: 8986624. Throughput: 0: 3037.3. Samples: 2243888. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:10:58,034][04746] Avg episode reward: [(0, '28.134')] |
|
[2024-09-22 06:10:59,455][06906] Updated weights for policy 0, policy_version 2200 (0.0015) |
|
[2024-09-22 06:11:02,390][06906] Updated weights for policy 0, policy_version 2210 (0.0019) |
|
[2024-09-22 06:11:03,033][04746] Fps is (10 sec: 13923.8, 60 sec: 12219.3, 300 sec: 13773.6). Total num frames: 9056256. Throughput: 0: 3093.6. Samples: 2264678. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:11:03,035][04746] Avg episode reward: [(0, '27.638')] |
|
[2024-09-22 06:11:06,169][06906] Updated weights for policy 0, policy_version 2220 (0.0016) |
|
[2024-09-22 06:11:08,031][04746] Fps is (10 sec: 12697.7, 60 sec: 12219.8, 300 sec: 13718.1). Total num frames: 9113600. Throughput: 0: 3091.7. Samples: 2272920. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:11:08,033][04746] Avg episode reward: [(0, '28.320')] |
|
[2024-09-22 06:11:09,625][06906] Updated weights for policy 0, policy_version 2230 (0.0016) |
|
[2024-09-22 06:11:13,014][06906] Updated weights for policy 0, policy_version 2240 (0.0016) |
|
[2024-09-22 06:11:13,031][04746] Fps is (10 sec: 11880.9, 60 sec: 12288.0, 300 sec: 13704.2). Total num frames: 9175040. Throughput: 0: 3097.7. Samples: 2290716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:11:13,034][04746] Avg episode reward: [(0, '29.334')] |
|
[2024-09-22 06:11:16,343][06906] Updated weights for policy 0, policy_version 2250 (0.0016) |
|
[2024-09-22 06:11:18,031][04746] Fps is (10 sec: 11878.4, 60 sec: 12219.7, 300 sec: 13648.7). Total num frames: 9232384. Throughput: 0: 3094.1. Samples: 2308778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:11:18,033][04746] Avg episode reward: [(0, '31.025')] |
|
[2024-09-22 06:11:20,034][06906] Updated weights for policy 0, policy_version 2260 (0.0018) |
|
[2024-09-22 06:11:23,031][04746] Fps is (10 sec: 11468.9, 60 sec: 12219.7, 300 sec: 13593.2). Total num frames: 9289728. Throughput: 0: 3092.1. Samples: 2316994. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:11:23,035][04746] Avg episode reward: [(0, '29.830')] |
|
[2024-09-22 06:11:23,554][06906] Updated weights for policy 0, policy_version 2270 (0.0015) |
|
[2024-09-22 06:11:26,999][06906] Updated weights for policy 0, policy_version 2280 (0.0016) |
|
[2024-09-22 06:11:28,031][04746] Fps is (10 sec: 11878.4, 60 sec: 12288.0, 300 sec: 13579.3). Total num frames: 9351168. Throughput: 0: 3100.1. Samples: 2334826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:11:28,034][04746] Avg episode reward: [(0, '28.612')] |
|
[2024-09-22 06:11:30,359][06906] Updated weights for policy 0, policy_version 2290 (0.0014) |
|
[2024-09-22 06:11:33,031][04746] Fps is (10 sec: 11878.3, 60 sec: 12219.7, 300 sec: 13523.7). Total num frames: 9408512. Throughput: 0: 3083.4. Samples: 2352498. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-09-22 06:11:33,034][04746] Avg episode reward: [(0, '30.104')] |
|
[2024-09-22 06:11:33,991][06906] Updated weights for policy 0, policy_version 2300 (0.0015) |
|
[2024-09-22 06:11:37,626][06906] Updated weights for policy 0, policy_version 2310 (0.0017) |
|
[2024-09-22 06:11:38,031][04746] Fps is (10 sec: 11468.6, 60 sec: 12219.7, 300 sec: 13482.1). Total num frames: 9465856. Throughput: 0: 3056.4. Samples: 2360700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-09-22 06:11:38,034][04746] Avg episode reward: [(0, '30.210')] |
|
[2024-09-22 06:11:38,043][06893] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002311_9465856.pth... |
|
[2024-09-22 06:11:38,126][06893] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001531_6270976.pth |
|
[2024-09-22 06:11:40,997][06906] Updated weights for policy 0, policy_version 2320 (0.0017) |
|
[2024-09-22 06:11:43,031][04746] Fps is (10 sec: 11468.9, 60 sec: 12219.7, 300 sec: 13440.4). Total num frames: 9523200. Throughput: 0: 2996.4. Samples: 2378726. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:11:43,034][04746] Avg episode reward: [(0, '28.804')] |
|
[2024-09-22 06:11:44,444][06906] Updated weights for policy 0, policy_version 2330 (0.0015) |
|
[2024-09-22 06:11:48,018][06906] Updated weights for policy 0, policy_version 2340 (0.0017) |
|
[2024-09-22 06:11:48,031][04746] Fps is (10 sec: 11878.3, 60 sec: 12219.7, 300 sec: 13398.8). Total num frames: 9584640. Throughput: 0: 2922.2. Samples: 2396170. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2024-09-22 06:11:48,034][04746] Avg episode reward: [(0, '31.504')] |
|
[2024-09-22 06:11:51,642][06906] Updated weights for policy 0, policy_version 2350 (0.0017) |
|
[2024-09-22 06:11:53,031][04746] Fps is (10 sec: 11878.1, 60 sec: 12083.2, 300 sec: 13343.2). Total num frames: 9641984. Throughput: 0: 2923.3. Samples: 2404470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:11:53,034][04746] Avg episode reward: [(0, '31.838')] |
|
[2024-09-22 06:11:55,083][06906] Updated weights for policy 0, policy_version 2360 (0.0018) |
|
[2024-09-22 06:11:58,031][04746] Fps is (10 sec: 11469.1, 60 sec: 11878.4, 300 sec: 13315.5). Total num frames: 9699328. Throughput: 0: 2926.8. Samples: 2422422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:11:58,033][04746] Avg episode reward: [(0, '30.794')] |
|
[2024-09-22 06:11:58,469][06906] Updated weights for policy 0, policy_version 2370 (0.0015) |
|
[2024-09-22 06:12:01,564][06906] Updated weights for policy 0, policy_version 2380 (0.0018) |
|
[2024-09-22 06:12:03,031][04746] Fps is (10 sec: 12288.3, 60 sec: 11810.6, 300 sec: 13287.7). Total num frames: 9764864. Throughput: 0: 2956.6. Samples: 2441824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-09-22 06:12:03,034][04746] Avg episode reward: [(0, '28.643')] |
|
[2024-09-22 06:12:04,847][06906] Updated weights for policy 0, policy_version 2390 (0.0018) |
|
[2024-09-22 06:12:07,678][06906] Updated weights for policy 0, policy_version 2400 (0.0017) |
|
[2024-09-22 06:12:08,032][04746] Fps is (10 sec: 13516.3, 60 sec: 12014.8, 300 sec: 13273.8). Total num frames: 9834496. Throughput: 0: 2990.0. Samples: 2451544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-09-22 06:12:08,033][04746] Avg episode reward: [(0, '27.010')] |
|
[2024-09-22 06:12:10,518][06906] Updated weights for policy 0, policy_version 2410 (0.0015) |
|
[2024-09-22 06:12:13,031][04746] Fps is (10 sec: 13926.3, 60 sec: 12151.5, 300 sec: 13287.7). Total num frames: 9904128. Throughput: 0: 3068.1. Samples: 2472890. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-09-22 06:12:13,033][04746] Avg episode reward: [(0, '29.198')] |
|
[2024-09-22 06:12:13,580][06906] Updated weights for policy 0, policy_version 2420 (0.0014) |
|
[2024-09-22 06:12:17,131][06906] Updated weights for policy 0, policy_version 2430 (0.0017) |
|
[2024-09-22 06:12:18,031][04746] Fps is (10 sec: 12698.1, 60 sec: 12151.5, 300 sec: 13232.2). Total num frames: 9961472. Throughput: 0: 3073.9. Samples: 2490824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-09-22 06:12:18,034][04746] Avg episode reward: [(0, '29.191')] |
|
[2024-09-22 06:12:20,788][06906] Updated weights for policy 0, policy_version 2440 (0.0018) |
|
[2024-09-22 06:12:21,773][06893] Stopping Batcher_0... |
|
[2024-09-22 06:12:21,774][06893] Loop batcher_evt_loop terminating... |
|
[2024-09-22 06:12:21,777][06893] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth... |
|
[2024-09-22 06:12:21,774][04746] Component Batcher_0 stopped! |
|
[2024-09-22 06:12:21,808][06906] Weights refcount: 2 0 |
|
[2024-09-22 06:12:21,810][06906] Stopping InferenceWorker_p0-w0... |
|
[2024-09-22 06:12:21,810][06906] Loop inference_proc0-0_evt_loop terminating... |
|
[2024-09-22 06:12:21,811][04746] Component InferenceWorker_p0-w0 stopped! |
|
[2024-09-22 06:12:21,838][06913] Stopping RolloutWorker_w4... |
|
[2024-09-22 06:12:21,838][06913] Loop rollout_proc4_evt_loop terminating... |
|
[2024-09-22 06:12:21,838][04746] Component RolloutWorker_w4 stopped! |
|
[2024-09-22 06:12:21,851][06911] Stopping RolloutWorker_w5... |
|
[2024-09-22 06:12:21,852][06911] Loop rollout_proc5_evt_loop terminating... |
|
[2024-09-22 06:12:21,851][04746] Component RolloutWorker_w5 stopped! |
|
[2024-09-22 06:12:21,864][06910] Stopping RolloutWorker_w3... |
|
[2024-09-22 06:12:21,865][06910] Loop rollout_proc3_evt_loop terminating... |
|
[2024-09-22 06:12:21,867][06918] Stopping RolloutWorker_w7... |
|
[2024-09-22 06:12:21,864][04746] Component RolloutWorker_w3 stopped! |
|
[2024-09-22 06:12:21,868][06918] Loop rollout_proc7_evt_loop terminating... |
|
[2024-09-22 06:12:21,868][04746] Component RolloutWorker_w7 stopped! |
|
[2024-09-22 06:12:21,872][06893] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001952_7995392.pth |
|
[2024-09-22 06:12:21,878][06912] Stopping RolloutWorker_w6... |
|
[2024-09-22 06:12:21,879][04746] Component RolloutWorker_w6 stopped! |
|
[2024-09-22 06:12:21,882][06912] Loop rollout_proc6_evt_loop terminating... |
|
[2024-09-22 06:12:21,885][06893] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth... |
|
[2024-09-22 06:12:21,898][06908] Stopping RolloutWorker_w1... |
|
[2024-09-22 06:12:21,899][06908] Loop rollout_proc1_evt_loop terminating... |
|
[2024-09-22 06:12:21,898][04746] Component RolloutWorker_w1 stopped! |
|
[2024-09-22 06:12:21,921][06909] Stopping RolloutWorker_w2... |
|
[2024-09-22 06:12:21,922][06909] Loop rollout_proc2_evt_loop terminating... |
|
[2024-09-22 06:12:21,924][04746] Component RolloutWorker_w2 stopped! |
|
[2024-09-22 06:12:21,966][06907] Stopping RolloutWorker_w0... |
|
[2024-09-22 06:12:21,967][06907] Loop rollout_proc0_evt_loop terminating... |
|
[2024-09-22 06:12:21,967][04746] Component RolloutWorker_w0 stopped! |
|
[2024-09-22 06:12:22,064][06893] Stopping LearnerWorker_p0... |
|
[2024-09-22 06:12:22,064][06893] Loop learner_proc0_evt_loop terminating... |
|
[2024-09-22 06:12:22,063][04746] Component LearnerWorker_p0 stopped! |
|
[2024-09-22 06:12:22,068][04746] Waiting for process learner_proc0 to stop... |
|
[2024-09-22 06:12:23,148][04746] Waiting for process inference_proc0-0 to join... |
|
[2024-09-22 06:12:23,151][04746] Waiting for process rollout_proc0 to join... |
|
[2024-09-22 06:12:23,154][04746] Waiting for process rollout_proc1 to join... |
|
[2024-09-22 06:12:23,156][04746] Waiting for process rollout_proc2 to join... |
|
[2024-09-22 06:12:23,158][04746] Waiting for process rollout_proc3 to join... |
|
[2024-09-22 06:12:23,160][04746] Waiting for process rollout_proc4 to join... |
|
[2024-09-22 06:12:23,162][04746] Waiting for process rollout_proc5 to join... |
|
[2024-09-22 06:12:23,165][04746] Waiting for process rollout_proc6 to join... |
|
[2024-09-22 06:12:23,168][04746] Waiting for process rollout_proc7 to join... |
|
[2024-09-22 06:12:23,170][04746] Batcher 0 profile tree view: |
|
batching: 41.2143, releasing_batches: 0.0861 |
|
[2024-09-22 06:12:23,171][04746] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0001 |
|
wait_policy_total: 12.9840 |
|
update_model: 12.5052 |
|
weight_update: 0.0017 |
|
one_step: 0.0084 |
|
handle_policy_step: 680.9309 |
|
deserialize: 27.9295, stack: 4.8215, obs_to_device_normalize: 159.6845, forward: 336.5609, send_messages: 40.8327 |
|
prepare_outputs: 78.3299 |
|
to_cpu: 49.7910 |
|
[2024-09-22 06:12:23,173][04746] Learner 0 profile tree view: |
|
misc: 0.0139, prepare_batch: 26.0761 |
|
train: 112.9910 |
|
epoch_init: 0.0147, minibatch_init: 0.0160, losses_postprocess: 0.8946, kl_divergence: 0.9136, after_optimizer: 52.6494 |
|
calculate_losses: 38.7085 |
|
losses_init: 0.0096, forward_head: 1.8256, bptt_initial: 27.6678, tail: 1.5628, advantages_returns: 0.4044, losses: 3.8285 |
|
bptt: 2.9188 |
|
bptt_forward_core: 2.7751 |
|
update: 18.7262 |
|
clip: 1.8584 |
|
[2024-09-22 06:12:23,176][04746] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.4337, enqueue_policy_requests: 22.8105, env_step: 304.0444, overhead: 17.7295, complete_rollouts: 1.0132 |
|
save_policy_outputs: 26.1419 |
|
split_output_tensors: 10.4431 |
|
[2024-09-22 06:12:23,178][04746] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.4597, enqueue_policy_requests: 23.9968, env_step: 320.4870, overhead: 18.6543, complete_rollouts: 1.3253 |
|
save_policy_outputs: 27.0024 |
|
split_output_tensors: 10.7421 |
|
[2024-09-22 06:12:23,180][04746] Loop Runner_EvtLoop terminating... |
|
[2024-09-22 06:12:23,181][04746] Runner profile tree view: |
|
main_loop: 759.4308 |
|
[2024-09-22 06:12:23,182][04746] Collected {0: 10006528}, FPS: 13176.4 |
|
[2024-09-22 06:12:23,547][04746] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2024-09-22 06:12:23,550][04746] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2024-09-22 06:12:23,552][04746] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2024-09-22 06:12:23,553][04746] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2024-09-22 06:12:23,554][04746] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2024-09-22 06:12:23,556][04746] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2024-09-22 06:12:23,559][04746] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2024-09-22 06:12:23,561][04746] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2024-09-22 06:12:23,563][04746] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2024-09-22 06:12:23,564][04746] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2024-09-22 06:12:23,565][04746] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2024-09-22 06:12:23,566][04746] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2024-09-22 06:12:23,568][04746] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2024-09-22 06:12:23,571][04746] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2024-09-22 06:12:23,572][04746] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2024-09-22 06:12:23,605][04746] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:12:23,609][04746] RunningMeanStd input shape: (3, 72, 128) |
|
[2024-09-22 06:12:23,612][04746] RunningMeanStd input shape: (1,) |
|
[2024-09-22 06:12:23,630][04746] ConvEncoder: input_channels=3 |
|
[2024-09-22 06:12:23,759][04746] Conv encoder output size: 512 |
|
[2024-09-22 06:12:23,761][04746] Policy head output size: 512 |
|
[2024-09-22 06:12:24,038][04746] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth... |
|
[2024-09-22 06:12:24,942][04746] Num frames 100... |
|
[2024-09-22 06:12:25,115][04746] Num frames 200... |
|
[2024-09-22 06:12:25,269][04746] Num frames 300... |
|
[2024-09-22 06:12:25,426][04746] Num frames 400... |
|
[2024-09-22 06:12:25,577][04746] Num frames 500... |
|
[2024-09-22 06:12:25,746][04746] Num frames 600... |
|
[2024-09-22 06:12:25,902][04746] Num frames 700... |
|
[2024-09-22 06:12:26,054][04746] Num frames 800... |
|
[2024-09-22 06:12:26,213][04746] Num frames 900... |
|
[2024-09-22 06:12:26,381][04746] Num frames 1000... |
|
[2024-09-22 06:12:26,539][04746] Num frames 1100... |
|
[2024-09-22 06:12:26,687][04746] Num frames 1200... |
|
[2024-09-22 06:12:26,864][04746] Num frames 1300... |
|
[2024-09-22 06:12:27,021][04746] Num frames 1400... |
|
[2024-09-22 06:12:27,174][04746] Num frames 1500... |
|
[2024-09-22 06:12:27,329][04746] Num frames 1600... |
|
[2024-09-22 06:12:27,547][04746] Avg episode rewards: #0: 39.959, true rewards: #0: 16.960 |
|
[2024-09-22 06:12:27,549][04746] Avg episode reward: 39.959, avg true_objective: 16.960 |
|
[2024-09-22 06:12:27,558][04746] Num frames 1700... |
|
[2024-09-22 06:12:27,714][04746] Num frames 1800... |
|
[2024-09-22 06:12:27,857][04746] Num frames 1900... |
|
[2024-09-22 06:12:28,003][04746] Avg episode rewards: #0: 21.260, true rewards: #0: 9.760 |
|
[2024-09-22 06:12:28,005][04746] Avg episode reward: 21.260, avg true_objective: 9.760 |
|
[2024-09-22 06:12:28,080][04746] Num frames 2000... |
|
[2024-09-22 06:12:28,220][04746] Num frames 2100... |
|
[2024-09-22 06:12:28,369][04746] Num frames 2200... |
|
[2024-09-22 06:12:28,537][04746] Num frames 2300... |
|
[2024-09-22 06:12:28,700][04746] Num frames 2400... |
|
[2024-09-22 06:12:28,753][04746] Avg episode rewards: #0: 17.333, true rewards: #0: 8.000 |
|
[2024-09-22 06:12:28,755][04746] Avg episode reward: 17.333, avg true_objective: 8.000 |
|
[2024-09-22 06:12:28,896][04746] Num frames 2500... |
|
[2024-09-22 06:12:29,065][04746] Num frames 2600... |
|
[2024-09-22 06:12:29,220][04746] Num frames 2700... |
|
[2024-09-22 06:12:29,368][04746] Num frames 2800... |
|
[2024-09-22 06:12:29,544][04746] Num frames 2900... |
|
[2024-09-22 06:12:29,704][04746] Num frames 3000... |
|
[2024-09-22 06:12:29,869][04746] Num frames 3100... |
|
[2024-09-22 06:12:29,931][04746] Avg episode rewards: #0: 17.010, true rewards: #0: 7.760 |
|
[2024-09-22 06:12:29,933][04746] Avg episode reward: 17.010, avg true_objective: 7.760 |
|
[2024-09-22 06:12:30,089][04746] Num frames 3200... |
|
[2024-09-22 06:12:30,263][04746] Num frames 3300... |
|
[2024-09-22 06:12:30,419][04746] Num frames 3400... |
|
[2024-09-22 06:12:30,579][04746] Num frames 3500... |
|
[2024-09-22 06:12:30,757][04746] Num frames 3600... |
|
[2024-09-22 06:12:30,909][04746] Num frames 3700... |
|
[2024-09-22 06:12:31,071][04746] Num frames 3800... |
|
[2024-09-22 06:12:31,236][04746] Num frames 3900... |
|
[2024-09-22 06:12:31,394][04746] Num frames 4000... |
|
[2024-09-22 06:12:31,552][04746] Num frames 4100... |
|
[2024-09-22 06:12:31,697][04746] Num frames 4200... |
|
[2024-09-22 06:12:31,863][04746] Num frames 4300... |
|
[2024-09-22 06:12:32,023][04746] Num frames 4400... |
|
[2024-09-22 06:12:32,187][04746] Num frames 4500... |
|
[2024-09-22 06:12:32,342][04746] Num frames 4600... |
|
[2024-09-22 06:12:32,504][04746] Num frames 4700... |
|
[2024-09-22 06:12:32,651][04746] Num frames 4800... |
|
[2024-09-22 06:12:32,856][04746] Avg episode rewards: #0: 23.392, true rewards: #0: 9.792 |
|
[2024-09-22 06:12:32,857][04746] Avg episode reward: 23.392, avg true_objective: 9.792 |
|
[2024-09-22 06:12:32,867][04746] Num frames 4900... |
|
[2024-09-22 06:12:33,037][04746] Num frames 5000... |
|
[2024-09-22 06:12:33,185][04746] Num frames 5100... |
|
[2024-09-22 06:12:33,346][04746] Num frames 5200... |
|
[2024-09-22 06:12:33,503][04746] Num frames 5300... |
|
[2024-09-22 06:12:33,661][04746] Num frames 5400... |
|
[2024-09-22 06:12:33,817][04746] Num frames 5500... |
|
[2024-09-22 06:12:33,968][04746] Num frames 5600... |
|
[2024-09-22 06:12:34,125][04746] Num frames 5700... |
|
[2024-09-22 06:12:34,214][04746] Avg episode rewards: #0: 22.366, true rewards: #0: 9.533 |
|
[2024-09-22 06:12:34,215][04746] Avg episode reward: 22.366, avg true_objective: 9.533 |
|
[2024-09-22 06:12:34,350][04746] Num frames 5800... |
|
[2024-09-22 06:12:34,489][04746] Num frames 5900... |
|
[2024-09-22 06:12:34,652][04746] Num frames 6000... |
|
[2024-09-22 06:12:34,807][04746] Num frames 6100... |
|
[2024-09-22 06:12:34,979][04746] Num frames 6200... |
|
[2024-09-22 06:12:35,123][04746] Num frames 6300... |
|
[2024-09-22 06:12:35,277][04746] Num frames 6400... |
|
[2024-09-22 06:12:35,448][04746] Num frames 6500... |
|
[2024-09-22 06:12:35,606][04746] Num frames 6600... |
|
[2024-09-22 06:12:35,753][04746] Num frames 6700... |
|
[2024-09-22 06:12:35,913][04746] Num frames 6800... |
|
[2024-09-22 06:12:36,072][04746] Num frames 6900... |
|
[2024-09-22 06:12:36,217][04746] Num frames 7000... |
|
[2024-09-22 06:12:36,375][04746] Num frames 7100... |
|
[2024-09-22 06:12:36,543][04746] Num frames 7200... |
|
[2024-09-22 06:12:36,687][04746] Num frames 7300... |
|
[2024-09-22 06:12:36,843][04746] Num frames 7400... |
|
[2024-09-22 06:12:37,003][04746] Num frames 7500... |
|
[2024-09-22 06:12:37,157][04746] Num frames 7600... |
|
[2024-09-22 06:12:37,313][04746] Num frames 7700... |
|
[2024-09-22 06:12:37,479][04746] Num frames 7800... |
|
[2024-09-22 06:12:37,573][04746] Avg episode rewards: #0: 28.028, true rewards: #0: 11.171 |
|
[2024-09-22 06:12:37,574][04746] Avg episode reward: 28.028, avg true_objective: 11.171 |
|
[2024-09-22 06:12:37,706][04746] Num frames 7900... |
|
[2024-09-22 06:12:37,854][04746] Num frames 8000... |
|
[2024-09-22 06:12:38,021][04746] Num frames 8100... |
|
[2024-09-22 06:12:38,197][04746] Num frames 8200... |
|
[2024-09-22 06:12:38,358][04746] Num frames 8300... |
|
[2024-09-22 06:12:38,510][04746] Num frames 8400... |
|
[2024-09-22 06:12:38,687][04746] Num frames 8500... |
|
[2024-09-22 06:12:38,841][04746] Num frames 8600... |
|
[2024-09-22 06:12:38,995][04746] Num frames 8700... |
|
[2024-09-22 06:12:39,166][04746] Num frames 8800...
[2024-09-22 06:12:39,329][04746] Num frames 8900...
[2024-09-22 06:12:39,484][04746] Num frames 9000...
[2024-09-22 06:12:39,649][04746] Num frames 9100...
[2024-09-22 06:12:39,835][04746] Num frames 9200...
[2024-09-22 06:12:40,038][04746] Avg episode rewards: #0: 29.365, true rewards: #0: 11.615
[2024-09-22 06:12:40,039][04746] Avg episode reward: 29.365, avg true_objective: 11.615
[2024-09-22 06:12:40,056][04746] Num frames 9300...
[2024-09-22 06:12:40,215][04746] Num frames 9400...
[2024-09-22 06:12:40,393][04746] Num frames 9500...
[2024-09-22 06:12:40,553][04746] Num frames 9600...
[2024-09-22 06:12:40,718][04746] Num frames 9700...
[2024-09-22 06:12:40,899][04746] Num frames 9800...
[2024-09-22 06:12:41,058][04746] Num frames 9900...
[2024-09-22 06:12:41,221][04746] Num frames 10000...
[2024-09-22 06:12:41,410][04746] Num frames 10100...
[2024-09-22 06:12:41,585][04746] Num frames 10200...
[2024-09-22 06:12:41,750][04746] Num frames 10300...
[2024-09-22 06:12:41,924][04746] Num frames 10400...
[2024-09-22 06:12:42,091][04746] Num frames 10500...
[2024-09-22 06:12:42,271][04746] Num frames 10600...
[2024-09-22 06:12:42,437][04746] Num frames 10700...
[2024-09-22 06:12:42,602][04746] Num frames 10800...
[2024-09-22 06:12:42,789][04746] Num frames 10900...
[2024-09-22 06:12:42,954][04746] Num frames 11000...
[2024-09-22 06:12:43,120][04746] Num frames 11100...
[2024-09-22 06:12:43,279][04746] Num frames 11200...
[2024-09-22 06:12:43,456][04746] Num frames 11300...
[2024-09-22 06:12:43,668][04746] Avg episode rewards: #0: 32.546, true rewards: #0: 12.658
[2024-09-22 06:12:43,670][04746] Avg episode reward: 32.546, avg true_objective: 12.658
[2024-09-22 06:12:43,687][04746] Num frames 11400...
[2024-09-22 06:12:43,854][04746] Num frames 11500...
[2024-09-22 06:12:44,023][04746] Num frames 11600...
[2024-09-22 06:12:44,199][04746] Num frames 11700...
[2024-09-22 06:12:44,351][04746] Num frames 11800...
[2024-09-22 06:12:44,507][04746] Num frames 11900...
[2024-09-22 06:12:44,682][04746] Num frames 12000...
[2024-09-22 06:12:44,829][04746] Num frames 12100...
[2024-09-22 06:12:44,985][04746] Num frames 12200...
[2024-09-22 06:12:45,160][04746] Num frames 12300...
[2024-09-22 06:12:45,324][04746] Num frames 12400...
[2024-09-22 06:12:45,483][04746] Num frames 12500...
[2024-09-22 06:12:45,642][04746] Num frames 12600...
[2024-09-22 06:12:45,814][04746] Num frames 12700...
[2024-09-22 06:12:45,972][04746] Num frames 12800...
[2024-09-22 06:12:46,141][04746] Num frames 12900...
[2024-09-22 06:12:46,299][04746] Num frames 13000...
[2024-09-22 06:12:46,460][04746] Avg episode rewards: #0: 33.657, true rewards: #0: 13.057
[2024-09-22 06:12:46,462][04746] Avg episode reward: 33.657, avg true_objective: 13.057
[2024-09-22 06:13:23,381][04746] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2024-09-22 06:14:25,599][04746] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-09-22 06:14:25,601][04746] Overriding arg 'num_workers' with value 1 passed from command line
[2024-09-22 06:14:25,603][04746] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-09-22 06:14:25,604][04746] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-09-22 06:14:25,607][04746] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-09-22 06:14:25,608][04746] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-09-22 06:14:25,609][04746] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2024-09-22 06:14:25,611][04746] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-09-22 06:14:25,612][04746] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2024-09-22 06:14:25,613][04746] Adding new argument 'hf_repository'='ThomasSimonini/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2024-09-22 06:14:25,614][04746] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-09-22 06:14:25,616][04746] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-09-22 06:14:25,617][04746] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-09-22 06:14:25,618][04746] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-09-22 06:14:25,623][04746] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-09-22 06:14:25,650][04746] RunningMeanStd input shape: (3, 72, 128)
[2024-09-22 06:14:25,653][04746] RunningMeanStd input shape: (1,)
[2024-09-22 06:14:25,667][04746] ConvEncoder: input_channels=3
[2024-09-22 06:14:25,717][04746] Conv encoder output size: 512
[2024-09-22 06:14:25,719][04746] Policy head output size: 512
[2024-09-22 06:14:25,742][04746] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2024-09-22 06:14:26,239][04746] Num frames 100...
[2024-09-22 06:14:26,399][04746] Num frames 200...
[2024-09-22 06:14:26,561][04746] Num frames 300...
[2024-09-22 06:14:26,716][04746] Num frames 400...
[2024-09-22 06:14:26,878][04746] Num frames 500...
[2024-09-22 06:14:27,036][04746] Num frames 600...
[2024-09-22 06:14:27,212][04746] Num frames 700...
[2024-09-22 06:14:27,367][04746] Num frames 800...
[2024-09-22 06:14:27,531][04746] Num frames 900...
[2024-09-22 06:14:27,697][04746] Num frames 1000...
[2024-09-22 06:14:27,867][04746] Num frames 1100...
[2024-09-22 06:14:28,071][04746] Avg episode rewards: #0: 28.950, true rewards: #0: 11.950
[2024-09-22 06:14:28,073][04746] Avg episode reward: 28.950, avg true_objective: 11.950
[2024-09-22 06:14:28,084][04746] Num frames 1200...
[2024-09-22 06:14:28,257][04746] Num frames 1300...
[2024-09-22 06:14:28,436][04746] Num frames 1400...
[2024-09-22 06:14:28,602][04746] Num frames 1500...
[2024-09-22 06:14:28,767][04746] Num frames 1600...
[2024-09-22 06:14:28,901][04746] Num frames 1700...
[2024-09-22 06:14:29,041][04746] Num frames 1800...
[2024-09-22 06:14:29,148][04746] Avg episode rewards: #0: 19.645, true rewards: #0: 9.145
[2024-09-22 06:14:29,150][04746] Avg episode reward: 19.645, avg true_objective: 9.145
[2024-09-22 06:14:29,250][04746] Num frames 1900...
[2024-09-22 06:14:29,382][04746] Num frames 2000...
[2024-09-22 06:14:29,514][04746] Num frames 2100...
[2024-09-22 06:14:29,648][04746] Num frames 2200...
[2024-09-22 06:14:29,785][04746] Num frames 2300...
[2024-09-22 06:14:29,923][04746] Num frames 2400...
[2024-09-22 06:14:30,060][04746] Num frames 2500...
[2024-09-22 06:14:30,190][04746] Num frames 2600...
[2024-09-22 06:14:30,352][04746] Num frames 2700...
[2024-09-22 06:14:30,483][04746] Num frames 2800...
[2024-09-22 06:14:30,614][04746] Num frames 2900...
[2024-09-22 06:14:30,745][04746] Num frames 3000...
[2024-09-22 06:14:30,875][04746] Num frames 3100...
[2024-09-22 06:14:30,945][04746] Avg episode rewards: #0: 22.364, true rewards: #0: 10.363
[2024-09-22 06:14:30,947][04746] Avg episode reward: 22.364, avg true_objective: 10.363
[2024-09-22 06:14:31,071][04746] Num frames 3200...
[2024-09-22 06:14:31,216][04746] Num frames 3300...
[2024-09-22 06:14:31,359][04746] Num frames 3400...
[2024-09-22 06:14:31,489][04746] Num frames 3500...
[2024-09-22 06:14:31,623][04746] Num frames 3600...
[2024-09-22 06:14:31,781][04746] Num frames 3700...
[2024-09-22 06:14:31,942][04746] Num frames 3800...
[2024-09-22 06:14:32,079][04746] Num frames 3900...
[2024-09-22 06:14:32,210][04746] Num frames 4000...
[2024-09-22 06:14:32,339][04746] Num frames 4100...
[2024-09-22 06:14:32,471][04746] Num frames 4200...
[2024-09-22 06:14:32,602][04746] Num frames 4300...
[2024-09-22 06:14:32,730][04746] Num frames 4400...
[2024-09-22 06:14:32,861][04746] Num frames 4500...
[2024-09-22 06:14:32,996][04746] Num frames 4600...
[2024-09-22 06:14:33,126][04746] Num frames 4700...
[2024-09-22 06:14:33,262][04746] Num frames 4800...
[2024-09-22 06:14:33,444][04746] Num frames 4900...
[2024-09-22 06:14:33,599][04746] Num frames 5000...
[2024-09-22 06:14:33,729][04746] Num frames 5100...
[2024-09-22 06:14:33,879][04746] Avg episode rewards: #0: 29.935, true rewards: #0: 12.935
[2024-09-22 06:14:33,881][04746] Avg episode reward: 29.935, avg true_objective: 12.935
[2024-09-22 06:14:33,922][04746] Num frames 5200...
[2024-09-22 06:14:34,053][04746] Num frames 5300...
[2024-09-22 06:14:34,184][04746] Num frames 5400...
[2024-09-22 06:14:34,314][04746] Num frames 5500...
[2024-09-22 06:14:34,446][04746] Num frames 5600...
[2024-09-22 06:14:34,576][04746] Num frames 5700...
[2024-09-22 06:14:34,703][04746] Num frames 5800...
[2024-09-22 06:14:34,830][04746] Num frames 5900...
[2024-09-22 06:14:34,964][04746] Num frames 6000...
[2024-09-22 06:14:35,094][04746] Num frames 6100...
[2024-09-22 06:14:35,219][04746] Num frames 6200...
[2024-09-22 06:14:35,347][04746] Num frames 6300...
[2024-09-22 06:14:35,482][04746] Num frames 6400...
[2024-09-22 06:14:35,619][04746] Num frames 6500...
[2024-09-22 06:14:35,747][04746] Num frames 6600...
[2024-09-22 06:14:35,879][04746] Num frames 6700...
[2024-09-22 06:14:36,016][04746] Num frames 6800...
[2024-09-22 06:14:36,150][04746] Num frames 6900...
[2024-09-22 06:14:36,279][04746] Num frames 7000...
[2024-09-22 06:14:36,407][04746] Num frames 7100...
[2024-09-22 06:14:36,538][04746] Num frames 7200...
[2024-09-22 06:14:36,690][04746] Avg episode rewards: #0: 36.148, true rewards: #0: 14.548
[2024-09-22 06:14:36,693][04746] Avg episode reward: 36.148, avg true_objective: 14.548
[2024-09-22 06:14:36,728][04746] Num frames 7300...
[2024-09-22 06:14:36,853][04746] Num frames 7400...
[2024-09-22 06:14:36,991][04746] Num frames 7500...
[2024-09-22 06:14:37,126][04746] Num frames 7600...
[2024-09-22 06:14:37,254][04746] Num frames 7700...
[2024-09-22 06:14:37,381][04746] Num frames 7800...
[2024-09-22 06:14:37,508][04746] Num frames 7900...
[2024-09-22 06:14:37,635][04746] Avg episode rewards: #0: 32.596, true rewards: #0: 13.263
[2024-09-22 06:14:37,637][04746] Avg episode reward: 32.596, avg true_objective: 13.263
[2024-09-22 06:14:37,692][04746] Num frames 8000...
[2024-09-22 06:14:37,818][04746] Num frames 8100...
[2024-09-22 06:14:37,945][04746] Num frames 8200...
[2024-09-22 06:14:38,070][04746] Num frames 8300...
[2024-09-22 06:14:38,200][04746] Num frames 8400...
[2024-09-22 06:14:38,326][04746] Num frames 8500...
[2024-09-22 06:14:38,458][04746] Num frames 8600...
[2024-09-22 06:14:38,590][04746] Num frames 8700...
[2024-09-22 06:14:38,725][04746] Num frames 8800...
[2024-09-22 06:14:38,854][04746] Num frames 8900...
[2024-09-22 06:14:38,992][04746] Num frames 9000...
[2024-09-22 06:14:39,124][04746] Num frames 9100...
[2024-09-22 06:14:39,263][04746] Num frames 9200...
[2024-09-22 06:14:39,399][04746] Num frames 9300...
[2024-09-22 06:17:24,401][11734] Saving configuration to /content/train_dir/default_experiment/config.json...
[2024-09-22 06:17:24,404][11734] Rollout worker 0 uses device cpu
[2024-09-22 06:17:24,406][11734] Rollout worker 1 uses device cpu
[2024-09-22 06:17:24,407][11734] Rollout worker 2 uses device cpu
[2024-09-22 06:17:24,409][11734] Rollout worker 3 uses device cpu
[2024-09-22 06:17:24,410][11734] Rollout worker 4 uses device cpu
[2024-09-22 06:17:24,411][11734] Rollout worker 5 uses device cpu
[2024-09-22 06:17:24,413][11734] Rollout worker 6 uses device cpu
[2024-09-22 06:17:24,414][11734] Rollout worker 7 uses device cpu
[2024-09-22 06:17:24,484][11734] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 06:17:24,486][11734] InferenceWorker_p0-w0: min num requests: 2
[2024-09-22 06:17:24,526][11734] Starting all processes...
[2024-09-22 06:17:24,528][11734] Starting process learner_proc0
[2024-09-22 06:17:24,938][11734] Starting all processes...
[2024-09-22 06:17:24,947][11734] Starting process inference_proc0-0
[2024-09-22 06:17:24,947][11734] Starting process rollout_proc0
[2024-09-22 06:17:24,947][11734] Starting process rollout_proc1
[2024-09-22 06:17:24,949][11734] Starting process rollout_proc2
[2024-09-22 06:17:24,950][11734] Starting process rollout_proc3
[2024-09-22 06:17:24,955][11734] Starting process rollout_proc4
[2024-09-22 06:17:24,970][11734] Starting process rollout_proc5
[2024-09-22 06:17:24,973][11734] Starting process rollout_proc6
[2024-09-22 06:17:24,982][11734] Starting process rollout_proc7
[2024-09-22 06:17:29,220][12533] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 06:17:29,221][12533] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2024-09-22 06:17:29,251][12533] Num visible devices: 1
[2024-09-22 06:17:29,320][12536] Worker 2 uses CPU cores [2]
[2024-09-22 06:17:29,320][12520] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 06:17:29,321][12520] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2024-09-22 06:17:29,341][12520] Num visible devices: 1
[2024-09-22 06:17:29,352][12537] Worker 3 uses CPU cores [3]
[2024-09-22 06:17:29,391][12520] Starting seed is not provided
[2024-09-22 06:17:29,392][12520] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 06:17:29,393][12520] Initializing actor-critic model on device cuda:0
[2024-09-22 06:17:29,395][12520] RunningMeanStd input shape: (3, 72, 128)
[2024-09-22 06:17:29,397][12520] RunningMeanStd input shape: (1,)
[2024-09-22 06:17:29,421][12520] ConvEncoder: input_channels=3
[2024-09-22 06:17:29,426][12540] Worker 7 uses CPU cores [7]
[2024-09-22 06:17:29,502][12538] Worker 4 uses CPU cores [4]
[2024-09-22 06:17:29,503][12535] Worker 1 uses CPU cores [1]
[2024-09-22 06:17:29,515][12534] Worker 0 uses CPU cores [0]
[2024-09-22 06:17:29,589][12539] Worker 6 uses CPU cores [6]
[2024-09-22 06:17:29,592][12520] Conv encoder output size: 512
[2024-09-22 06:17:29,593][12520] Policy head output size: 512
[2024-09-22 06:17:29,609][12520] Created Actor Critic model with architecture:
[2024-09-22 06:17:29,609][12520] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2024-09-22 06:17:29,872][12520] Using optimizer <class 'torch.optim.adam.Adam'>
[2024-09-22 06:17:29,892][12541] Worker 5 uses CPU cores [5]
[2024-09-22 06:17:30,625][12520] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2024-09-22 06:17:30,665][12520] Loading model from checkpoint
[2024-09-22 06:17:30,667][12520] Loaded experiment state at self.train_step=2443, self.env_steps=10006528
[2024-09-22 06:17:30,667][12520] Initialized policy 0 weights for model version 2443
[2024-09-22 06:17:30,672][12520] LearnerWorker_p0 finished initialization!
[2024-09-22 06:17:30,672][12520] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-09-22 06:17:30,860][12533] RunningMeanStd input shape: (3, 72, 128)
[2024-09-22 06:17:30,861][12533] RunningMeanStd input shape: (1,)
[2024-09-22 06:17:30,875][12533] ConvEncoder: input_channels=3
[2024-09-22 06:17:31,004][12533] Conv encoder output size: 512
[2024-09-22 06:17:31,005][12533] Policy head output size: 512
[2024-09-22 06:17:31,064][11734] Inference worker 0-0 is ready!
[2024-09-22 06:17:31,066][11734] All inference workers are ready! Signal rollout workers to start!
[2024-09-22 06:17:31,122][12540] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:31,124][12534] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:31,124][12539] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:31,126][12538] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:31,126][12541] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:31,126][12535] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:31,134][12536] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:31,140][12537] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:31,458][12540] Decorrelating experience for 0 frames...
[2024-09-22 06:17:31,459][12539] Decorrelating experience for 0 frames...
[2024-09-22 06:17:31,459][12541] Decorrelating experience for 0 frames...
[2024-09-22 06:17:31,619][12538] Decorrelating experience for 0 frames...
[2024-09-22 06:17:31,621][12534] Decorrelating experience for 0 frames...
[2024-09-22 06:17:31,724][12539] Decorrelating experience for 32 frames...
[2024-09-22 06:17:31,870][12541] Decorrelating experience for 32 frames...
[2024-09-22 06:17:31,914][12535] Decorrelating experience for 0 frames...
[2024-09-22 06:17:32,052][12534] Decorrelating experience for 32 frames...
[2024-09-22 06:17:32,052][12538] Decorrelating experience for 32 frames...
[2024-09-22 06:17:32,149][12539] Decorrelating experience for 64 frames...
[2024-09-22 06:17:32,266][12535] Decorrelating experience for 32 frames...
[2024-09-22 06:17:32,310][12536] Decorrelating experience for 0 frames...
[2024-09-22 06:17:32,349][12540] Decorrelating experience for 32 frames...
[2024-09-22 06:17:32,511][12541] Decorrelating experience for 64 frames...
[2024-09-22 06:17:32,621][12538] Decorrelating experience for 64 frames...
[2024-09-22 06:17:32,622][12534] Decorrelating experience for 64 frames...
[2024-09-22 06:17:32,703][12536] Decorrelating experience for 32 frames...
[2024-09-22 06:17:32,767][12537] Decorrelating experience for 0 frames...
[2024-09-22 06:17:32,785][12539] Decorrelating experience for 96 frames...
[2024-09-22 06:17:32,810][12535] Decorrelating experience for 64 frames...
[2024-09-22 06:17:32,862][12540] Decorrelating experience for 64 frames...
[2024-09-22 06:17:32,943][12541] Decorrelating experience for 96 frames...
[2024-09-22 06:17:33,088][12534] Decorrelating experience for 96 frames...
[2024-09-22 06:17:33,157][12538] Decorrelating experience for 96 frames...
[2024-09-22 06:17:33,253][12537] Decorrelating experience for 32 frames...
[2024-09-22 06:17:33,257][12535] Decorrelating experience for 96 frames...
[2024-09-22 06:17:33,298][12536] Decorrelating experience for 64 frames...
[2024-09-22 06:17:33,368][12540] Decorrelating experience for 96 frames...
[2024-09-22 06:17:33,615][12536] Decorrelating experience for 96 frames...
[2024-09-22 06:17:33,697][12537] Decorrelating experience for 64 frames...
[2024-09-22 06:17:34,075][12537] Decorrelating experience for 96 frames...
[2024-09-22 06:17:34,420][11734] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 10006528. Throughput: 0: nan. Samples: 772. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-09-22 06:17:34,422][11734] Avg episode reward: [(0, '1.866')]
[2024-09-22 06:17:34,924][12520] Signal inference workers to stop experience collection...
[2024-09-22 06:17:34,935][12533] InferenceWorker_p0-w0: stopping experience collection
[2024-09-22 06:17:37,581][12520] Signal inference workers to resume experience collection...
[2024-09-22 06:17:37,582][12520] Stopping Batcher_0...
[2024-09-22 06:17:37,582][12520] Loop batcher_evt_loop terminating...
[2024-09-22 06:17:37,593][11734] Component Batcher_0 stopped!
[2024-09-22 06:17:37,604][12533] Weights refcount: 2 0
[2024-09-22 06:17:37,611][12533] Stopping InferenceWorker_p0-w0...
[2024-09-22 06:17:37,610][11734] Component InferenceWorker_p0-w0 stopped!
[2024-09-22 06:17:37,613][12533] Loop inference_proc0-0_evt_loop terminating...
[2024-09-22 06:17:37,629][12534] Stopping RolloutWorker_w0...
[2024-09-22 06:17:37,630][12534] Loop rollout_proc0_evt_loop terminating...
[2024-09-22 06:17:37,630][11734] Component RolloutWorker_w0 stopped!
[2024-09-22 06:17:37,632][12540] Stopping RolloutWorker_w7...
[2024-09-22 06:17:37,634][12540] Loop rollout_proc7_evt_loop terminating...
[2024-09-22 06:17:37,634][12536] Stopping RolloutWorker_w2...
[2024-09-22 06:17:37,633][11734] Component RolloutWorker_w7 stopped!
[2024-09-22 06:17:37,635][12536] Loop rollout_proc2_evt_loop terminating...
[2024-09-22 06:17:37,635][11734] Component RolloutWorker_w2 stopped!
[2024-09-22 06:17:37,639][12539] Stopping RolloutWorker_w6...
[2024-09-22 06:17:37,639][12539] Loop rollout_proc6_evt_loop terminating...
[2024-09-22 06:17:37,641][11734] Component RolloutWorker_w6 stopped!
[2024-09-22 06:17:37,642][12537] Stopping RolloutWorker_w3...
[2024-09-22 06:17:37,644][12537] Loop rollout_proc3_evt_loop terminating...
[2024-09-22 06:17:37,645][11734] Component RolloutWorker_w3 stopped!
[2024-09-22 06:17:37,670][12541] Stopping RolloutWorker_w5...
[2024-09-22 06:17:37,670][11734] Component RolloutWorker_w5 stopped!
[2024-09-22 06:17:37,672][12541] Loop rollout_proc5_evt_loop terminating...
[2024-09-22 06:17:37,725][12535] Stopping RolloutWorker_w1...
[2024-09-22 06:17:37,726][12535] Loop rollout_proc1_evt_loop terminating...
[2024-09-22 06:17:37,725][11734] Component RolloutWorker_w1 stopped!
[2024-09-22 06:17:37,848][11734] Component RolloutWorker_w4 stopped!
[2024-09-22 06:17:37,847][12538] Stopping RolloutWorker_w4...
[2024-09-22 06:17:37,853][12538] Loop rollout_proc4_evt_loop terminating...
[2024-09-22 06:17:38,311][12520] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002445_10014720.pth...
[2024-09-22 06:17:38,415][12520] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002311_9465856.pth
[2024-09-22 06:17:38,430][12520] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002445_10014720.pth...
[2024-09-22 06:17:38,536][12520] Stopping LearnerWorker_p0...
[2024-09-22 06:17:38,537][12520] Loop learner_proc0_evt_loop terminating...
[2024-09-22 06:17:38,536][11734] Component LearnerWorker_p0 stopped!
[2024-09-22 06:17:38,540][11734] Waiting for process learner_proc0 to stop...
[2024-09-22 06:17:39,375][11734] Waiting for process inference_proc0-0 to join...
[2024-09-22 06:17:39,377][11734] Waiting for process rollout_proc0 to join...
[2024-09-22 06:17:39,380][11734] Waiting for process rollout_proc1 to join...
[2024-09-22 06:17:39,382][11734] Waiting for process rollout_proc2 to join...
[2024-09-22 06:17:39,384][11734] Waiting for process rollout_proc3 to join...
[2024-09-22 06:17:39,387][11734] Waiting for process rollout_proc4 to join...
[2024-09-22 06:17:39,389][11734] Waiting for process rollout_proc5 to join...
[2024-09-22 06:17:39,391][11734] Waiting for process rollout_proc6 to join...
[2024-09-22 06:17:39,394][11734] Waiting for process rollout_proc7 to join...
[2024-09-22 06:17:39,396][11734] Batcher 0 profile tree view:
batching: 0.0272, releasing_batches: 0.0006
[2024-09-22 06:17:39,399][11734] InferenceWorker_p0-w0 profile tree view:
update_model: 0.0150
wait_policy: 0.0001
  wait_policy_total: 1.7781
one_step: 0.0072
  handle_policy_step: 1.9771
    deserialize: 0.0586, stack: 0.0103, obs_to_device_normalize: 0.3707, forward: 1.2566, send_messages: 0.1221
    prepare_outputs: 0.1041
      to_cpu: 0.0496
[2024-09-22 06:17:39,402][11734] Learner 0 profile tree view:
misc: 0.0000, prepare_batch: 1.5120
train: 2.3528
  epoch_init: 0.0000, minibatch_init: 0.0000, losses_postprocess: 0.0005, kl_divergence: 0.0125, after_optimizer: 0.0419
  calculate_losses: 0.9267
    losses_init: 0.0000, forward_head: 0.3230, bptt_initial: 0.5264, tail: 0.0353, advantages_returns: 0.0010, losses: 0.0367
    bptt: 0.0038
      bptt_forward_core: 0.0036
  update: 1.3702
    clip: 0.0448
[2024-09-22 06:17:39,405][11734] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.0008, enqueue_policy_requests: 0.0398, env_step: 0.4080, overhead: 0.0266, complete_rollouts: 0.0008
save_policy_outputs: 0.0351
  split_output_tensors: 0.0139
[2024-09-22 06:17:39,406][11734] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.0010, enqueue_policy_requests: 0.0449, env_step: 0.4532, overhead: 0.0323, complete_rollouts: 0.0014
save_policy_outputs: 0.0410
  split_output_tensors: 0.0159
[2024-09-22 06:17:39,409][11734] Loop Runner_EvtLoop terminating...
[2024-09-22 06:17:39,410][11734] Runner profile tree view:
main_loop: 14.8845
[2024-09-22 06:17:39,412][11734] Collected {0: 10014720}, FPS: 550.4
[2024-09-22 06:17:39,433][11734] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-09-22 06:17:39,434][11734] Overriding arg 'num_workers' with value 1 passed from command line
[2024-09-22 06:17:39,435][11734] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-09-22 06:17:39,438][11734] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-09-22 06:17:39,439][11734] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-09-22 06:17:39,440][11734] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-09-22 06:17:39,442][11734] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2024-09-22 06:17:39,443][11734] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-09-22 06:17:39,444][11734] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2024-09-22 06:17:39,446][11734] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2024-09-22 06:17:39,447][11734] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-09-22 06:17:39,449][11734] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-09-22 06:17:39,450][11734] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-09-22 06:17:39,452][11734] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-09-22 06:17:39,454][11734] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-09-22 06:17:39,486][11734] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-09-22 06:17:39,490][11734] RunningMeanStd input shape: (3, 72, 128)
[2024-09-22 06:17:39,493][11734] RunningMeanStd input shape: (1,)
[2024-09-22 06:17:39,510][11734] ConvEncoder: input_channels=3
[2024-09-22 06:17:39,647][11734] Conv encoder output size: 512
[2024-09-22 06:17:39,649][11734] Policy head output size: 512
[2024-09-22 06:17:39,942][11734] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002445_10014720.pth...
[2024-09-22 06:17:40,875][11734] Num frames 100...
[2024-09-22 06:17:41,034][11734] Num frames 200...
[2024-09-22 06:17:41,208][11734] Num frames 300...
[2024-09-22 06:17:41,367][11734] Num frames 400...
[2024-09-22 06:17:41,530][11734] Num frames 500...
[2024-09-22 06:17:41,691][11734] Num frames 600...
[2024-09-22 06:17:41,865][11734] Num frames 700...
[2024-09-22 06:17:42,034][11734] Num frames 800...
[2024-09-22 06:17:42,189][11734] Num frames 900...
[2024-09-22 06:17:42,365][11734] Num frames 1000...
[2024-09-22 06:17:42,537][11734] Num frames 1100...
[2024-09-22 06:17:42,708][11734] Num frames 1200...
[2024-09-22 06:17:42,874][11734] Num frames 1300...
[2024-09-22 06:17:43,048][11734] Num frames 1400...
[2024-09-22 06:17:43,217][11734] Num frames 1500...
[2024-09-22 06:17:43,376][11734] Num frames 1600...
[2024-09-22 06:17:43,525][11734] Num frames 1700...
[2024-09-22 06:17:43,702][11734] Num frames 1800...
[2024-09-22 06:17:43,842][11734] Num frames 1900...
[2024-09-22 06:17:43,987][11734] Num frames 2000...
[2024-09-22 06:17:44,100][11734] Avg episode rewards: #0: 54.329, true rewards: #0: 20.330
[2024-09-22 06:17:44,102][11734] Avg episode reward: 54.329, avg true_objective: 20.330
[2024-09-22 06:17:44,216][11734] Num frames 2100...
[2024-09-22 06:17:44,378][11734] Num frames 2200...
[2024-09-22 06:17:44,535][11734] Num frames 2300...
[2024-09-22 06:17:44,683][11734] Num frames 2400...
[2024-09-22 06:17:44,870][11734] Avg episode rewards: #0: 29.904, true rewards: #0: 12.405
[2024-09-22 06:17:44,872][11734] Avg episode reward: 29.904, avg true_objective: 12.405
[2024-09-22 06:17:44,909][11734] Num frames 2500...
[2024-09-22 06:17:45,059][11734] Num frames 2600...
[2024-09-22 06:17:45,215][11734] Num frames 2700...
[2024-09-22 06:17:45,387][11734] Num frames 2800...
[2024-09-22 06:17:45,547][11734] Num frames 2900...
[2024-09-22 06:17:45,693][11734] Num frames 3000...
[2024-09-22 06:17:45,849][11734] Num frames 3100...
[2024-09-22 06:17:46,021][11734] Num frames 3200...
[2024-09-22 06:17:46,169][11734] Num frames 3300...
[2024-09-22 06:17:46,317][11734] Num frames 3400...
[2024-09-22 06:17:46,489][11734] Num frames 3500...
[2024-09-22 06:17:46,640][11734] Num frames 3600...
[2024-09-22 06:17:46,776][11734] Num frames 3700...
[2024-09-22 06:17:46,959][11734] Num frames 3800...
[2024-09-22 06:17:47,123][11734] Num frames 3900...
[2024-09-22 06:17:47,298][11734] Num frames 4000...
[2024-09-22 06:17:47,471][11734] Num frames 4100...
[2024-09-22 06:17:47,656][11734] Num frames 4200...
[2024-09-22 06:17:47,811][11734] Num frames 4300...
[2024-09-22 06:17:47,973][11734] Num frames 4400...
[2024-09-22 06:17:48,146][11734] Num frames 4500...
[2024-09-22 06:17:48,361][11734] Avg episode rewards: #0: 40.936, true rewards: #0: 15.270
[2024-09-22 06:17:48,364][11734] Avg episode reward: 40.936, avg true_objective: 15.270
[2024-09-22 06:17:48,400][11734] Num frames 4600...
[2024-09-22 06:17:48,575][11734] Num frames 4700...
[2024-09-22 06:17:48,748][11734] Num frames 4800...
[2024-09-22 06:17:48,937][11734] Num frames 4900...
[2024-09-22 06:17:49,090][11734] Num frames 5000...
[2024-09-22 06:17:49,238][11734] Num frames 5100...
[2024-09-22 06:17:49,398][11734] Num frames 5200...
[2024-09-22 06:17:49,568][11734] Num frames 5300...
[2024-09-22 06:17:49,722][11734] Num frames 5400...
[2024-09-22 06:17:49,869][11734] Num frames 5500...
[2024-09-22 06:17:50,043][11734] Num frames 5600...
[2024-09-22 06:17:50,213][11734] Avg episode rewards: #0: 36.672, true rewards: #0: 14.173
[2024-09-22 06:17:50,215][11734] Avg episode reward: 36.672, avg true_objective: 14.173
[2024-09-22 06:17:50,265][11734] Num frames 5700...
[2024-09-22 06:17:50,420][11734] Num frames 5800...
[2024-09-22 06:17:50,573][11734] Num frames 5900...
[2024-09-22 06:17:50,739][11734] Num frames 6000...
[2024-09-22 06:17:50,899][11734] Num frames 6100...
[2024-09-22 06:17:51,053][11734] Num frames 6200...
[2024-09-22 06:17:51,205][11734] Num frames 6300...
[2024-09-22 06:17:51,381][11734] Num frames 6400...
[2024-09-22 06:17:51,535][11734] Num frames 6500...
[2024-09-22 06:17:51,684][11734] Num frames 6600...
[2024-09-22 06:17:51,860][11734] Num frames 6700...
[2024-09-22 06:17:52,030][11734] Num frames 6800...
[2024-09-22 06:17:52,189][11734] Num frames 6900...
[2024-09-22 06:17:52,342][11734] Num frames 7000...
[2024-09-22 06:17:52,514][11734] Num frames 7100...
[2024-09-22 06:17:52,657][11734] Num frames 7200...
[2024-09-22 06:17:52,722][11734] Avg episode rewards: #0: 37.209, true rewards: #0: 14.410
[2024-09-22 06:17:52,724][11734] Avg episode reward: 37.209, avg true_objective: 14.410
[2024-09-22 06:17:52,867][11734] Num frames 7300...
[2024-09-22 06:17:53,040][11734] Num frames 7400...
[2024-09-22 06:17:53,184][11734] Num frames 7500...
[2024-09-22 06:17:53,357][11734] Num frames 7600...
[2024-09-22 06:17:53,515][11734] Num frames 7700...
[2024-09-22 06:17:53,682][11734] Num frames 7800...
[2024-09-22 06:17:53,839][11734] Num frames 7900...
[2024-09-22 06:17:53,941][11734] Avg episode rewards: #0: 34.208, true rewards: #0: 13.208
[2024-09-22 06:17:53,943][11734] Avg episode reward: 34.208, avg true_objective: 13.208
[2024-09-22 06:17:54,063][11734] Num frames 8000...
[2024-09-22 06:17:54,241][11734] Num frames 8100...
[2024-09-22 06:17:54,386][11734] Num frames 8200...
[2024-09-22 06:17:54,554][11734] Num frames 8300...
[2024-09-22 06:17:54,725][11734] Num frames 8400...
[2024-09-22 06:17:54,892][11734] Num frames 8500...
[2024-09-22 06:17:55,045][11734] Num frames 8600...
[2024-09-22 06:17:55,208][11734] Num frames 8700...
[2024-09-22 06:17:55,387][11734] Num frames 8800...
[2024-09-22 06:17:55,541][11734] Num frames 8900...
[2024-09-22 06:17:55,707][11734] Num frames 9000...
[2024-09-22 06:17:55,875][11734] Num frames 9100...
[2024-09-22 06:17:56,028][11734] Num frames 9200...
[2024-09-22 06:17:56,185][11734] Num frames 9300...
[2024-09-22 06:17:56,367][11734] Num frames 9400...
[2024-09-22 06:17:56,535][11734] Num frames 9500...
[2024-09-22 06:17:56,692][11734] Num frames 9600...
[2024-09-22 06:17:56,844][11734] Num frames 9700...
[2024-09-22 06:17:57,016][11734] Num frames 9800...
[2024-09-22 06:17:57,179][11734] Num frames 9900...
[2024-09-22 06:17:57,344][11734] Num frames 10000...
[2024-09-22 06:17:57,446][11734] Avg episode rewards: #0: 37.607, true rewards: #0: 14.321
|
[2024-09-22 06:17:57,448][11734] Avg episode reward: 37.607, avg true_objective: 14.321 |
|
[2024-09-22 06:17:57,563][11734] Num frames 10100... |
|
[2024-09-22 06:17:57,749][11734] Num frames 10200... |
|
[2024-09-22 06:17:57,907][11734] Num frames 10300... |
|
[2024-09-22 06:17:58,065][11734] Num frames 10400... |
|
[2024-09-22 06:17:58,228][11734] Num frames 10500... |
|
[2024-09-22 06:17:58,400][11734] Num frames 10600... |
|
[2024-09-22 06:17:58,559][11734] Num frames 10700... |
|
[2024-09-22 06:17:58,731][11734] Num frames 10800... |
|
[2024-09-22 06:17:58,890][11734] Num frames 10900... |
|
[2024-09-22 06:17:59,071][11734] Num frames 11000... |
|
[2024-09-22 06:17:59,239][11734] Num frames 11100... |
|
[2024-09-22 06:17:59,394][11734] Num frames 11200... |
|
[2024-09-22 06:17:59,587][11734] Num frames 11300... |
|
[2024-09-22 06:17:59,757][11734] Num frames 11400... |
|
[2024-09-22 06:17:59,924][11734] Num frames 11500... |
|
[2024-09-22 06:18:00,108][11734] Num frames 11600... |
|
[2024-09-22 06:18:00,295][11734] Num frames 11700... |
|
[2024-09-22 06:18:00,474][11734] Num frames 11800... |
|
[2024-09-22 06:18:00,643][11734] Num frames 11900... |
|
[2024-09-22 06:18:00,829][11734] Num frames 12000... |
|
[2024-09-22 06:18:00,980][11734] Num frames 12100... |
|
[2024-09-22 06:18:01,084][11734] Avg episode rewards: #0: 40.406, true rewards: #0: 15.156 |
|
[2024-09-22 06:18:01,086][11734] Avg episode reward: 40.406, avg true_objective: 15.156 |
|
[2024-09-22 06:18:01,199][11734] Num frames 12200... |
|
[2024-09-22 06:18:01,366][11734] Num frames 12300... |
|
[2024-09-22 06:18:01,519][11734] Num frames 12400... |
|
[2024-09-22 06:18:01,673][11734] Num frames 12500... |
|
[2024-09-22 06:18:01,836][11734] Num frames 12600... |
|
[2024-09-22 06:18:02,021][11734] Num frames 12700... |
|
[2024-09-22 06:18:02,176][11734] Num frames 12800... |
|
[2024-09-22 06:18:02,321][11734] Num frames 12900... |
|
[2024-09-22 06:18:02,447][11734] Avg episode rewards: #0: 37.938, true rewards: #0: 14.383 |
|
[2024-09-22 06:18:02,449][11734] Avg episode reward: 37.938, avg true_objective: 14.383 |
|
[2024-09-22 06:18:02,543][11734] Num frames 13000... |
|
[2024-09-22 06:18:02,691][11734] Num frames 13100... |
|
[2024-09-22 06:18:02,860][11734] Num frames 13200... |
|
[2024-09-22 06:18:03,019][11734] Num frames 13300... |
|
[2024-09-22 06:18:03,185][11734] Avg episode rewards: #0: 34.760, true rewards: #0: 13.361 |
|
[2024-09-22 06:18:03,187][11734] Avg episode reward: 34.760, avg true_objective: 13.361 |
|
[2024-09-22 06:18:40,287][11734] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
|
[2024-09-22 06:18:40,316][11734] Environment doom_basic already registered, overwriting... |
|
[2024-09-22 06:18:40,319][11734] Environment doom_two_colors_easy already registered, overwriting... |
|
[2024-09-22 06:18:40,320][11734] Environment doom_two_colors_hard already registered, overwriting... |
|
[2024-09-22 06:18:40,321][11734] Environment doom_dm already registered, overwriting... |
|
[2024-09-22 06:18:40,322][11734] Environment doom_dwango5 already registered, overwriting... |
|
[2024-09-22 06:18:40,325][11734] Environment doom_my_way_home_flat_actions already registered, overwriting... |
|
[2024-09-22 06:18:40,326][11734] Environment doom_defend_the_center_flat_actions already registered, overwriting... |
|
[2024-09-22 06:18:40,327][11734] Environment doom_my_way_home already registered, overwriting... |
|
[2024-09-22 06:18:40,328][11734] Environment doom_deadly_corridor already registered, overwriting... |
|
[2024-09-22 06:18:40,331][11734] Environment doom_defend_the_center already registered, overwriting... |
|
[2024-09-22 06:18:40,332][11734] Environment doom_defend_the_line already registered, overwriting... |
|
[2024-09-22 06:18:40,334][11734] Environment doom_health_gathering already registered, overwriting... |
|
[2024-09-22 06:18:40,336][11734] Environment doom_health_gathering_supreme already registered, overwriting... |
|
[2024-09-22 06:18:40,337][11734] Environment doom_battle already registered, overwriting... |
|
[2024-09-22 06:18:40,338][11734] Environment doom_battle2 already registered, overwriting... |
|
[2024-09-22 06:18:40,339][11734] Environment doom_duel_bots already registered, overwriting... |
|
[2024-09-22 06:18:40,342][11734] Environment doom_deathmatch_bots already registered, overwriting... |
|
[2024-09-22 06:18:40,343][11734] Environment doom_duel already registered, overwriting... |
|
[2024-09-22 06:18:40,345][11734] Environment doom_deathmatch_full already registered, overwriting... |
|
[2024-09-22 06:18:40,346][11734] Environment doom_benchmark already registered, overwriting... |
|
[2024-09-22 06:18:40,348][11734] register_encoder_factory: <function make_vizdoom_encoder at 0x7b467399b910> |
|
[2024-09-22 06:18:40,360][11734] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2024-09-22 06:18:40,366][11734] Experiment dir /content/train_dir/default_experiment already exists! |
|
[2024-09-22 06:18:40,367][11734] Resuming existing experiment from /content/train_dir/default_experiment... |
|
[2024-09-22 06:18:40,369][11734] Weights and Biases integration disabled |
|
[2024-09-22 06:18:40,372][11734] Environment var CUDA_VISIBLE_DEVICES is 0 |

[2024-09-22 06:18:42,971][11734] Starting experiment with the following configuration: |
|
help=False |
|
algo=APPO |
|
env=doom_health_gathering_supreme |
|
experiment=default_experiment |
|
train_dir=/content/train_dir |
|
restart_behavior=resume |
|
device=gpu |
|
seed=None |
|
num_policies=1 |
|
async_rl=True |
|
serial_mode=False |
|
batched_sampling=False |
|
num_batches_to_accumulate=2 |
|
worker_num_splits=2 |
|
policy_workers_per_policy=1 |
|
max_policy_lag=1000 |
|
num_workers=8 |
|
num_envs_per_worker=4 |
|
batch_size=1024 |
|
num_batches_per_epoch=1 |
|
num_epochs=1 |
|
rollout=32 |
|
recurrence=32 |
|
shuffle_minibatches=False |
|
gamma=0.99 |
|
reward_scale=1.0 |
|
reward_clip=1000.0 |
|
value_bootstrap=False |
|
normalize_returns=True |
|
exploration_loss_coeff=0.001 |
|
value_loss_coeff=0.5 |
|
kl_loss_coeff=0.0 |
|
exploration_loss=symmetric_kl |
|
gae_lambda=0.95 |
|
ppo_clip_ratio=0.1 |
|
ppo_clip_value=0.2 |
|
with_vtrace=False |
|
vtrace_rho=1.0 |
|
vtrace_c=1.0 |
|
optimizer=adam |
|
adam_eps=1e-06 |
|
adam_beta1=0.9 |
|
adam_beta2=0.999 |
|
max_grad_norm=4.0 |
|
learning_rate=0.0001 |
|
lr_schedule=constant |
|
lr_schedule_kl_threshold=0.008 |
|
lr_adaptive_min=1e-06 |
|
lr_adaptive_max=0.01 |
|
obs_subtract_mean=0.0 |
|
obs_scale=255.0 |
|
normalize_input=True |
|
normalize_input_keys=None |
|
decorrelate_experience_max_seconds=0 |
|
decorrelate_envs_on_one_worker=True |
|
actor_worker_gpus=[] |
|
set_workers_cpu_affinity=True |
|
force_envs_single_thread=False |
|
default_niceness=0 |
|
log_to_file=True |
|
experiment_summaries_interval=10 |
|
flush_summaries_interval=30 |
|
stats_avg=100 |
|
summaries_use_frameskip=True |
|
heartbeat_interval=20 |
|
heartbeat_reporting_interval=600 |
|
train_for_env_steps=10000000 |
|
train_for_seconds=10000000000 |
|
save_every_sec=120 |
|
keep_checkpoints=2 |
|
load_checkpoint_kind=latest |
|
save_milestones_sec=-1 |
|
save_best_every_sec=5 |
|
save_best_metric=reward |
|
save_best_after=100000 |
|
benchmark=False |
|
encoder_mlp_layers=[512, 512] |
|
encoder_conv_architecture=convnet_simple |
|
encoder_conv_mlp_layers=[512] |
|
use_rnn=True |
|
rnn_size=512 |
|
rnn_type=gru |
|
rnn_num_layers=1 |
|
decoder_mlp_layers=[] |
|
nonlinearity=elu |
|
policy_initialization=orthogonal |
|
policy_init_gain=1.0 |
|
actor_critic_share_weights=True |
|
adaptive_stddev=True |
|
continuous_tanh_scale=0.0 |
|
initial_stddev=1.0 |
|
use_env_info_cache=False |
|
env_gpu_actions=False |
|
env_gpu_observations=True |
|
env_frameskip=4 |
|
env_framestack=1 |
|
pixel_format=CHW |
|
use_record_episode_statistics=False |
|
with_wandb=False |
|
wandb_user=None |
|
wandb_project=sample_factory |
|
wandb_group=None |
|
wandb_job_type=SF |
|
wandb_tags=[] |
|
with_pbt=False |
|
pbt_mix_policies_in_one_env=True |
|
pbt_period_env_steps=5000000 |
|
pbt_start_mutation=20000000 |
|
pbt_replace_fraction=0.3 |
|
pbt_mutation_rate=0.15 |
|
pbt_replace_reward_gap=0.1 |
|
pbt_replace_reward_gap_absolute=1e-06 |
|
pbt_optimize_gamma=False |
|
pbt_target_objective=true_objective |
|
pbt_perturb_min=1.1 |
|
pbt_perturb_max=1.5 |
|
num_agents=-1 |
|
num_humans=0 |
|
num_bots=-1 |
|
start_bot_difficulty=None |
|
timelimit=None |
|
res_w=128 |
|
res_h=72 |
|
wide_aspect_ratio=False |
|
eval_env_frameskip=1 |
|
fps=35 |
|
command_line=--env=doom_health_gathering_supreme --num_workers=8 --num_envs_per_worker=4 --train_for_env_steps=10000000 |
|
cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 8, 'num_envs_per_worker': 4, 'train_for_env_steps': 10000000} |
|
git_hash=unknown |
|
git_repo_name=not a git repository |
|
[2024-09-22 06:18:42,973][11734] Saving configuration to /content/train_dir/default_experiment/config.json... |
|
[2024-09-22 06:18:42,976][11734] Rollout worker 0 uses device cpu |
|
[2024-09-22 06:18:42,977][11734] Rollout worker 1 uses device cpu |
|
[2024-09-22 06:18:42,978][11734] Rollout worker 2 uses device cpu |
|
[2024-09-22 06:18:42,981][11734] Rollout worker 3 uses device cpu |
|
[2024-09-22 06:18:42,982][11734] Rollout worker 4 uses device cpu |
|
[2024-09-22 06:18:42,983][11734] Rollout worker 5 uses device cpu |
|
[2024-09-22 06:18:42,986][11734] Rollout worker 6 uses device cpu |
|
[2024-09-22 06:18:42,987][11734] Rollout worker 7 uses device cpu |
|
[2024-09-22 06:18:43,028][11734] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-09-22 06:18:43,030][11734] InferenceWorker_p0-w0: min num requests: 2 |
|
[2024-09-22 06:18:43,068][11734] Starting all processes... |
|
[2024-09-22 06:18:43,070][11734] Starting process learner_proc0 |
|
[2024-09-22 06:18:43,118][11734] Starting all processes... |
|
[2024-09-22 06:18:43,123][11734] Starting process inference_proc0-0 |
|
[2024-09-22 06:18:43,125][11734] Starting process rollout_proc0 |
|
[2024-09-22 06:18:43,126][11734] Starting process rollout_proc1 |
|
[2024-09-22 06:18:43,129][11734] Starting process rollout_proc2 |
|
[2024-09-22 06:18:43,131][11734] Starting process rollout_proc3 |
|
[2024-09-22 06:18:43,133][11734] Starting process rollout_proc4 |
|
[2024-09-22 06:18:43,140][11734] Starting process rollout_proc5 |
|
[2024-09-22 06:18:43,141][11734] Starting process rollout_proc6 |
|
[2024-09-22 06:18:43,155][11734] Starting process rollout_proc7 |
|
[2024-09-22 06:18:47,451][13282] Worker 4 uses CPU cores [4] |
|
[2024-09-22 06:18:47,487][13280] Worker 2 uses CPU cores [2] |
|
[2024-09-22 06:18:47,594][13281] Worker 3 uses CPU cores [3] |
|
[2024-09-22 06:18:47,605][13284] Worker 6 uses CPU cores [6] |
|
[2024-09-22 06:18:47,701][13285] Worker 7 uses CPU cores [7] |
|
[2024-09-22 06:18:47,722][13283] Worker 5 uses CPU cores [5] |
|
[2024-09-22 06:18:47,769][13260] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-09-22 06:18:47,769][13260] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 |
|
[2024-09-22 06:18:47,788][13260] Num visible devices: 1 |
|
[2024-09-22 06:18:47,794][13279] Worker 1 uses CPU cores [1] |
|
[2024-09-22 06:18:47,809][13260] Starting seed is not provided |
|
[2024-09-22 06:18:47,809][13260] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-09-22 06:18:47,809][13260] Initializing actor-critic model on device cuda:0 |
|
[2024-09-22 06:18:47,810][13260] RunningMeanStd input shape: (3, 72, 128) |
|
[2024-09-22 06:18:47,811][13260] RunningMeanStd input shape: (1,) |
|
[2024-09-22 06:18:47,832][13260] ConvEncoder: input_channels=3 |
|
[2024-09-22 06:18:47,846][13274] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-09-22 06:18:47,846][13274] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 |
|
[2024-09-22 06:18:47,862][13274] Num visible devices: 1 |
|
[2024-09-22 06:18:47,918][13278] Worker 0 uses CPU cores [0] |
|
[2024-09-22 06:18:47,975][13260] Conv encoder output size: 512 |
|
[2024-09-22 06:18:47,975][13260] Policy head output size: 512 |
|
[2024-09-22 06:18:47,992][13260] Created Actor Critic model with architecture: |
|
[2024-09-22 06:18:47,992][13260] ActorCriticSharedWeights( |
|
(obs_normalizer): ObservationNormalizer( |
|
(running_mean_std): RunningMeanStdDictInPlace( |
|
(running_mean_std): ModuleDict( |
|
(obs): RunningMeanStdInPlace() |
|
) |
|
) |
|
) |
|
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) |
|
(encoder): VizdoomEncoder( |
|
(basic_encoder): ConvEncoder( |
|
(enc): RecursiveScriptModule( |
|
original_name=ConvEncoderImpl |
|
(conv_head): RecursiveScriptModule( |
|
original_name=Sequential |
|
(0): RecursiveScriptModule(original_name=Conv2d) |
|
(1): RecursiveScriptModule(original_name=ELU) |
|
(2): RecursiveScriptModule(original_name=Conv2d) |
|
(3): RecursiveScriptModule(original_name=ELU) |
|
(4): RecursiveScriptModule(original_name=Conv2d) |
|
(5): RecursiveScriptModule(original_name=ELU) |
|
) |
|
(mlp_layers): RecursiveScriptModule( |
|
original_name=Sequential |
|
(0): RecursiveScriptModule(original_name=Linear) |
|
(1): RecursiveScriptModule(original_name=ELU) |
|
) |
|
) |
|
) |
|
) |
|
(core): ModelCoreRNN( |
|
(core): GRU(512, 512) |
|
) |
|
(decoder): MlpDecoder( |
|
(mlp): Identity() |
|
) |
|
(critic_linear): Linear(in_features=512, out_features=1, bias=True) |
|
(action_parameterization): ActionParameterizationDefault( |
|
(distribution_linear): Linear(in_features=512, out_features=5, bias=True) |
|
) |
|
) |
|
[2024-09-22 06:18:48,219][13260] Using optimizer <class 'torch.optim.adam.Adam'> |
|
[2024-09-22 06:18:48,952][13260] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002445_10014720.pth... |
|
[2024-09-22 06:18:48,997][13260] Loading model from checkpoint |
|
[2024-09-22 06:18:48,999][13260] Loaded experiment state at self.train_step=2445, self.env_steps=10014720 |
|
[2024-09-22 06:18:49,000][13260] Initialized policy 0 weights for model version 2445 |
|
[2024-09-22 06:18:49,004][13260] LearnerWorker_p0 finished initialization! |
|
[2024-09-22 06:18:49,004][13260] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-09-22 06:18:49,191][13274] RunningMeanStd input shape: (3, 72, 128) |
|
[2024-09-22 06:18:49,193][13274] RunningMeanStd input shape: (1,) |
|
[2024-09-22 06:18:49,207][13274] ConvEncoder: input_channels=3 |
|
[2024-09-22 06:18:49,327][13274] Conv encoder output size: 512 |
|
[2024-09-22 06:18:49,327][13274] Policy head output size: 512 |
|
[2024-09-22 06:18:49,388][11734] Inference worker 0-0 is ready! |
|
[2024-09-22 06:18:49,390][11734] All inference workers are ready! Signal rollout workers to start! |
|
[2024-09-22 06:18:49,443][13278] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:18:49,446][13284] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:18:49,446][13279] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:18:49,449][13281] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:18:49,450][13283] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:18:49,451][13285] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:18:49,461][13280] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:18:49,508][13282] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-09-22 06:18:49,858][13281] Decorrelating experience for 0 frames... |
|
[2024-09-22 06:18:49,938][13284] Decorrelating experience for 0 frames... |
|
[2024-09-22 06:18:49,938][13278] Decorrelating experience for 0 frames... |
|
[2024-09-22 06:18:49,957][13279] Decorrelating experience for 0 frames... |
|
[2024-09-22 06:18:49,957][13283] Decorrelating experience for 0 frames... |
|
[2024-09-22 06:18:49,958][13280] Decorrelating experience for 0 frames... |
|
[2024-09-22 06:18:49,978][13282] Decorrelating experience for 0 frames... |
|
[2024-09-22 06:18:50,224][13278] Decorrelating experience for 32 frames... |
|
[2024-09-22 06:18:50,232][13283] Decorrelating experience for 32 frames... |
|
[2024-09-22 06:18:50,240][13280] Decorrelating experience for 32 frames... |
|
[2024-09-22 06:18:50,373][11734] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 10014720. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2024-09-22 06:18:50,377][13281] Decorrelating experience for 32 frames... |
|
[2024-09-22 06:18:50,407][13285] Decorrelating experience for 0 frames... |
|
[2024-09-22 06:18:50,544][13282] Decorrelating experience for 32 frames... |
|
[2024-09-22 06:18:50,583][13283] Decorrelating experience for 64 frames... |
|
[2024-09-22 06:18:50,706][13285] Decorrelating experience for 32 frames... |
|
[2024-09-22 06:18:50,833][13280] Decorrelating experience for 64 frames... |
|
[2024-09-22 06:18:50,853][13279] Decorrelating experience for 32 frames... |
|
[2024-09-22 06:18:50,868][13284] Decorrelating experience for 32 frames... |
|
[2024-09-22 06:18:51,025][13278] Decorrelating experience for 64 frames... |
|
[2024-09-22 06:18:51,063][13282] Decorrelating experience for 64 frames... |
|
[2024-09-22 06:18:51,222][13285] Decorrelating experience for 64 frames... |
|
[2024-09-22 06:18:51,262][13280] Decorrelating experience for 96 frames... |
|
[2024-09-22 06:18:51,307][13279] Decorrelating experience for 64 frames... |
|
[2024-09-22 06:18:51,314][13281] Decorrelating experience for 64 frames... |
|
[2024-09-22 06:18:51,366][13282] Decorrelating experience for 96 frames... |
|
[2024-09-22 06:18:51,460][13284] Decorrelating experience for 64 frames... |
|
[2024-09-22 06:18:51,469][13283] Decorrelating experience for 96 frames... |
|
[2024-09-22 06:18:51,703][13279] Decorrelating experience for 96 frames... |
|
[2024-09-22 06:18:51,756][13284] Decorrelating experience for 96 frames... |
|
[2024-09-22 06:18:51,759][13281] Decorrelating experience for 96 frames... |
|
[2024-09-22 06:18:51,876][13278] Decorrelating experience for 96 frames... |
|
[2024-09-22 06:18:51,905][13285] Decorrelating experience for 96 frames... |
|
[2024-09-22 06:18:53,021][13260] Signal inference workers to stop experience collection... |
|
[2024-09-22 06:18:53,028][13274] InferenceWorker_p0-w0: stopping experience collection |
|
[2024-09-22 06:18:55,372][11734] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 10014720. Throughput: 0: 84.8. Samples: 424. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2024-09-22 06:18:55,375][11734] Avg episode reward: [(0, '2.298')] |
|
[2024-09-22 06:18:55,699][13260] Signal inference workers to resume experience collection... |
|
[2024-09-22 06:18:55,700][13260] Stopping Batcher_0... |
|
[2024-09-22 06:18:55,701][13260] Loop batcher_evt_loop terminating... |
|
[2024-09-22 06:18:55,713][11734] Component Batcher_0 stopped! |
|
[2024-09-22 06:18:55,723][13274] Weights refcount: 2 0 |
|
[2024-09-22 06:18:55,725][13274] Stopping InferenceWorker_p0-w0... |
|
[2024-09-22 06:18:55,726][13274] Loop inference_proc0-0_evt_loop terminating... |
|
[2024-09-22 06:18:55,726][11734] Component InferenceWorker_p0-w0 stopped! |
|
[2024-09-22 06:18:55,748][13281] Stopping RolloutWorker_w3... |
|
[2024-09-22 06:18:55,749][13281] Loop rollout_proc3_evt_loop terminating... |
|
[2024-09-22 06:18:55,749][13278] Stopping RolloutWorker_w0... |
|
[2024-09-22 06:18:55,750][13278] Loop rollout_proc0_evt_loop terminating... |
|
[2024-09-22 06:18:55,749][11734] Component RolloutWorker_w3 stopped! |
|
[2024-09-22 06:18:55,751][11734] Component RolloutWorker_w0 stopped! |
|
[2024-09-22 06:18:55,753][13279] Stopping RolloutWorker_w1... |
|
[2024-09-22 06:18:55,753][13279] Loop rollout_proc1_evt_loop terminating... |
|
[2024-09-22 06:18:55,755][13280] Stopping RolloutWorker_w2... |
|
[2024-09-22 06:18:55,755][13280] Loop rollout_proc2_evt_loop terminating... |
|
[2024-09-22 06:18:55,752][13282] Stopping RolloutWorker_w4... |
|
[2024-09-22 06:18:55,756][13282] Loop rollout_proc4_evt_loop terminating... |
|
[2024-09-22 06:18:55,759][13284] Stopping RolloutWorker_w6... |
|
[2024-09-22 06:18:55,754][11734] Component RolloutWorker_w4 stopped! |
|
[2024-09-22 06:18:55,760][13284] Loop rollout_proc6_evt_loop terminating... |
|
[2024-09-22 06:18:55,760][11734] Component RolloutWorker_w1 stopped! |
|
[2024-09-22 06:18:55,761][11734] Component RolloutWorker_w2 stopped! |
|
[2024-09-22 06:18:55,764][11734] Component RolloutWorker_w6 stopped! |
|
[2024-09-22 06:18:55,838][13283] Stopping RolloutWorker_w5... |
|
[2024-09-22 06:18:55,839][13283] Loop rollout_proc5_evt_loop terminating... |
|
[2024-09-22 06:18:55,838][11734] Component RolloutWorker_w5 stopped! |
|
[2024-09-22 06:18:55,956][13285] Stopping RolloutWorker_w7... |
|
[2024-09-22 06:18:55,957][13285] Loop rollout_proc7_evt_loop terminating... |
|
[2024-09-22 06:18:55,957][11734] Component RolloutWorker_w7 stopped! |
|
[2024-09-22 06:18:56,413][13260] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002447_10022912.pth... |
|
[2024-09-22 06:18:56,518][13260] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth |
|
[2024-09-22 06:18:56,532][13260] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002447_10022912.pth... |
|
[2024-09-22 06:18:56,656][13260] Stopping LearnerWorker_p0... |
|
[2024-09-22 06:18:56,656][13260] Loop learner_proc0_evt_loop terminating... |
|
[2024-09-22 06:18:56,656][11734] Component LearnerWorker_p0 stopped! |
|
[2024-09-22 06:18:56,658][11734] Waiting for process learner_proc0 to stop... |
|
[2024-09-22 06:18:57,551][11734] Waiting for process inference_proc0-0 to join... |
|
[2024-09-22 06:18:57,555][11734] Waiting for process rollout_proc0 to join... |
|
[2024-09-22 06:18:57,557][11734] Waiting for process rollout_proc1 to join... |
|
[2024-09-22 06:18:57,560][11734] Waiting for process rollout_proc2 to join... |
|
[2024-09-22 06:18:57,562][11734] Waiting for process rollout_proc3 to join... |
|
[2024-09-22 06:18:57,564][11734] Waiting for process rollout_proc4 to join... |
|
[2024-09-22 06:18:57,567][11734] Waiting for process rollout_proc5 to join... |
|
[2024-09-22 06:18:57,569][11734] Waiting for process rollout_proc6 to join... |
|
[2024-09-22 06:18:57,572][11734] Waiting for process rollout_proc7 to join... |
|
[2024-09-22 06:18:57,574][11734] Batcher 0 profile tree view: |
|
batching: 0.0279, releasing_batches: 0.0006 |
|
[2024-09-22 06:18:57,575][11734] InferenceWorker_p0-w0 profile tree view: |
|
update_model: 0.0078 |
|
wait_policy: 0.0001 |
|
wait_policy_total: 1.9361 |
|
one_step: 0.0729 |
|
handle_policy_step: 1.6765 |
|
deserialize: 0.0400, stack: 0.0044, obs_to_device_normalize: 0.3127, forward: 1.1167, send_messages: 0.1127 |
|
prepare_outputs: 0.0566 |
|
to_cpu: 0.0275 |
|
[2024-09-22 06:18:57,577][11734] Learner 0 profile tree view: |
|
misc: 0.0000, prepare_batch: 1.4660 |
|
train: 2.3107 |
|
epoch_init: 0.0000, minibatch_init: 0.0000, losses_postprocess: 0.0004, kl_divergence: 0.0121, after_optimizer: 0.0508 |
|
calculate_losses: 0.8574 |
|
losses_init: 0.0000, forward_head: 0.2691, bptt_initial: 0.5100, tail: 0.0361, advantages_returns: 0.0010, losses: 0.0366 |
|
bptt: 0.0041 |
|
bptt_forward_core: 0.0039 |
|
update: 1.3889 |
|
clip: 0.0478 |
|
[2024-09-22 06:18:57,579][11734] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.0006, enqueue_policy_requests: 0.0289, env_step: 0.2787, overhead: 0.0196, complete_rollouts: 0.0007 |
|
save_policy_outputs: 0.0275 |
|
split_output_tensors: 0.0102 |
|
[2024-09-22 06:18:57,581][11734] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.0006, enqueue_policy_requests: 0.0295, env_step: 0.2849, overhead: 0.0200, complete_rollouts: 0.0007 |
|
save_policy_outputs: 0.0273 |
|
split_output_tensors: 0.0109 |
|
[2024-09-22 06:18:57,584][11734] Loop Runner_EvtLoop terminating... |
|
[2024-09-22 06:18:57,586][11734] Runner profile tree view: |
|
main_loop: 14.5178 |
|
[2024-09-22 06:18:57,587][11734] Collected {0: 10022912}, FPS: 564.3 |
|
[2024-09-22 06:19:03,049][11734] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2024-09-22 06:19:03,050][11734] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2024-09-22 06:19:03,052][11734] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2024-09-22 06:19:03,054][11734] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2024-09-22 06:19:03,055][11734] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2024-09-22 06:19:03,058][11734] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2024-09-22 06:19:03,059][11734] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2024-09-22 06:19:03,060][11734] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2024-09-22 06:19:03,061][11734] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2024-09-22 06:19:03,066][11734] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2024-09-22 06:19:03,067][11734] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2024-09-22 06:19:03,070][11734] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2024-09-22 06:19:03,071][11734] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2024-09-22 06:19:03,072][11734] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2024-09-22 06:19:03,076][11734] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2024-09-22 06:19:03,102][11734] RunningMeanStd input shape: (3, 72, 128) |
|
[2024-09-22 06:19:03,105][11734] RunningMeanStd input shape: (1,) |
|
[2024-09-22 06:19:03,119][11734] ConvEncoder: input_channels=3 |
|
[2024-09-22 06:19:03,167][11734] Conv encoder output size: 512 |
|
[2024-09-22 06:19:03,170][11734] Policy head output size: 512 |
|
[2024-09-22 06:19:03,192][11734] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002447_10022912.pth... |
|
[2024-09-22 06:19:03,692][11734] Num frames 100... |
|
[2024-09-22 06:19:03,869][11734] Num frames 200... |
|
[2024-09-22 06:19:04,041][11734] Num frames 300... |
|
[2024-09-22 06:19:04,201][11734] Num frames 400... |
|
[2024-09-22 06:19:04,362][11734] Num frames 500... |
|
[2024-09-22 06:19:04,543][11734] Num frames 600... |
|
[2024-09-22 06:19:04,707][11734] Num frames 700... |
|
[2024-09-22 06:19:04,862][11734] Num frames 800... |
|
[2024-09-22 06:19:05,010][11734] Num frames 900... |
|
[2024-09-22 06:19:05,165][11734] Avg episode rewards: #0: 20.600, true rewards: #0: 9.600 |
|
[2024-09-22 06:19:05,167][11734] Avg episode reward: 20.600, avg true_objective: 9.600 |
|
[2024-09-22 06:19:05,227][11734] Num frames 1000... |
|
[2024-09-22 06:19:05,361][11734] Num frames 1100... |
|
[2024-09-22 06:19:05,507][11734] Num frames 1200... |
|
[2024-09-22 06:19:05,640][11734] Avg episode rewards: #0: 13.240, true rewards: #0: 6.240 |
|
[2024-09-22 06:19:05,643][11734] Avg episode reward: 13.240, avg true_objective: 6.240 |
|
[2024-09-22 06:19:05,721][11734] Num frames 1300... |
|
[2024-09-22 06:19:05,869][11734] Num frames 1400... |
|
[2024-09-22 06:19:06,016][11734] Num frames 1500... |
|
[2024-09-22 06:19:06,183][11734] Num frames 1600... |
|
[2024-09-22 06:19:06,341][11734] Num frames 1700... |
|
[2024-09-22 06:19:06,500][11734] Num frames 1800... |
|
[2024-09-22 06:19:06,663][11734] Num frames 1900...
[2024-09-22 06:19:06,821][11734] Num frames 2000...
[2024-09-22 06:19:06,965][11734] Num frames 2100...
[2024-09-22 06:19:07,115][11734] Num frames 2200...
[2024-09-22 06:19:07,302][11734] Num frames 2300...
[2024-09-22 06:19:07,457][11734] Num frames 2400...
[2024-09-22 06:19:07,631][11734] Num frames 2500...
[2024-09-22 06:19:07,787][11734] Num frames 2600...
[2024-09-22 06:19:07,949][11734] Num frames 2700...
[2024-09-22 06:19:08,090][11734] Num frames 2800...
[2024-09-22 06:19:08,258][11734] Num frames 2900...
[2024-09-22 06:19:08,442][11734] Num frames 3000...
[2024-09-22 06:19:08,603][11734] Num frames 3100...
[2024-09-22 06:19:08,755][11734] Num frames 3200...
[2024-09-22 06:19:08,907][11734] Avg episode rewards: #0: 28.213, true rewards: #0: 10.880
[2024-09-22 06:19:08,908][11734] Avg episode reward: 28.213, avg true_objective: 10.880
[2024-09-22 06:19:08,968][11734] Num frames 3300...
[2024-09-22 06:19:09,139][11734] Num frames 3400...
[2024-09-22 06:19:09,305][11734] Num frames 3500...
[2024-09-22 06:19:09,472][11734] Num frames 3600...
[2024-09-22 06:19:09,637][11734] Num frames 3700...
[2024-09-22 06:19:09,818][11734] Num frames 3800...
[2024-09-22 06:19:09,973][11734] Num frames 3900...
[2024-09-22 06:19:10,128][11734] Num frames 4000...
[2024-09-22 06:19:10,281][11734] Num frames 4100...
[2024-09-22 06:19:10,462][11734] Num frames 4200...
[2024-09-22 06:19:10,633][11734] Num frames 4300...
[2024-09-22 06:19:10,796][11734] Num frames 4400...
[2024-09-22 06:19:10,956][11734] Num frames 4500...
[2024-09-22 06:19:11,119][11734] Num frames 4600...
[2024-09-22 06:19:11,264][11734] Num frames 4700...
[2024-09-22 06:19:11,418][11734] Num frames 4800...
[2024-09-22 06:19:11,608][11734] Num frames 4900...
[2024-09-22 06:19:11,775][11734] Num frames 5000...
[2024-09-22 06:19:11,925][11734] Num frames 5100...
[2024-09-22 06:19:12,103][11734] Num frames 5200...
[2024-09-22 06:19:12,245][11734] Avg episode rewards: #0: 34.384, true rewards: #0: 13.135
[2024-09-22 06:19:12,247][11734] Avg episode reward: 34.384, avg true_objective: 13.135
[2024-09-22 06:19:12,324][11734] Num frames 5300...
[2024-09-22 06:19:12,481][11734] Num frames 5400...
[2024-09-22 06:19:12,659][11734] Num frames 5500...
[2024-09-22 06:19:12,820][11734] Num frames 5600...
[2024-09-22 06:19:13,009][11734] Num frames 5700...
[2024-09-22 06:19:13,209][11734] Num frames 5800...
[2024-09-22 06:19:13,384][11734] Num frames 5900...
[2024-09-22 06:19:13,549][11734] Num frames 6000...
[2024-09-22 06:19:13,703][11734] Num frames 6100...
[2024-09-22 06:19:13,876][11734] Num frames 6200...
[2024-09-22 06:19:14,040][11734] Num frames 6300...
[2024-09-22 06:19:14,199][11734] Num frames 6400...
[2024-09-22 06:19:14,363][11734] Num frames 6500...
[2024-09-22 06:19:14,533][11734] Num frames 6600...
[2024-09-22 06:19:14,702][11734] Num frames 6700...
[2024-09-22 06:19:14,859][11734] Num frames 6800...
[2024-09-22 06:19:14,923][11734] Avg episode rewards: #0: 36.407, true rewards: #0: 13.608
[2024-09-22 06:19:14,925][11734] Avg episode reward: 36.407, avg true_objective: 13.608
[2024-09-22 06:19:15,075][11734] Num frames 6900...
[2024-09-22 06:19:15,243][11734] Num frames 7000...
[2024-09-22 06:19:15,402][11734] Num frames 7100...
[2024-09-22 06:19:15,560][11734] Num frames 7200...
[2024-09-22 06:19:15,724][11734] Num frames 7300...
[2024-09-22 06:19:15,885][11734] Num frames 7400...
[2024-09-22 06:19:16,037][11734] Num frames 7500...
[2024-09-22 06:19:16,107][11734] Avg episode rewards: #0: 33.013, true rewards: #0: 12.513
[2024-09-22 06:19:16,109][11734] Avg episode reward: 33.013, avg true_objective: 12.513
[2024-09-22 06:19:16,246][11734] Num frames 7600...
[2024-09-22 06:19:16,411][11734] Num frames 7700...
[2024-09-22 06:19:16,545][11734] Num frames 7800...
[2024-09-22 06:19:16,699][11734] Num frames 7900...
[2024-09-22 06:19:16,858][11734] Num frames 8000...
[2024-09-22 06:19:17,023][11734] Num frames 8100...
[2024-09-22 06:19:17,104][11734] Avg episode rewards: #0: 29.880, true rewards: #0: 11.594
[2024-09-22 06:19:17,106][11734] Avg episode reward: 29.880, avg true_objective: 11.594
[2024-09-22 06:19:17,228][11734] Num frames 8200...
[2024-09-22 06:19:17,390][11734] Num frames 8300...
[2024-09-22 06:19:17,565][11734] Num frames 8400...
[2024-09-22 06:19:17,734][11734] Num frames 8500...
[2024-09-22 06:19:17,881][11734] Num frames 8600...
[2024-09-22 06:19:18,053][11734] Num frames 8700...
[2024-09-22 06:19:18,219][11734] Num frames 8800...
[2024-09-22 06:19:18,361][11734] Num frames 8900...
[2024-09-22 06:19:18,528][11734] Num frames 9000...
[2024-09-22 06:19:18,701][11734] Num frames 9100...
[2024-09-22 06:19:18,851][11734] Num frames 9200...
[2024-09-22 06:19:19,007][11734] Num frames 9300...
[2024-09-22 06:19:19,183][11734] Num frames 9400...
[2024-09-22 06:19:19,348][11734] Num frames 9500...
[2024-09-22 06:19:19,502][11734] Num frames 9600...
[2024-09-22 06:19:19,654][11734] Num frames 9700...
[2024-09-22 06:19:19,836][11734] Num frames 9800...
[2024-09-22 06:19:19,986][11734] Num frames 9900...
[2024-09-22 06:19:20,145][11734] Num frames 10000...
[2024-09-22 06:19:20,313][11734] Num frames 10100...
[2024-09-22 06:19:20,488][11734] Num frames 10200...
[2024-09-22 06:19:20,563][11734] Avg episode rewards: #0: 33.016, true rewards: #0: 12.766
[2024-09-22 06:19:20,565][11734] Avg episode reward: 33.016, avg true_objective: 12.766
[2024-09-22 06:19:20,694][11734] Num frames 10300...
[2024-09-22 06:19:20,851][11734] Num frames 10400...
[2024-09-22 06:19:21,017][11734] Num frames 10500...
[2024-09-22 06:19:21,188][11734] Num frames 10600...
[2024-09-22 06:19:21,347][11734] Num frames 10700...
[2024-09-22 06:19:21,507][11734] Num frames 10800...
[2024-09-22 06:19:21,687][11734] Num frames 10900...
[2024-09-22 06:19:21,845][11734] Num frames 11000...
[2024-09-22 06:19:21,991][11734] Num frames 11100...
[2024-09-22 06:19:22,162][11734] Num frames 11200...
[2024-09-22 06:19:22,313][11734] Num frames 11300...
[2024-09-22 06:19:22,467][11734] Num frames 11400...
[2024-09-22 06:19:22,623][11734] Num frames 11500...
[2024-09-22 06:19:22,726][11734] Avg episode rewards: #0: 33.694, true rewards: #0: 12.806
[2024-09-22 06:19:22,727][11734] Avg episode reward: 33.694, avg true_objective: 12.806
[2024-09-22 06:19:22,847][11734] Num frames 11600...
[2024-09-22 06:19:23,010][11734] Num frames 11700...
[2024-09-22 06:19:23,161][11734] Num frames 11800...
[2024-09-22 06:19:23,332][11734] Avg episode rewards: #0: 30.777, true rewards: #0: 11.877
[2024-09-22 06:19:23,334][11734] Avg episode reward: 30.777, avg true_objective: 11.877
[2024-09-22 06:19:56,200][11734] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2024-09-22 06:20:31,391][11734] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-09-22 06:20:31,393][11734] Overriding arg 'num_workers' with value 1 passed from command line
[2024-09-22 06:20:31,394][11734] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-09-22 06:20:31,395][11734] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-09-22 06:20:31,398][11734] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-09-22 06:20:31,400][11734] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-09-22 06:20:31,401][11734] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2024-09-22 06:20:31,402][11734] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-09-22 06:20:31,405][11734] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2024-09-22 06:20:31,405][11734] Adding new argument 'hf_repository'='Vivek-huggingface/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2024-09-22 06:20:31,407][11734] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-09-22 06:20:31,409][11734] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-09-22 06:20:31,411][11734] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-09-22 06:20:31,412][11734] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-09-22 06:20:31,414][11734] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-09-22 06:20:31,440][11734] RunningMeanStd input shape: (3, 72, 128)
[2024-09-22 06:20:31,442][11734] RunningMeanStd input shape: (1,)
[2024-09-22 06:20:31,457][11734] ConvEncoder: input_channels=3
[2024-09-22 06:20:31,502][11734] Conv encoder output size: 512
[2024-09-22 06:20:31,504][11734] Policy head output size: 512
[2024-09-22 06:20:31,532][11734] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002447_10022912.pth...
[2024-09-22 06:20:32,034][11734] Num frames 100...
[2024-09-22 06:20:32,187][11734] Num frames 200...
[2024-09-22 06:20:32,360][11734] Num frames 300...
[2024-09-22 06:20:32,527][11734] Num frames 400...
[2024-09-22 06:20:32,684][11734] Num frames 500...
[2024-09-22 06:20:32,844][11734] Num frames 600...
[2024-09-22 06:20:33,013][11734] Num frames 700...
[2024-09-22 06:20:33,168][11734] Num frames 800...
[2024-09-22 06:20:33,327][11734] Num frames 900...
[2024-09-22 06:20:33,498][11734] Num frames 1000...
[2024-09-22 06:20:33,669][11734] Num frames 1100...
[2024-09-22 06:20:33,839][11734] Num frames 1200...
[2024-09-22 06:20:34,001][11734] Num frames 1300...
[2024-09-22 06:20:34,177][11734] Num frames 1400...
[2024-09-22 06:20:34,354][11734] Avg episode rewards: #0: 33.720, true rewards: #0: 14.720
[2024-09-22 06:20:34,356][11734] Avg episode reward: 33.720, avg true_objective: 14.720
[2024-09-22 06:20:34,408][11734] Num frames 1500...
[2024-09-22 06:20:34,590][11734] Num frames 1600...
[2024-09-22 06:20:34,754][11734] Num frames 1700...
[2024-09-22 06:20:34,937][11734] Num frames 1800...
[2024-09-22 06:20:35,107][11734] Num frames 1900...
[2024-09-22 06:20:35,252][11734] Num frames 2000...
[2024-09-22 06:20:35,398][11734] Avg episode rewards: #0: 21.740, true rewards: #0: 10.240
[2024-09-22 06:20:35,400][11734] Avg episode reward: 21.740, avg true_objective: 10.240
[2024-09-22 06:20:35,496][11734] Num frames 2100...
[2024-09-22 06:20:35,668][11734] Num frames 2200...
[2024-09-22 06:20:35,827][11734] Num frames 2300...
[2024-09-22 06:20:36,009][11734] Num frames 2400...
[2024-09-22 06:20:36,163][11734] Num frames 2500...
[2024-09-22 06:20:36,329][11734] Num frames 2600...
[2024-09-22 06:20:36,462][11734] Num frames 2700...
[2024-09-22 06:20:36,591][11734] Num frames 2800...
[2024-09-22 06:20:36,723][11734] Num frames 2900...
[2024-09-22 06:20:36,860][11734] Num frames 3000...
[2024-09-22 06:20:36,996][11734] Num frames 3100...
[2024-09-22 06:20:37,128][11734] Num frames 3200...
[2024-09-22 06:20:37,262][11734] Avg episode rewards: #0: 23.523, true rewards: #0: 10.857
[2024-09-22 06:20:37,263][11734] Avg episode reward: 23.523, avg true_objective: 10.857
[2024-09-22 06:20:37,322][11734] Num frames 3300...
[2024-09-22 06:20:37,458][11734] Num frames 3400...
[2024-09-22 06:20:37,594][11734] Num frames 3500...
[2024-09-22 06:20:37,725][11734] Num frames 3600...
[2024-09-22 06:20:37,861][11734] Num frames 3700...
[2024-09-22 06:20:37,993][11734] Num frames 3800...
[2024-09-22 06:20:38,131][11734] Avg episode rewards: #0: 20.913, true rewards: #0: 9.662
[2024-09-22 06:20:38,132][11734] Avg episode reward: 20.913, avg true_objective: 9.662
[2024-09-22 06:20:38,181][11734] Num frames 3900...
[2024-09-22 06:20:38,311][11734] Num frames 4000...
[2024-09-22 06:20:38,444][11734] Num frames 4100...
[2024-09-22 06:20:38,577][11734] Num frames 4200...
[2024-09-22 06:20:38,716][11734] Num frames 4300...
[2024-09-22 06:20:38,854][11734] Num frames 4400...
[2024-09-22 06:20:38,984][11734] Num frames 4500...
[2024-09-22 06:20:39,116][11734] Num frames 4600...
[2024-09-22 06:20:39,255][11734] Num frames 4700...
[2024-09-22 06:20:39,393][11734] Num frames 4800...
[2024-09-22 06:20:39,527][11734] Num frames 4900...
[2024-09-22 06:20:39,667][11734] Num frames 5000...
[2024-09-22 06:20:39,798][11734] Num frames 5100...
[2024-09-22 06:20:39,934][11734] Num frames 5200...
[2024-09-22 06:20:40,073][11734] Num frames 5300...
[2024-09-22 06:20:40,211][11734] Num frames 5400...
[2024-09-22 06:20:40,398][11734] Avg episode rewards: #0: 25.394, true rewards: #0: 10.994
[2024-09-22 06:20:40,400][11734] Avg episode reward: 25.394, avg true_objective: 10.994
[2024-09-22 06:20:40,406][11734] Num frames 5500...
[2024-09-22 06:20:40,544][11734] Num frames 5600...
[2024-09-22 06:20:40,679][11734] Num frames 5700...
[2024-09-22 06:20:40,809][11734] Num frames 5800...
[2024-09-22 06:20:40,942][11734] Num frames 5900...
[2024-09-22 06:20:41,075][11734] Num frames 6000...
[2024-09-22 06:20:41,205][11734] Num frames 6100...
[2024-09-22 06:20:41,340][11734] Num frames 6200...
[2024-09-22 06:20:41,475][11734] Num frames 6300...
[2024-09-22 06:20:41,608][11734] Num frames 6400...
[2024-09-22 06:20:41,743][11734] Num frames 6500...
[2024-09-22 06:20:41,877][11734] Num frames 6600...
[2024-09-22 06:20:42,043][11734] Avg episode rewards: #0: 25.468, true rewards: #0: 11.135
[2024-09-22 06:20:42,046][11734] Avg episode reward: 25.468, avg true_objective: 11.135
[2024-09-22 06:20:42,074][11734] Num frames 6700...
[2024-09-22 06:20:42,211][11734] Num frames 6800...
[2024-09-22 06:20:42,342][11734] Num frames 6900...
[2024-09-22 06:20:42,474][11734] Num frames 7000...
[2024-09-22 06:20:42,605][11734] Num frames 7100...
[2024-09-22 06:20:42,745][11734] Num frames 7200...
[2024-09-22 06:20:42,878][11734] Num frames 7300...
[2024-09-22 06:20:43,012][11734] Num frames 7400...
[2024-09-22 06:20:43,146][11734] Num frames 7500...
[2024-09-22 06:20:43,282][11734] Num frames 7600...
[2024-09-22 06:20:43,370][11734] Avg episode rewards: #0: 24.889, true rewards: #0: 10.889
[2024-09-22 06:20:43,373][11734] Avg episode reward: 24.889, avg true_objective: 10.889
[2024-09-22 06:20:43,478][11734] Num frames 7700...
[2024-09-22 06:20:43,615][11734] Num frames 7800...
[2024-09-22 06:20:43,751][11734] Num frames 7900...
[2024-09-22 06:20:43,885][11734] Num frames 8000...
[2024-09-22 06:20:44,019][11734] Num frames 8100...
[2024-09-22 06:20:44,152][11734] Num frames 8200...
[2024-09-22 06:20:44,286][11734] Num frames 8300...
[2024-09-22 06:20:44,421][11734] Num frames 8400...
[2024-09-22 06:20:44,558][11734] Num frames 8500...
[2024-09-22 06:20:44,691][11734] Num frames 8600...
[2024-09-22 06:20:44,825][11734] Num frames 8700...
[2024-09-22 06:20:44,956][11734] Num frames 8800...
[2024-09-22 06:20:45,086][11734] Num frames 8900...
[2024-09-22 06:20:45,213][11734] Num frames 9000...
[2024-09-22 06:20:45,343][11734] Num frames 9100...
[2024-09-22 06:20:45,515][11734] Avg episode rewards: #0: 27.112, true rewards: #0: 11.487
[2024-09-22 06:20:45,517][11734] Avg episode reward: 27.112, avg true_objective: 11.487
[2024-09-22 06:20:45,532][11734] Num frames 9200...
[2024-09-22 06:20:45,660][11734] Num frames 9300...
[2024-09-22 06:20:45,798][11734] Num frames 9400...
[2024-09-22 06:20:45,927][11734] Num frames 9500...
[2024-09-22 06:20:46,054][11734] Num frames 9600...
[2024-09-22 06:20:46,184][11734] Num frames 9700...
[2024-09-22 06:20:46,312][11734] Num frames 9800...
[2024-09-22 06:20:46,453][11734] Num frames 9900...
[2024-09-22 06:20:46,589][11734] Num frames 10000...
[2024-09-22 06:20:46,714][11734] Avg episode rewards: #0: 26.282, true rewards: #0: 11.171
[2024-09-22 06:20:46,716][11734] Avg episode reward: 26.282, avg true_objective: 11.171
[2024-09-22 06:20:46,780][11734] Num frames 10100...
[2024-09-22 06:20:46,917][11734] Num frames 10200...
[2024-09-22 06:20:47,052][11734] Num frames 10300...
[2024-09-22 06:20:47,191][11734] Num frames 10400...
[2024-09-22 06:20:47,326][11734] Num frames 10500...
[2024-09-22 06:20:47,465][11734] Num frames 10600...
[2024-09-22 06:20:47,603][11734] Num frames 10700...
[2024-09-22 06:20:47,740][11734] Num frames 10800...
[2024-09-22 06:20:47,880][11734] Num frames 10900...
[2024-09-22 06:20:48,017][11734] Num frames 11000...
[2024-09-22 06:20:48,150][11734] Num frames 11100...
[2024-09-22 06:20:48,284][11734] Num frames 11200...
[2024-09-22 06:20:48,413][11734] Num frames 11300...
[2024-09-22 06:20:48,599][11734] Avg episode rewards: #0: 26.798, true rewards: #0: 11.398
[2024-09-22 06:20:48,601][11734] Avg episode reward: 26.798, avg true_objective: 11.398
[2024-09-22 06:20:48,606][11734] Num frames 11400...
[2024-09-22 06:21:20,460][11734] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
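The "Avg episode reward" lines above appear to report the running mean over all episodes evaluated so far, which is why the figure changes after each episode finishes. A minimal sketch of that bookkeeping (a hypothetical helper, not part of Sample Factory):

```python
def running_means(episode_rewards):
    """Return the running average of the rewards after each completed episode."""
    means = []
    total = 0.0
    for count, reward in enumerate(episode_rewards, start=1):
        total += reward
        means.append(total / count)
    return means

# Example: two episodes with raw rewards 30.0 and 26.426 (illustrative
# values) give a running mean of (30.0 + 26.426) / 2 = 28.213 after the
# second episode, matching the format of the log lines above.
print(running_means([30.0, 26.426]))
```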