[2024-06-15 11:31:07,246][1648982] Saving configuration to train_dir/atari_2B_atari_carnival_1111/config.json... [2024-06-15 11:31:07,266][1648982] Rollout worker 0 uses device cpu [2024-06-15 11:31:07,266][1648982] Rollout worker 1 uses device cpu [2024-06-15 11:31:07,266][1648982] Rollout worker 2 uses device cpu [2024-06-15 11:31:07,267][1648982] Rollout worker 3 uses device cpu [2024-06-15 11:31:09,482][1648982] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-15 11:31:09,482][1648982] InferenceWorker_p0-w0: min num requests: 1 [2024-06-15 11:31:09,496][1648982] Starting all processes... [2024-06-15 11:31:09,496][1648982] Starting process learner_proc0 [2024-06-15 11:31:12,351][1648982] Starting all processes... [2024-06-15 11:31:12,354][1648982] Starting process inference_proc0-0 [2024-06-15 11:31:12,355][1648982] Starting process rollout_proc0 [2024-06-15 11:31:12,355][1648982] Starting process rollout_proc1 [2024-06-15 11:31:12,356][1651596] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-15 11:31:12,356][1651596] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-15 11:31:12,355][1648982] Starting process rollout_proc2 [2024-06-15 11:31:12,355][1648982] Starting process rollout_proc3 [2024-06-15 11:31:12,419][1651596] Num visible devices: 1 [2024-06-15 11:31:12,508][1651596] Setting fixed seed 1111 [2024-06-15 11:31:12,510][1651596] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-15 11:31:12,510][1651596] Initializing actor-critic model on device cuda:0 [2024-06-15 11:31:12,511][1651596] RunningMeanStd input shape: (4, 84, 84) [2024-06-15 11:31:12,512][1651596] RunningMeanStd input shape: (1,) [2024-06-15 11:31:12,524][1651596] ConvEncoder: input_channels=4 [2024-06-15 11:31:12,621][1651596] Conv encoder output size: 512 [2024-06-15 11:31:12,624][1651596] Created Actor Critic model with architecture: [2024-06-15 11:31:12,624][1651596] 
ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=6, bias=True)
  )
)
[2024-06-15 11:31:13,129][1651596] Using optimizer
[2024-06-15 11:31:14,002][1651596] No checkpoints found
[2024-06-15 11:31:14,003][1651596] Did not load from checkpoint, starting from scratch!
[2024-06-15 11:31:14,003][1651596] Initialized policy 0 weights for model version 0
[2024-06-15 11:31:14,006][1651596] LearnerWorker_p0 finished initialization!
[2024-06-15 11:31:14,006][1651596] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-15 11:31:14,420][1653647] Worker 2 uses CPU cores [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71] [2024-06-15 11:31:14,496][1653645] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-15 11:31:14,496][1653645] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-15 11:31:14,558][1653645] Num visible devices: 1 [2024-06-15 11:31:14,630][1653646] Worker 0 uses CPU cores [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23] [2024-06-15 11:31:14,668][1653650] Worker 3 uses CPU cores [72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95] [2024-06-15 11:31:14,716][1648982] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:14,726][1653648] Worker 1 uses CPU cores [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47] [2024-06-15 11:31:15,080][1653645] RunningMeanStd input shape: (4, 84, 84) [2024-06-15 11:31:15,081][1653645] RunningMeanStd input shape: (1,) [2024-06-15 11:31:15,102][1653645] ConvEncoder: input_channels=4 [2024-06-15 11:31:15,255][1653645] Conv encoder output size: 512 [2024-06-15 11:31:15,263][1648982] Inference worker 0-0 is ready! [2024-06-15 11:31:15,263][1648982] All inference workers are ready! Signal rollout workers to start! [2024-06-15 11:31:15,263][1653647] EnvRunner 2-0 uses policy 0 [2024-06-15 11:31:15,264][1653650] EnvRunner 3-0 uses policy 0 [2024-06-15 11:31:15,264][1653646] EnvRunner 0-0 uses policy 0 [2024-06-15 11:31:15,266][1653648] EnvRunner 1-0 uses policy 0 [2024-06-15 11:31:15,957][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:20,958][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:25,958][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:29,476][1648982] Heartbeat connected on Batcher_0 [2024-06-15 11:31:29,479][1648982] Heartbeat connected on LearnerWorker_p0 [2024-06-15 11:31:29,509][1648982] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-15 11:31:30,958][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:35,958][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:40,958][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:44,387][1648982] Heartbeat connected on RolloutWorker_w3 [2024-06-15 11:31:45,809][1648982] Heartbeat connected on RolloutWorker_w0 [2024-06-15 11:31:45,958][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 16.4. Samples: 512. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:47,230][1648982] Heartbeat connected on RolloutWorker_w2 [2024-06-15 11:31:48,260][1648982] Heartbeat connected on RolloutWorker_w1 [2024-06-15 11:31:50,958][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 621.6. Samples: 22528. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:54,728][1653650] Worker 3, sleep for 0.750 sec to decorrelate experience collection [2024-06-15 11:31:54,940][1651596] Signal inference workers to stop experience collection... [2024-06-15 11:31:54,967][1653645] InferenceWorker_p0-w0: stopping experience collection [2024-06-15 11:31:55,482][1653650] Worker 3 awakens! [2024-06-15 11:31:55,958][1648982] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 2545.0. Samples: 104960. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-15 11:31:57,674][1651596] Signal inference workers to resume experience collection... [2024-06-15 11:31:57,675][1653645] InferenceWorker_p0-w0: resuming experience collection [2024-06-15 11:31:59,697][1653647] Worker 2, sleep for 0.500 sec to decorrelate experience collection [2024-06-15 11:32:00,032][1653645] Updated weights for policy 0, policy_version 81 (0.0013) [2024-06-15 11:32:00,214][1653647] Worker 2 awakens! [2024-06-15 11:32:00,958][1648982] Fps is (10 sec: 22937.7, 60 sec: 4960.4, 300 sec: 4960.4). Total num frames: 229376. Throughput: 0: 3128.9. Samples: 140800. Policy #0 lag: (min: 49.0, avg: 64.8, max: 65.0) [2024-06-15 11:32:01,375][1653648] Worker 1, sleep for 0.250 sec to decorrelate experience collection [2024-06-15 11:32:01,626][1653648] Worker 1 awakens! [2024-06-15 11:32:01,707][1653645] Updated weights for policy 0, policy_version 145 (0.0011) [2024-06-15 11:32:03,179][1653645] Updated weights for policy 0, policy_version 208 (0.0012) [2024-06-15 11:32:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 10231.7, 300 sec: 10231.7). Total num frames: 524288. Throughput: 0: 3777.4. Samples: 169984. Policy #0 lag: (min: 104.0, avg: 162.5, max: 169.0) [2024-06-15 11:32:09,427][1653645] Updated weights for policy 0, policy_version 260 (0.0012) [2024-06-15 11:32:10,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 11070.0, 300 sec: 11070.0). Total num frames: 622592. 
Throughput: 0: 5586.5. Samples: 251392. Policy #0 lag: (min: 14.0, avg: 73.7, max: 238.0) [2024-06-15 11:32:11,463][1653645] Updated weights for policy 0, policy_version 323 (0.0012) [2024-06-15 11:32:13,603][1653645] Updated weights for policy 0, policy_version 404 (0.0012) [2024-06-15 11:32:15,790][1653645] Updated weights for policy 0, policy_version 496 (0.0185) [2024-06-15 11:32:15,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 16930.1, 300 sec: 16586.9). Total num frames: 1015808. Throughput: 0: 6712.9. Samples: 302080. Policy #0 lag: (min: 77.0, avg: 212.3, max: 333.0) [2024-06-15 11:32:20,962][1648982] Fps is (10 sec: 42580.9, 60 sec: 17475.0, 300 sec: 15828.6). Total num frames: 1048576. Throughput: 0: 7656.5. Samples: 344576. Policy #0 lag: (min: 77.0, avg: 212.3, max: 333.0) [2024-06-15 11:32:21,889][1653645] Updated weights for policy 0, policy_version 529 (0.0011) [2024-06-15 11:32:23,800][1653645] Updated weights for policy 0, policy_version 608 (0.0105) [2024-06-15 11:32:25,306][1653645] Updated weights for policy 0, policy_version 677 (0.0130) [2024-06-15 11:32:25,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 24029.7, 300 sec: 20238.0). Total num frames: 1441792. Throughput: 0: 9079.4. Samples: 408576. Policy #0 lag: (min: 95.0, avg: 171.3, max: 351.0) [2024-06-15 11:32:26,863][1653645] Updated weights for policy 0, policy_version 752 (0.0074) [2024-06-15 11:32:30,958][1648982] Fps is (10 sec: 52450.7, 60 sec: 26214.3, 300 sec: 20630.0). Total num frames: 1572864. Throughput: 0: 10820.3. Samples: 487424. Policy #0 lag: (min: 95.0, avg: 171.3, max: 351.0) [2024-06-15 11:32:32,739][1653645] Updated weights for policy 0, policy_version 800 (0.0012) [2024-06-15 11:32:34,668][1651596] Signal inference workers to stop experience collection... 
(50 times) [2024-06-15 11:32:34,716][1653645] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-15 11:32:34,719][1653645] Updated weights for policy 0, policy_version 865 (0.0011) [2024-06-15 11:32:34,901][1651596] Signal inference workers to resume experience collection... (50 times) [2024-06-15 11:32:34,903][1653645] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-15 11:32:35,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 31675.7, 300 sec: 23393.8). Total num frames: 1900544. Throughput: 0: 11070.6. Samples: 520704. Policy #0 lag: (min: 63.0, avg: 155.5, max: 335.0) [2024-06-15 11:32:36,416][1653645] Updated weights for policy 0, policy_version 944 (0.0011) [2024-06-15 11:32:37,487][1653645] Updated weights for policy 0, policy_version 993 (0.0026) [2024-06-15 11:32:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 34952.5, 300 sec: 24317.2). Total num frames: 2097152. Throughput: 0: 10797.5. Samples: 590848. Policy #0 lag: (min: 63.0, avg: 155.5, max: 335.0) [2024-06-15 11:32:40,959][1648982] Avg episode reward: [(0, '10.647')] [2024-06-15 11:32:40,961][1651596] Saving new best policy, reward=10.647! [2024-06-15 11:32:42,530][1653645] Updated weights for policy 0, policy_version 1040 (0.0020) [2024-06-15 11:32:43,565][1653645] Updated weights for policy 0, policy_version 1086 (0.0013) [2024-06-15 11:32:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 38775.6, 300 sec: 25498.6). Total num frames: 2326528. Throughput: 0: 11673.6. Samples: 666112. Policy #0 lag: (min: 15.0, avg: 81.5, max: 271.0) [2024-06-15 11:32:45,958][1648982] Avg episode reward: [(0, '11.080')] [2024-06-15 11:32:46,169][1653645] Updated weights for policy 0, policy_version 1156 (0.0022) [2024-06-15 11:32:46,345][1651596] Saving new best policy, reward=11.080! 
[2024-06-15 11:32:48,363][1653645] Updated weights for policy 0, policy_version 1249 (0.0137) [2024-06-15 11:32:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 27238.2). Total num frames: 2621440. Throughput: 0: 11514.3. Samples: 688128. Policy #0 lag: (min: 152.0, avg: 249.3, max: 376.0) [2024-06-15 11:32:50,958][1648982] Avg episode reward: [(0, '11.320')] [2024-06-15 11:32:50,959][1651596] Saving new best policy, reward=11.320! [2024-06-15 11:32:55,408][1653645] Updated weights for policy 0, policy_version 1328 (0.0029) [2024-06-15 11:32:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 27187.6). Total num frames: 2752512. Throughput: 0: 11605.3. Samples: 773632. Policy #0 lag: (min: 15.0, avg: 81.2, max: 271.0) [2024-06-15 11:32:55,958][1648982] Avg episode reward: [(0, '11.830')] [2024-06-15 11:32:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000001344_2752512.pth... [2024-06-15 11:32:56,020][1651596] Saving new best policy, reward=11.830! [2024-06-15 11:32:57,508][1653645] Updated weights for policy 0, policy_version 1377 (0.0131) [2024-06-15 11:32:59,744][1653645] Updated weights for policy 0, policy_version 1478 (0.0012) [2024-06-15 11:33:00,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 48605.7, 300 sec: 29609.2). Total num frames: 3145728. Throughput: 0: 11423.3. Samples: 816128. Policy #0 lag: (min: 47.0, avg: 134.0, max: 303.0) [2024-06-15 11:33:00,959][1648982] Avg episode reward: [(0, '11.760')] [2024-06-15 11:33:05,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 28278.4). Total num frames: 3145728. Throughput: 0: 11253.7. Samples: 850944. Policy #0 lag: (min: 47.0, avg: 134.0, max: 303.0) [2024-06-15 11:33:05,959][1648982] Avg episode reward: [(0, '12.300')] [2024-06-15 11:33:05,960][1651596] Saving new best policy, reward=12.300! 
[2024-06-15 11:33:07,567][1653645] Updated weights for policy 0, policy_version 1570 (0.0015) [2024-06-15 11:33:09,053][1653645] Updated weights for policy 0, policy_version 1616 (0.0013) [2024-06-15 11:33:10,584][1653645] Updated weights for policy 0, policy_version 1680 (0.0013) [2024-06-15 11:33:10,958][1648982] Fps is (10 sec: 29490.9, 60 sec: 46967.4, 300 sec: 29599.0). Total num frames: 3440640. Throughput: 0: 11514.3. Samples: 926720. Policy #0 lag: (min: 79.0, avg: 152.8, max: 302.0) [2024-06-15 11:33:10,958][1648982] Avg episode reward: [(0, '14.170')] [2024-06-15 11:33:11,396][1651596] Saving new best policy, reward=14.170! [2024-06-15 11:33:12,677][1651596] Signal inference workers to stop experience collection... (100 times) [2024-06-15 11:33:12,731][1653645] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-15 11:33:12,962][1651596] Signal inference workers to resume experience collection... (100 times) [2024-06-15 11:33:12,964][1653645] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-15 11:33:13,221][1653645] Updated weights for policy 0, policy_version 1788 (0.0183) [2024-06-15 11:33:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.8, 300 sec: 30270.3). Total num frames: 3670016. Throughput: 0: 11002.3. Samples: 982528. Policy #0 lag: (min: 79.0, avg: 152.8, max: 302.0) [2024-06-15 11:33:15,959][1648982] Avg episode reward: [(0, '13.800')] [2024-06-15 11:33:19,840][1653645] Updated weights for policy 0, policy_version 1860 (0.0013) [2024-06-15 11:33:20,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 47516.9, 300 sec: 30888.4). Total num frames: 3899392. Throughput: 0: 11434.7. Samples: 1035264. Policy #0 lag: (min: 15.0, avg: 74.4, max: 271.0) [2024-06-15 11:33:20,958][1648982] Avg episode reward: [(0, '14.460')] [2024-06-15 11:33:21,369][1653645] Updated weights for policy 0, policy_version 1922 (0.0012) [2024-06-15 11:33:21,646][1651596] Saving new best policy, reward=14.460! 
[2024-06-15 11:33:23,497][1653645] Updated weights for policy 0, policy_version 1985 (0.0013) [2024-06-15 11:33:24,603][1653645] Updated weights for policy 0, policy_version 2036 (0.0012) [2024-06-15 11:33:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.4, 300 sec: 31958.6). Total num frames: 4194304. Throughput: 0: 11025.0. Samples: 1086976. Policy #0 lag: (min: 84.0, avg: 172.8, max: 340.0) [2024-06-15 11:33:25,958][1648982] Avg episode reward: [(0, '14.700')] [2024-06-15 11:33:25,963][1651596] Saving new best policy, reward=14.700! [2024-06-15 11:33:29,989][1653645] Updated weights for policy 0, policy_version 2080 (0.0011) [2024-06-15 11:33:30,960][1648982] Fps is (10 sec: 42598.1, 60 sec: 45875.2, 300 sec: 31747.8). Total num frames: 4325376. Throughput: 0: 11377.8. Samples: 1178112. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 11:33:30,961][1648982] Avg episode reward: [(0, '14.860')] [2024-06-15 11:33:31,688][1651596] Saving new best policy, reward=14.860! [2024-06-15 11:33:32,161][1653645] Updated weights for policy 0, policy_version 2160 (0.0013) [2024-06-15 11:33:34,378][1653645] Updated weights for policy 0, policy_version 2226 (0.0048) [2024-06-15 11:33:35,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 46421.3, 300 sec: 33176.0). Total num frames: 4685824. Throughput: 0: 11389.1. Samples: 1200640. Policy #0 lag: (min: 106.0, avg: 182.8, max: 346.0) [2024-06-15 11:33:35,958][1648982] Avg episode reward: [(0, '15.020')] [2024-06-15 11:33:36,082][1651596] Saving new best policy, reward=15.020! [2024-06-15 11:33:36,093][1653645] Updated weights for policy 0, policy_version 2304 (0.0114) [2024-06-15 11:33:40,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 32489.8). Total num frames: 4751360. Throughput: 0: 11093.3. Samples: 1272832. 
Policy #0 lag: (min: 12.0, avg: 81.3, max: 268.0) [2024-06-15 11:33:40,958][1648982] Avg episode reward: [(0, '15.180')] [2024-06-15 11:33:41,217][1651596] Saving new best policy, reward=15.180! [2024-06-15 11:33:41,979][1653645] Updated weights for policy 0, policy_version 2357 (0.0013) [2024-06-15 11:33:43,555][1653645] Updated weights for policy 0, policy_version 2416 (0.0012) [2024-06-15 11:33:44,971][1653645] Updated weights for policy 0, policy_version 2465 (0.0012) [2024-06-15 11:33:45,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 46421.4, 300 sec: 33799.0). Total num frames: 5111808. Throughput: 0: 11514.3. Samples: 1334272. Policy #0 lag: (min: 53.0, avg: 169.5, max: 341.0) [2024-06-15 11:33:45,958][1648982] Avg episode reward: [(0, '15.090')] [2024-06-15 11:33:46,895][1653645] Updated weights for policy 0, policy_version 2544 (0.0013) [2024-06-15 11:33:50,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.6, 300 sec: 33556.2). Total num frames: 5242880. Throughput: 0: 11411.9. Samples: 1364480. Policy #0 lag: (min: 53.0, avg: 169.5, max: 341.0) [2024-06-15 11:33:50,959][1648982] Avg episode reward: [(0, '14.130')] [2024-06-15 11:33:52,964][1653645] Updated weights for policy 0, policy_version 2593 (0.0012) [2024-06-15 11:33:55,040][1653645] Updated weights for policy 0, policy_version 2644 (0.0024) [2024-06-15 11:33:55,905][1651596] Signal inference workers to stop experience collection... (150 times) [2024-06-15 11:33:55,958][1648982] Fps is (10 sec: 36043.4, 60 sec: 45328.8, 300 sec: 33938.2). Total num frames: 5472256. Throughput: 0: 11559.8. Samples: 1446912. Policy #0 lag: (min: 15.0, avg: 78.2, max: 271.0) [2024-06-15 11:33:55,959][1648982] Avg episode reward: [(0, '12.200')] [2024-06-15 11:33:56,066][1653645] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-15 11:33:56,206][1651596] Signal inference workers to resume experience collection... 
(150 times) [2024-06-15 11:33:56,206][1653645] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-15 11:33:56,806][1653645] Updated weights for policy 0, policy_version 2708 (0.0016) [2024-06-15 11:33:58,879][1653645] Updated weights for policy 0, policy_version 2787 (0.0013) [2024-06-15 11:34:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 34691.5). Total num frames: 5767168. Throughput: 0: 11571.2. Samples: 1503232. Policy #0 lag: (min: 60.0, avg: 136.5, max: 236.0) [2024-06-15 11:34:00,958][1648982] Avg episode reward: [(0, '12.340')] [2024-06-15 11:34:04,077][1653645] Updated weights for policy 0, policy_version 2820 (0.0011) [2024-06-15 11:34:05,260][1653645] Updated weights for policy 0, policy_version 2873 (0.0012) [2024-06-15 11:34:05,962][1648982] Fps is (10 sec: 42581.5, 60 sec: 45871.9, 300 sec: 34443.1). Total num frames: 5898240. Throughput: 0: 11342.6. Samples: 1545728. Policy #0 lag: (min: 15.0, avg: 92.2, max: 271.0) [2024-06-15 11:34:05,963][1648982] Avg episode reward: [(0, '12.440')] [2024-06-15 11:34:07,864][1653645] Updated weights for policy 0, policy_version 2944 (0.0013) [2024-06-15 11:34:09,645][1653645] Updated weights for policy 0, policy_version 3012 (0.0012) [2024-06-15 11:34:10,879][1653645] Updated weights for policy 0, policy_version 3065 (0.0011) [2024-06-15 11:34:10,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 46967.6, 300 sec: 35512.0). Total num frames: 6258688. Throughput: 0: 11468.8. Samples: 1603072. Policy #0 lag: (min: 76.0, avg: 175.0, max: 332.0) [2024-06-15 11:34:10,958][1648982] Avg episode reward: [(0, '12.470')] [2024-06-15 11:34:15,958][1648982] Fps is (10 sec: 42617.0, 60 sec: 44236.9, 300 sec: 34893.9). Total num frames: 6324224. Throughput: 0: 11059.2. Samples: 1675776. 
Policy #0 lag: (min: 15.0, avg: 93.8, max: 271.0) [2024-06-15 11:34:15,958][1648982] Avg episode reward: [(0, '13.000')] [2024-06-15 11:34:16,644][1653645] Updated weights for policy 0, policy_version 3124 (0.0088) [2024-06-15 11:34:19,903][1653645] Updated weights for policy 0, policy_version 3187 (0.0013) [2024-06-15 11:34:20,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.0, 300 sec: 35540.6). Total num frames: 6619136. Throughput: 0: 11241.2. Samples: 1706496. Policy #0 lag: (min: 63.0, avg: 148.5, max: 319.0) [2024-06-15 11:34:20,958][1648982] Avg episode reward: [(0, '13.340')] [2024-06-15 11:34:21,734][1653645] Updated weights for policy 0, policy_version 3267 (0.0013) [2024-06-15 11:34:22,771][1653645] Updated weights for policy 0, policy_version 3328 (0.0012) [2024-06-15 11:34:25,960][1648982] Fps is (10 sec: 49138.9, 60 sec: 43688.8, 300 sec: 35639.0). Total num frames: 6815744. Throughput: 0: 10990.3. Samples: 1767424. Policy #0 lag: (min: 63.0, avg: 148.5, max: 319.0) [2024-06-15 11:34:25,961][1648982] Avg episode reward: [(0, '13.160')] [2024-06-15 11:34:28,123][1653645] Updated weights for policy 0, policy_version 3390 (0.0015) [2024-06-15 11:34:30,970][1648982] Fps is (10 sec: 39272.5, 60 sec: 44773.6, 300 sec: 35731.0). Total num frames: 7012352. Throughput: 0: 11351.8. Samples: 1845248. Policy #0 lag: (min: 15.0, avg: 98.1, max: 271.0) [2024-06-15 11:34:30,971][1648982] Avg episode reward: [(0, '14.060')] [2024-06-15 11:34:31,761][1653645] Updated weights for policy 0, policy_version 3458 (0.0145) [2024-06-15 11:34:33,113][1653645] Updated weights for policy 0, policy_version 3520 (0.0118) [2024-06-15 11:34:34,214][1653645] Updated weights for policy 0, policy_version 3577 (0.0014) [2024-06-15 11:34:35,958][1648982] Fps is (10 sec: 52440.6, 60 sec: 44236.5, 300 sec: 36473.7). Total num frames: 7340032. Throughput: 0: 11172.9. Samples: 1867264. 
Policy #0 lag: (min: 49.0, avg: 117.0, max: 262.0) [2024-06-15 11:34:35,959][1648982] Avg episode reward: [(0, '14.410')] [2024-06-15 11:34:37,463][1651596] Signal inference workers to stop experience collection... (200 times) [2024-06-15 11:34:37,515][1653645] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-15 11:34:37,870][1651596] Signal inference workers to resume experience collection... (200 times) [2024-06-15 11:34:37,870][1653645] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-15 11:34:38,268][1653645] Updated weights for policy 0, policy_version 3616 (0.0012) [2024-06-15 11:34:38,907][1653645] Updated weights for policy 0, policy_version 3647 (0.0012) [2024-06-15 11:34:40,958][1648982] Fps is (10 sec: 45932.6, 60 sec: 45329.0, 300 sec: 36225.0). Total num frames: 7471104. Throughput: 0: 11275.5. Samples: 1954304. Policy #0 lag: (min: 15.0, avg: 108.4, max: 271.0) [2024-06-15 11:34:40,958][1648982] Avg episode reward: [(0, '14.610')] [2024-06-15 11:34:42,158][1653645] Updated weights for policy 0, policy_version 3704 (0.0012) [2024-06-15 11:34:43,868][1653645] Updated weights for policy 0, policy_version 3776 (0.0013) [2024-06-15 11:34:45,127][1653645] Updated weights for policy 0, policy_version 3833 (0.0013) [2024-06-15 11:34:45,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 45875.2, 300 sec: 37229.0). Total num frames: 7864320. Throughput: 0: 11423.3. Samples: 2017280. Policy #0 lag: (min: 63.0, avg: 139.5, max: 319.0) [2024-06-15 11:34:45,958][1648982] Avg episode reward: [(0, '14.430')] [2024-06-15 11:34:49,052][1653645] Updated weights for policy 0, policy_version 3888 (0.0011) [2024-06-15 11:34:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 36974.3). Total num frames: 7995392. Throughput: 0: 11538.2. Samples: 2064896. 
Policy #0 lag: (min: 10.0, avg: 114.9, max: 266.0) [2024-06-15 11:34:50,958][1648982] Avg episode reward: [(0, '14.410')] [2024-06-15 11:34:51,700][1653645] Updated weights for policy 0, policy_version 3922 (0.0011) [2024-06-15 11:34:53,119][1653645] Updated weights for policy 0, policy_version 3989 (0.0013) [2024-06-15 11:34:54,196][1653645] Updated weights for policy 0, policy_version 4039 (0.0013) [2024-06-15 11:34:55,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 48605.9, 300 sec: 37916.0). Total num frames: 8388608. Throughput: 0: 11719.0. Samples: 2130432. Policy #0 lag: (min: 63.0, avg: 139.2, max: 314.0) [2024-06-15 11:34:55,959][1648982] Avg episode reward: [(0, '14.530')] [2024-06-15 11:34:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000004096_8388608.pth... [2024-06-15 11:34:58,644][1653645] Updated weights for policy 0, policy_version 4100 (0.0014) [2024-06-15 11:34:59,891][1653645] Updated weights for policy 0, policy_version 4160 (0.0014) [2024-06-15 11:35:00,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 37657.5). Total num frames: 8519680. Throughput: 0: 12003.6. Samples: 2215936. Policy #0 lag: (min: 15.0, avg: 116.2, max: 271.0) [2024-06-15 11:35:00,958][1648982] Avg episode reward: [(0, '14.330')] [2024-06-15 11:35:02,473][1653645] Updated weights for policy 0, policy_version 4196 (0.0016) [2024-06-15 11:35:04,230][1653645] Updated weights for policy 0, policy_version 4272 (0.0117) [2024-06-15 11:35:05,895][1653645] Updated weights for policy 0, policy_version 4349 (0.0014) [2024-06-15 11:35:05,958][1648982] Fps is (10 sec: 52430.7, 60 sec: 50247.9, 300 sec: 38543.7). Total num frames: 8912896. Throughput: 0: 12106.0. Samples: 2251264. 
Policy #0 lag: (min: 78.0, avg: 159.4, max: 303.0) [2024-06-15 11:35:05,958][1648982] Avg episode reward: [(0, '14.570')] [2024-06-15 11:35:10,790][1653645] Updated weights for policy 0, policy_version 4415 (0.0017) [2024-06-15 11:35:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 46421.4, 300 sec: 38282.7). Total num frames: 9043968. Throughput: 0: 12231.8. Samples: 2317824. Policy #0 lag: (min: 3.0, avg: 111.9, max: 259.0) [2024-06-15 11:35:10,958][1648982] Avg episode reward: [(0, '15.510')] [2024-06-15 11:35:10,962][1651596] Saving new best policy, reward=15.510! [2024-06-15 11:35:14,584][1653645] Updated weights for policy 0, policy_version 4479 (0.0115) [2024-06-15 11:35:15,114][1651596] Signal inference workers to stop experience collection... (250 times) [2024-06-15 11:35:15,166][1653645] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-15 11:35:15,295][1651596] Signal inference workers to resume experience collection... (250 times) [2024-06-15 11:35:15,296][1653645] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-15 11:35:15,952][1653645] Updated weights for policy 0, policy_version 4544 (0.0065) [2024-06-15 11:35:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 49698.1, 300 sec: 38575.9). Total num frames: 9306112. Throughput: 0: 11995.5. Samples: 2384896. Policy #0 lag: (min: 63.0, avg: 142.2, max: 319.0) [2024-06-15 11:35:15,958][1648982] Avg episode reward: [(0, '15.710')] [2024-06-15 11:35:16,289][1651596] Saving new best policy, reward=15.710! [2024-06-15 11:35:20,370][1653645] Updated weights for policy 0, policy_version 4609 (0.0013) [2024-06-15 11:35:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 47513.7, 300 sec: 38458.0). Total num frames: 9469952. Throughput: 0: 12208.5. Samples: 2416640. Policy #0 lag: (min: 0.0, avg: 109.5, max: 256.0) [2024-06-15 11:35:20,959][1648982] Avg episode reward: [(0, '16.100')] [2024-06-15 11:35:21,580][1651596] Saving new best policy, reward=16.100! 
[2024-06-15 11:35:21,872][1653645] Updated weights for policy 0, policy_version 4665 (0.0012) [2024-06-15 11:35:25,488][1653645] Updated weights for policy 0, policy_version 4708 (0.0015) [2024-06-15 11:35:25,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 47515.7, 300 sec: 38475.2). Total num frames: 9666560. Throughput: 0: 12049.1. Samples: 2496512. Policy #0 lag: (min: 33.0, avg: 110.3, max: 289.0) [2024-06-15 11:35:25,958][1648982] Avg episode reward: [(0, '16.890')] [2024-06-15 11:35:26,658][1651596] Saving new best policy, reward=16.890! [2024-06-15 11:35:28,220][1653645] Updated weights for policy 0, policy_version 4816 (0.0120) [2024-06-15 11:35:29,146][1653645] Updated weights for policy 0, policy_version 4861 (0.0012) [2024-06-15 11:35:30,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 49162.2, 300 sec: 38875.3). Total num frames: 9961472. Throughput: 0: 11810.1. Samples: 2548736. Policy #0 lag: (min: 33.0, avg: 110.3, max: 289.0) [2024-06-15 11:35:30,959][1648982] Avg episode reward: [(0, '17.230')] [2024-06-15 11:35:30,959][1651596] Saving new best policy, reward=17.230! [2024-06-15 11:35:33,505][1653645] Updated weights for policy 0, policy_version 4912 (0.0017) [2024-06-15 11:35:35,957][1648982] Fps is (10 sec: 42598.9, 60 sec: 45875.6, 300 sec: 38633.0). Total num frames: 10092544. Throughput: 0: 11525.8. Samples: 2583552. Policy #0 lag: (min: 9.0, avg: 126.0, max: 265.0) [2024-06-15 11:35:35,958][1648982] Avg episode reward: [(0, '17.310')] [2024-06-15 11:35:35,959][1651596] Saving new best policy, reward=17.310! 
[2024-06-15 11:35:36,826][1653645] Updated weights for policy 0, policy_version 4947 (0.0011) [2024-06-15 11:35:38,388][1653645] Updated weights for policy 0, policy_version 5011 (0.0042) [2024-06-15 11:35:39,448][1653645] Updated weights for policy 0, policy_version 5060 (0.0013) [2024-06-15 11:35:40,834][1653645] Updated weights for policy 0, policy_version 5116 (0.0013) [2024-06-15 11:35:40,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 50244.3, 300 sec: 39384.4). Total num frames: 10485760. Throughput: 0: 11764.7. Samples: 2659840. Policy #0 lag: (min: 52.0, avg: 125.3, max: 308.0) [2024-06-15 11:35:40,958][1648982] Avg episode reward: [(0, '18.010')] [2024-06-15 11:35:40,988][1651596] Saving new best policy, reward=18.010! [2024-06-15 11:35:44,550][1653645] Updated weights for policy 0, policy_version 5183 (0.0032) [2024-06-15 11:35:45,958][1648982] Fps is (10 sec: 52426.9, 60 sec: 45875.1, 300 sec: 39141.6). Total num frames: 10616832. Throughput: 0: 11355.0. Samples: 2726912. Policy #0 lag: (min: 15.0, avg: 132.5, max: 271.0) [2024-06-15 11:35:45,959][1648982] Avg episode reward: [(0, '17.760')] [2024-06-15 11:35:48,358][1653645] Updated weights for policy 0, policy_version 5223 (0.0013) [2024-06-15 11:35:49,352][1653645] Updated weights for policy 0, policy_version 5264 (0.0015) [2024-06-15 11:35:50,699][1653645] Updated weights for policy 0, policy_version 5328 (0.0013) [2024-06-15 11:35:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 48606.0, 300 sec: 39500.7). Total num frames: 10911744. Throughput: 0: 11480.2. Samples: 2767872. Policy #0 lag: (min: 55.0, avg: 133.5, max: 311.0) [2024-06-15 11:35:50,958][1648982] Avg episode reward: [(0, '18.620')] [2024-06-15 11:35:51,416][1651596] Saving new best policy, reward=18.620! [2024-06-15 11:35:51,702][1653645] Updated weights for policy 0, policy_version 5370 (0.0016) [2024-06-15 11:35:54,455][1651596] Signal inference workers to stop experience collection... 
(300 times) [2024-06-15 11:35:54,482][1653645] Updated weights for policy 0, policy_version 5412 (0.0013) [2024-06-15 11:35:54,526][1653645] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-15 11:35:54,737][1651596] Signal inference workers to resume experience collection... (300 times) [2024-06-15 11:35:54,739][1653645] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-15 11:35:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.3, 300 sec: 39614.0). Total num frames: 11141120. Throughput: 0: 11605.3. Samples: 2840064. Policy #0 lag: (min: 31.0, avg: 158.0, max: 287.0) [2024-06-15 11:35:55,959][1648982] Avg episode reward: [(0, '19.180')] [2024-06-15 11:35:55,973][1651596] Saving new best policy, reward=19.180! [2024-06-15 11:35:57,896][1653645] Updated weights for policy 0, policy_version 5442 (0.0013) [2024-06-15 11:35:58,866][1653645] Updated weights for policy 0, policy_version 5495 (0.0011) [2024-06-15 11:36:00,471][1653645] Updated weights for policy 0, policy_version 5538 (0.0013) [2024-06-15 11:36:00,974][1648982] Fps is (10 sec: 45798.1, 60 sec: 47500.2, 300 sec: 39721.1). Total num frames: 11370496. Throughput: 0: 11817.1. Samples: 2916864. Policy #0 lag: (min: 31.0, avg: 134.4, max: 287.0) [2024-06-15 11:36:00,975][1648982] Avg episode reward: [(0, '20.010')] [2024-06-15 11:36:01,327][1651596] Saving new best policy, reward=20.010! [2024-06-15 11:36:02,090][1653645] Updated weights for policy 0, policy_version 5618 (0.0011) [2024-06-15 11:36:03,881][1653645] Updated weights for policy 0, policy_version 5636 (0.0013) [2024-06-15 11:36:05,311][1653645] Updated weights for policy 0, policy_version 5694 (0.0042) [2024-06-15 11:36:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 45875.2, 300 sec: 40054.1). Total num frames: 11665408. Throughput: 0: 12105.9. Samples: 2961408. 
Policy #0 lag: (min: 31.0, avg: 134.4, max: 287.0) [2024-06-15 11:36:05,958][1648982] Avg episode reward: [(0, '21.690')] [2024-06-15 11:36:05,959][1651596] Saving new best policy, reward=21.690! [2024-06-15 11:36:09,505][1653645] Updated weights for policy 0, policy_version 5753 (0.0013) [2024-06-15 11:36:10,880][1653645] Updated weights for policy 0, policy_version 5797 (0.0053) [2024-06-15 11:36:10,958][1648982] Fps is (10 sec: 49233.4, 60 sec: 46967.2, 300 sec: 40210.2). Total num frames: 11862016. Throughput: 0: 11821.4. Samples: 3028480. Policy #0 lag: (min: 15.0, avg: 101.1, max: 271.0) [2024-06-15 11:36:10,959][1648982] Avg episode reward: [(0, '22.750')] [2024-06-15 11:36:11,306][1651596] Saving new best policy, reward=22.750! [2024-06-15 11:36:12,419][1653645] Updated weights for policy 0, policy_version 5879 (0.0029) [2024-06-15 11:36:15,783][1653645] Updated weights for policy 0, policy_version 5920 (0.0011) [2024-06-15 11:36:15,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 46967.5, 300 sec: 41098.8). Total num frames: 12124160. Throughput: 0: 12333.6. Samples: 3103744. Policy #0 lag: (min: 1.0, avg: 113.0, max: 257.0) [2024-06-15 11:36:15,958][1648982] Avg episode reward: [(0, '22.240')] [2024-06-15 11:36:19,897][1653645] Updated weights for policy 0, policy_version 5958 (0.0020) [2024-06-15 11:36:20,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 46967.3, 300 sec: 41654.2). Total num frames: 12288000. Throughput: 0: 12299.3. Samples: 3137024. Policy #0 lag: (min: 40.0, avg: 116.7, max: 296.0) [2024-06-15 11:36:20,959][1648982] Avg episode reward: [(0, '22.260')] [2024-06-15 11:36:21,170][1653645] Updated weights for policy 0, policy_version 6009 (0.0014) [2024-06-15 11:36:23,287][1653645] Updated weights for policy 0, policy_version 6112 (0.0014) [2024-06-15 11:36:25,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 48605.8, 300 sec: 42653.9). Total num frames: 12582912. Throughput: 0: 11935.3. Samples: 3196928. 
Policy #0 lag: (min: 40.0, avg: 116.7, max: 296.0) [2024-06-15 11:36:25,958][1648982] Avg episode reward: [(0, '20.280')] [2024-06-15 11:36:26,865][1653645] Updated weights for policy 0, policy_version 6160 (0.0013) [2024-06-15 11:36:27,794][1653645] Updated weights for policy 0, policy_version 6202 (0.0011) [2024-06-15 11:36:30,961][1648982] Fps is (10 sec: 45863.6, 60 sec: 46419.3, 300 sec: 43208.9). Total num frames: 12746752. Throughput: 0: 12287.3. Samples: 3279872. Policy #0 lag: (min: 13.0, avg: 127.5, max: 269.0) [2024-06-15 11:36:30,962][1648982] Avg episode reward: [(0, '19.790')] [2024-06-15 11:36:31,562][1653645] Updated weights for policy 0, policy_version 6244 (0.0012) [2024-06-15 11:36:32,043][1653645] Updated weights for policy 0, policy_version 6270 (0.0011) [2024-06-15 11:36:33,592][1653645] Updated weights for policy 0, policy_version 6320 (0.0150) [2024-06-15 11:36:34,516][1651596] Signal inference workers to stop experience collection... (350 times) [2024-06-15 11:36:34,561][1653645] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-15 11:36:34,726][1651596] Signal inference workers to resume experience collection... (350 times) [2024-06-15 11:36:34,727][1653645] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-15 11:36:35,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 50243.9, 300 sec: 44431.1). Total num frames: 13107200. Throughput: 0: 12026.2. Samples: 3309056. Policy #0 lag: (min: 15.0, avg: 117.9, max: 255.0) [2024-06-15 11:36:35,958][1648982] Avg episode reward: [(0, '19.440')] [2024-06-15 11:36:38,086][1653645] Updated weights for policy 0, policy_version 6405 (0.0017) [2024-06-15 11:36:40,957][1648982] Fps is (10 sec: 49166.2, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 13238272. Throughput: 0: 11980.9. Samples: 3379200. 
Policy #0 lag: (min: 47.0, avg: 161.5, max: 303.0) [2024-06-15 11:36:40,960][1648982] Avg episode reward: [(0, '19.590')] [2024-06-15 11:36:42,540][1653645] Updated weights for policy 0, policy_version 6482 (0.0014) [2024-06-15 11:36:44,861][1653645] Updated weights for policy 0, policy_version 6546 (0.0026) [2024-06-15 11:36:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 48059.7, 300 sec: 45764.1). Total num frames: 13500416. Throughput: 0: 11689.3. Samples: 3442688. Policy #0 lag: (min: 35.0, avg: 147.7, max: 291.0) [2024-06-15 11:36:45,958][1648982] Avg episode reward: [(0, '20.180')] [2024-06-15 11:36:46,453][1653645] Updated weights for policy 0, policy_version 6610 (0.0013) [2024-06-15 11:36:47,502][1653645] Updated weights for policy 0, policy_version 6656 (0.0013) [2024-06-15 11:36:50,156][1653645] Updated weights for policy 0, policy_version 6704 (0.0010) [2024-06-15 11:36:50,978][1648982] Fps is (10 sec: 52320.8, 60 sec: 47497.3, 300 sec: 46649.5). Total num frames: 13762560. Throughput: 0: 11406.7. Samples: 3474944. Policy #0 lag: (min: 35.0, avg: 147.7, max: 291.0) [2024-06-15 11:36:50,979][1648982] Avg episode reward: [(0, '20.010')] [2024-06-15 11:36:52,972][1653645] Updated weights for policy 0, policy_version 6752 (0.0011) [2024-06-15 11:36:55,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 46421.4, 300 sec: 46430.6). Total num frames: 13926400. Throughput: 0: 11776.0. Samples: 3558400. Policy #0 lag: (min: 7.0, avg: 98.6, max: 263.0) [2024-06-15 11:36:55,959][1648982] Avg episode reward: [(0, '22.080')] [2024-06-15 11:36:56,171][1653645] Updated weights for policy 0, policy_version 6818 (0.0014) [2024-06-15 11:36:56,376][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000006832_13991936.pth... 
[2024-06-15 11:36:56,541][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000001344_2752512.pth [2024-06-15 11:36:58,191][1653645] Updated weights for policy 0, policy_version 6905 (0.0012) [2024-06-15 11:37:00,958][1648982] Fps is (10 sec: 42685.8, 60 sec: 46980.6, 300 sec: 46319.5). Total num frames: 14188544. Throughput: 0: 11582.6. Samples: 3624960. Policy #0 lag: (min: 14.0, avg: 142.2, max: 270.0) [2024-06-15 11:37:00,958][1648982] Avg episode reward: [(0, '24.280')] [2024-06-15 11:37:01,281][1651596] Saving new best policy, reward=24.280! [2024-06-15 11:37:01,282][1653645] Updated weights for policy 0, policy_version 6960 (0.0011) [2024-06-15 11:37:01,684][1653645] Updated weights for policy 0, policy_version 6976 (0.0010) [2024-06-15 11:37:03,853][1653645] Updated weights for policy 0, policy_version 7027 (0.0033) [2024-06-15 11:37:05,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 45875.2, 300 sec: 46763.8). Total num frames: 14417920. Throughput: 0: 11685.0. Samples: 3662848. Policy #0 lag: (min: 14.0, avg: 142.2, max: 270.0) [2024-06-15 11:37:05,961][1648982] Avg episode reward: [(0, '26.690')] [2024-06-15 11:37:05,962][1651596] Saving new best policy, reward=26.690! [2024-06-15 11:37:07,392][1653645] Updated weights for policy 0, policy_version 7077 (0.0012) [2024-06-15 11:37:09,080][1653645] Updated weights for policy 0, policy_version 7152 (0.0014) [2024-06-15 11:37:10,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 46967.5, 300 sec: 46319.5). Total num frames: 14680064. Throughput: 0: 11889.7. Samples: 3731968. Policy #0 lag: (min: 21.0, avg: 108.7, max: 277.0) [2024-06-15 11:37:10,958][1648982] Avg episode reward: [(0, '27.290')] [2024-06-15 11:37:11,236][1651596] Saving new best policy, reward=27.290! 
[2024-06-15 11:37:11,692][1653645] Updated weights for policy 0, policy_version 7200 (0.0012) [2024-06-15 11:37:14,764][1653645] Updated weights for policy 0, policy_version 7268 (0.0018) [2024-06-15 11:37:15,412][1653645] Updated weights for policy 0, policy_version 7296 (0.0010) [2024-06-15 11:37:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 47097.7). Total num frames: 14942208. Throughput: 0: 11651.6. Samples: 3804160. Policy #0 lag: (min: 15.0, avg: 136.9, max: 271.0) [2024-06-15 11:37:15,958][1648982] Avg episode reward: [(0, '27.460')] [2024-06-15 11:37:15,959][1651596] Saving new best policy, reward=27.460! [2024-06-15 11:37:18,643][1653645] Updated weights for policy 0, policy_version 7347 (0.0154) [2024-06-15 11:37:18,972][1651596] Signal inference workers to stop experience collection... (400 times) [2024-06-15 11:37:19,031][1653645] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-15 11:37:19,211][1651596] Signal inference workers to resume experience collection... (400 times) [2024-06-15 11:37:19,212][1653645] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-15 11:37:20,353][1653645] Updated weights for policy 0, policy_version 7417 (0.0014) [2024-06-15 11:37:20,962][1648982] Fps is (10 sec: 52430.0, 60 sec: 48606.1, 300 sec: 46652.8). Total num frames: 15204352. Throughput: 0: 11924.0. Samples: 3845632. Policy #0 lag: (min: 15.0, avg: 136.9, max: 271.0) [2024-06-15 11:37:20,963][1648982] Avg episode reward: [(0, '27.530')] [2024-06-15 11:37:20,963][1651596] Saving new best policy, reward=27.530! [2024-06-15 11:37:23,497][1653645] Updated weights for policy 0, policy_version 7484 (0.0012) [2024-06-15 11:37:25,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 47513.7, 300 sec: 46986.0). Total num frames: 15433728. Throughput: 0: 11673.6. Samples: 3904512. 
Policy #0 lag: (min: 15.0, avg: 136.3, max: 271.0) [2024-06-15 11:37:25,958][1648982] Avg episode reward: [(0, '25.640')] [2024-06-15 11:37:26,081][1653645] Updated weights for policy 0, policy_version 7540 (0.0011) [2024-06-15 11:37:29,504][1653645] Updated weights for policy 0, policy_version 7574 (0.0020) [2024-06-15 11:37:30,855][1653645] Updated weights for policy 0, policy_version 7634 (0.0014) [2024-06-15 11:37:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 48061.9, 300 sec: 46541.7). Total num frames: 15630336. Throughput: 0: 11924.0. Samples: 3979264. Policy #0 lag: (min: 15.0, avg: 100.1, max: 271.0) [2024-06-15 11:37:30,959][1648982] Avg episode reward: [(0, '24.240')] [2024-06-15 11:37:31,830][1653645] Updated weights for policy 0, policy_version 7676 (0.0020) [2024-06-15 11:37:34,873][1653645] Updated weights for policy 0, policy_version 7721 (0.0089) [2024-06-15 11:37:35,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 45875.2, 300 sec: 46652.7). Total num frames: 15859712. Throughput: 0: 11963.5. Samples: 4013056. Policy #0 lag: (min: 15.0, avg: 100.1, max: 271.0) [2024-06-15 11:37:35,959][1648982] Avg episode reward: [(0, '24.470')] [2024-06-15 11:37:36,858][1653645] Updated weights for policy 0, policy_version 7778 (0.0013) [2024-06-15 11:37:40,805][1653645] Updated weights for policy 0, policy_version 7824 (0.0079) [2024-06-15 11:37:40,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 46421.0, 300 sec: 46430.6). Total num frames: 16023552. Throughput: 0: 11719.1. Samples: 4085760. Policy #0 lag: (min: 1.0, avg: 138.8, max: 257.0) [2024-06-15 11:37:40,959][1648982] Avg episode reward: [(0, '21.760')] [2024-06-15 11:37:42,201][1653645] Updated weights for policy 0, policy_version 7888 (0.0013) [2024-06-15 11:37:43,064][1653645] Updated weights for policy 0, policy_version 7926 (0.0012) [2024-06-15 11:37:45,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 46967.7, 300 sec: 46430.6). Total num frames: 16318464. Throughput: 0: 11753.3. Samples: 4153856. 
Policy #0 lag: (min: 98.0, avg: 175.4, max: 354.0) [2024-06-15 11:37:45,958][1648982] Avg episode reward: [(0, '21.780')] [2024-06-15 11:37:46,103][1653645] Updated weights for policy 0, policy_version 7984 (0.0013) [2024-06-15 11:37:47,552][1653645] Updated weights for policy 0, policy_version 8048 (0.0013) [2024-06-15 11:37:50,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 45890.9, 300 sec: 46652.8). Total num frames: 16515072. Throughput: 0: 11616.7. Samples: 4185600. Policy #0 lag: (min: 98.0, avg: 175.4, max: 354.0) [2024-06-15 11:37:50,959][1648982] Avg episode reward: [(0, '22.030')] [2024-06-15 11:37:52,509][1653645] Updated weights for policy 0, policy_version 8128 (0.0013) [2024-06-15 11:37:53,664][1653645] Updated weights for policy 0, policy_version 8185 (0.0013) [2024-06-15 11:37:55,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 48059.8, 300 sec: 46319.5). Total num frames: 16809984. Throughput: 0: 11821.5. Samples: 4263936. Policy #0 lag: (min: 1.0, avg: 82.3, max: 257.0) [2024-06-15 11:37:55,958][1648982] Avg episode reward: [(0, '22.560')] [2024-06-15 11:37:56,430][1653645] Updated weights for policy 0, policy_version 8240 (0.0013) [2024-06-15 11:37:57,882][1653645] Updated weights for policy 0, policy_version 8288 (0.0014) [2024-06-15 11:38:00,958][1648982] Fps is (10 sec: 52426.4, 60 sec: 47513.3, 300 sec: 47097.0). Total num frames: 17039360. Throughput: 0: 11764.5. Samples: 4333568. Policy #0 lag: (min: 34.0, avg: 177.3, max: 297.0) [2024-06-15 11:38:00,959][1648982] Avg episode reward: [(0, '25.060')] [2024-06-15 11:38:02,717][1651596] Signal inference workers to stop experience collection... (450 times) [2024-06-15 11:38:02,758][1653645] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-15 11:38:02,941][1651596] Signal inference workers to resume experience collection... 
(450 times) [2024-06-15 11:38:02,954][1653645] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-15 11:38:03,268][1653645] Updated weights for policy 0, policy_version 8368 (0.0041) [2024-06-15 11:38:04,746][1653645] Updated weights for policy 0, policy_version 8441 (0.0014) [2024-06-15 11:38:05,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 48059.6, 300 sec: 46986.0). Total num frames: 17301504. Throughput: 0: 11605.3. Samples: 4367872. Policy #0 lag: (min: 34.0, avg: 177.3, max: 297.0) [2024-06-15 11:38:05,958][1648982] Avg episode reward: [(0, '27.170')] [2024-06-15 11:38:08,222][1653645] Updated weights for policy 0, policy_version 8505 (0.0086) [2024-06-15 11:38:09,404][1653645] Updated weights for policy 0, policy_version 8560 (0.0013) [2024-06-15 11:38:10,958][1648982] Fps is (10 sec: 52430.8, 60 sec: 48059.9, 300 sec: 47097.1). Total num frames: 17563648. Throughput: 0: 11821.5. Samples: 4436480. Policy #0 lag: (min: 15.0, avg: 126.3, max: 271.0) [2024-06-15 11:38:10,958][1648982] Avg episode reward: [(0, '29.550')] [2024-06-15 11:38:10,975][1651596] Saving new best policy, reward=29.550! [2024-06-15 11:38:13,917][1653645] Updated weights for policy 0, policy_version 8595 (0.0018) [2024-06-15 11:38:15,004][1653645] Updated weights for policy 0, policy_version 8656 (0.0070) [2024-06-15 11:38:15,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 47513.6, 300 sec: 47097.1). Total num frames: 17793024. Throughput: 0: 11821.5. Samples: 4511232. Policy #0 lag: (min: 63.0, avg: 146.3, max: 319.0) [2024-06-15 11:38:15,958][1648982] Avg episode reward: [(0, '30.490')] [2024-06-15 11:38:16,073][1653645] Updated weights for policy 0, policy_version 8701 (0.0012) [2024-06-15 11:38:16,117][1651596] Saving new best policy, reward=30.490! 
[2024-06-15 11:38:19,075][1653645] Updated weights for policy 0, policy_version 8763 (0.0013) [2024-06-15 11:38:20,917][1653645] Updated weights for policy 0, policy_version 8828 (0.0014) [2024-06-15 11:38:20,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 47513.5, 300 sec: 46986.0). Total num frames: 18055168. Throughput: 0: 11855.7. Samples: 4546560. Policy #0 lag: (min: 63.0, avg: 146.3, max: 319.0) [2024-06-15 11:38:20,958][1648982] Avg episode reward: [(0, '30.590')] [2024-06-15 11:38:20,977][1651596] Saving new best policy, reward=30.590! [2024-06-15 11:38:25,958][1648982] Fps is (10 sec: 36043.4, 60 sec: 45328.7, 300 sec: 46874.9). Total num frames: 18153472. Throughput: 0: 11753.2. Samples: 4614656. Policy #0 lag: (min: 15.0, avg: 96.7, max: 271.0) [2024-06-15 11:38:25,960][1648982] Avg episode reward: [(0, '30.420')] [2024-06-15 11:38:26,360][1653645] Updated weights for policy 0, policy_version 8888 (0.0013) [2024-06-15 11:38:27,550][1653645] Updated weights for policy 0, policy_version 8944 (0.0012) [2024-06-15 11:38:29,908][1653645] Updated weights for policy 0, policy_version 8992 (0.0011) [2024-06-15 11:38:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 47513.6, 300 sec: 46763.8). Total num frames: 18481152. Throughput: 0: 11696.3. Samples: 4680192. Policy #0 lag: (min: 15.0, avg: 96.7, max: 271.0) [2024-06-15 11:38:30,958][1648982] Avg episode reward: [(0, '28.730')] [2024-06-15 11:38:31,274][1653645] Updated weights for policy 0, policy_version 9040 (0.0030) [2024-06-15 11:38:35,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 45875.3, 300 sec: 46986.0). Total num frames: 18612224. Throughput: 0: 11730.5. Samples: 4713472. 
Policy #0 lag: (min: 15.0, avg: 96.7, max: 271.0) [2024-06-15 11:38:35,959][1648982] Avg episode reward: [(0, '28.520')] [2024-06-15 11:38:37,208][1653645] Updated weights for policy 0, policy_version 9104 (0.0013) [2024-06-15 11:38:39,031][1653645] Updated weights for policy 0, policy_version 9184 (0.0013) [2024-06-15 11:38:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 47513.8, 300 sec: 46652.7). Total num frames: 18874368. Throughput: 0: 11457.4. Samples: 4779520. Policy #0 lag: (min: 15.0, avg: 95.4, max: 271.0) [2024-06-15 11:38:40,958][1648982] Avg episode reward: [(0, '28.690')] [2024-06-15 11:38:41,403][1653645] Updated weights for policy 0, policy_version 9218 (0.0012) [2024-06-15 11:38:42,862][1653645] Updated weights for policy 0, policy_version 9296 (0.0012) [2024-06-15 11:38:43,300][1651596] Signal inference workers to stop experience collection... (500 times) [2024-06-15 11:38:43,351][1653645] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-15 11:38:43,544][1651596] Signal inference workers to resume experience collection... (500 times) [2024-06-15 11:38:43,545][1653645] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-15 11:38:45,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 46967.4, 300 sec: 47097.1). Total num frames: 19136512. Throughput: 0: 11377.9. Samples: 4845568. Policy #0 lag: (min: 15.0, avg: 95.4, max: 271.0) [2024-06-15 11:38:45,961][1648982] Avg episode reward: [(0, '27.730')] [2024-06-15 11:38:48,807][1653645] Updated weights for policy 0, policy_version 9347 (0.0016) [2024-06-15 11:38:50,026][1653645] Updated weights for policy 0, policy_version 9399 (0.0010) [2024-06-15 11:38:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45875.1, 300 sec: 46763.9). Total num frames: 19267584. Throughput: 0: 11423.3. Samples: 4881920. 
Policy #0 lag: (min: 2.0, avg: 79.4, max: 258.0) [2024-06-15 11:38:50,958][1648982] Avg episode reward: [(0, '27.240')] [2024-06-15 11:38:51,881][1653645] Updated weights for policy 0, policy_version 9441 (0.0010) [2024-06-15 11:38:53,844][1653645] Updated weights for policy 0, policy_version 9536 (0.0131) [2024-06-15 11:38:54,803][1653645] Updated weights for policy 0, policy_version 9586 (0.0018) [2024-06-15 11:38:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 47513.6, 300 sec: 47097.0). Total num frames: 19660800. Throughput: 0: 11343.6. Samples: 4946944. Policy #0 lag: (min: 2.0, avg: 79.4, max: 258.0) [2024-06-15 11:38:55,958][1648982] Avg episode reward: [(0, '28.750')] [2024-06-15 11:38:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000009600_19660800.pth... [2024-06-15 11:38:56,040][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000004096_8388608.pth [2024-06-15 11:39:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44237.0, 300 sec: 46764.5). Total num frames: 19693568. Throughput: 0: 11343.6. Samples: 5021696. Policy #0 lag: (min: 7.0, avg: 86.9, max: 263.0) [2024-06-15 11:39:00,958][1648982] Avg episode reward: [(0, '28.250')] [2024-06-15 11:39:01,397][1653645] Updated weights for policy 0, policy_version 9648 (0.0012) [2024-06-15 11:39:03,489][1653645] Updated weights for policy 0, policy_version 9681 (0.0011) [2024-06-15 11:39:04,346][1653645] Updated weights for policy 0, policy_version 9731 (0.0012) [2024-06-15 11:39:05,665][1653645] Updated weights for policy 0, policy_version 9794 (0.0011) [2024-06-15 11:39:05,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 46421.4, 300 sec: 46874.9). Total num frames: 20086784. Throughput: 0: 11400.6. Samples: 5059584. Policy #0 lag: (min: 7.0, avg: 86.9, max: 263.0) [2024-06-15 11:39:05,958][1648982] Avg episode reward: [(0, '31.660')] [2024-06-15 11:39:06,233][1651596] Saving new best policy, reward=31.660! 
[2024-06-15 11:39:07,041][1653645] Updated weights for policy 0, policy_version 9850 (0.0011) [2024-06-15 11:39:10,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 43690.6, 300 sec: 46986.0). Total num frames: 20185088. Throughput: 0: 11355.1. Samples: 5125632. Policy #0 lag: (min: 7.0, avg: 86.9, max: 263.0) [2024-06-15 11:39:10,958][1648982] Avg episode reward: [(0, '31.560')] [2024-06-15 11:39:12,897][1653645] Updated weights for policy 0, policy_version 9909 (0.0040) [2024-06-15 11:39:15,368][1653645] Updated weights for policy 0, policy_version 9976 (0.0013) [2024-06-15 11:39:15,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 46986.0). Total num frames: 20480000. Throughput: 0: 11457.4. Samples: 5195776. Policy #0 lag: (min: 15.0, avg: 103.1, max: 271.0) [2024-06-15 11:39:15,958][1648982] Avg episode reward: [(0, '32.680')] [2024-06-15 11:39:16,367][1651596] Saving new best policy, reward=32.680! [2024-06-15 11:39:16,702][1653645] Updated weights for policy 0, policy_version 10048 (0.0012) [2024-06-15 11:39:20,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 44236.7, 300 sec: 47097.5). Total num frames: 20709376. Throughput: 0: 11309.5. Samples: 5222400. Policy #0 lag: (min: 15.0, avg: 103.1, max: 271.0) [2024-06-15 11:39:20,959][1648982] Avg episode reward: [(0, '32.880')] [2024-06-15 11:39:20,970][1651596] Saving new best policy, reward=32.880! [2024-06-15 11:39:23,744][1653645] Updated weights for policy 0, policy_version 10128 (0.0123) [2024-06-15 11:39:25,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44783.3, 300 sec: 46876.9). Total num frames: 20840448. Throughput: 0: 11389.2. Samples: 5292032. 
Policy #0 lag: (min: 11.0, avg: 90.6, max: 267.0) [2024-06-15 11:39:25,958][1648982] Avg episode reward: [(0, '32.650')] [2024-06-15 11:39:26,110][1653645] Updated weights for policy 0, policy_version 10184 (0.0036) [2024-06-15 11:39:27,174][1653645] Updated weights for policy 0, policy_version 10240 (0.0012) [2024-06-15 11:39:27,248][1651596] Signal inference workers to stop experience collection... (550 times) [2024-06-15 11:39:27,316][1653645] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-15 11:39:27,512][1651596] Signal inference workers to resume experience collection... (550 times) [2024-06-15 11:39:27,513][1653645] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-15 11:39:28,827][1653645] Updated weights for policy 0, policy_version 10307 (0.0020) [2024-06-15 11:39:30,141][1653645] Updated weights for policy 0, policy_version 10363 (0.0013) [2024-06-15 11:39:30,960][1648982] Fps is (10 sec: 52429.5, 60 sec: 45875.2, 300 sec: 47097.1). Total num frames: 21233664. Throughput: 0: 11286.7. Samples: 5353472. Policy #0 lag: (min: 11.0, avg: 90.6, max: 267.0) [2024-06-15 11:39:30,962][1648982] Avg episode reward: [(0, '32.940')] [2024-06-15 11:39:30,978][1651596] Saving new best policy, reward=32.940! [2024-06-15 11:39:35,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 46763.8). Total num frames: 21266432. Throughput: 0: 11252.6. Samples: 5388288. 
Policy #0 lag: (min: 5.0, avg: 88.6, max: 261.0) [2024-06-15 11:39:35,958][1648982] Avg episode reward: [(0, '32.580')] [2024-06-15 11:39:36,614][1653645] Updated weights for policy 0, policy_version 10426 (0.0117) [2024-06-15 11:39:38,360][1653645] Updated weights for policy 0, policy_version 10493 (0.0010) [2024-06-15 11:39:39,371][1653645] Updated weights for policy 0, policy_version 10530 (0.0009) [2024-06-15 11:39:40,512][1653645] Updated weights for policy 0, policy_version 10565 (0.0013) [2024-06-15 11:39:40,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 46421.2, 300 sec: 46763.8). Total num frames: 21659648. Throughput: 0: 11343.6. Samples: 5457408. Policy #0 lag: (min: 5.0, avg: 88.6, max: 261.0) [2024-06-15 11:39:40,959][1648982] Avg episode reward: [(0, '32.320')] [2024-06-15 11:39:41,773][1653645] Updated weights for policy 0, policy_version 10621 (0.0067) [2024-06-15 11:39:45,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.6, 300 sec: 46652.7). Total num frames: 21757952. Throughput: 0: 11229.9. Samples: 5527040. Policy #0 lag: (min: 5.0, avg: 88.6, max: 261.0) [2024-06-15 11:39:45,958][1648982] Avg episode reward: [(0, '32.720')] [2024-06-15 11:39:47,963][1653645] Updated weights for policy 0, policy_version 10686 (0.0012) [2024-06-15 11:39:49,852][1653645] Updated weights for policy 0, policy_version 10744 (0.0012) [2024-06-15 11:39:50,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 46967.5, 300 sec: 46430.6). Total num frames: 22085632. Throughput: 0: 11161.6. Samples: 5561856. Policy #0 lag: (min: 15.0, avg: 81.9, max: 271.0) [2024-06-15 11:39:50,961][1648982] Avg episode reward: [(0, '32.960')] [2024-06-15 11:39:51,229][1651596] Saving new best policy, reward=32.960! 
[2024-06-15 11:39:51,231][1653645] Updated weights for policy 0, policy_version 10800 (0.0010) [2024-06-15 11:39:53,728][1653645] Updated weights for policy 0, policy_version 10852 (0.0069) [2024-06-15 11:39:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 46652.7). Total num frames: 22282240. Throughput: 0: 10979.5. Samples: 5619712. Policy #0 lag: (min: 15.0, avg: 81.9, max: 271.0) [2024-06-15 11:39:55,958][1648982] Avg episode reward: [(0, '33.300')] [2024-06-15 11:39:56,017][1651596] Saving new best policy, reward=33.300! [2024-06-15 11:39:58,581][1653645] Updated weights for policy 0, policy_version 10881 (0.0013) [2024-06-15 11:39:59,610][1653645] Updated weights for policy 0, policy_version 10944 (0.0013) [2024-06-15 11:40:00,959][1648982] Fps is (10 sec: 32767.7, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 22413312. Throughput: 0: 11025.0. Samples: 5691904. Policy #0 lag: (min: 15.0, avg: 89.5, max: 271.0) [2024-06-15 11:40:00,960][1648982] Avg episode reward: [(0, '33.950')] [2024-06-15 11:40:01,505][1651596] Saving new best policy, reward=33.950! [2024-06-15 11:40:02,317][1653645] Updated weights for policy 0, policy_version 11002 (0.0015) [2024-06-15 11:40:03,947][1653645] Updated weights for policy 0, policy_version 11064 (0.0013) [2024-06-15 11:40:05,922][1653645] Updated weights for policy 0, policy_version 11127 (0.0015) [2024-06-15 11:40:05,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44782.7, 300 sec: 46541.6). Total num frames: 22773760. Throughput: 0: 11047.8. Samples: 5719552. Policy #0 lag: (min: 15.0, avg: 89.5, max: 271.0) [2024-06-15 11:40:05,958][1648982] Avg episode reward: [(0, '33.930')] [2024-06-15 11:40:10,965][1648982] Fps is (10 sec: 39294.8, 60 sec: 43685.7, 300 sec: 45763.1). Total num frames: 22806528. Throughput: 0: 11034.7. Samples: 5788672. 
Policy #0 lag: (min: 15.0, avg: 89.5, max: 271.0) [2024-06-15 11:40:10,965][1648982] Avg episode reward: [(0, '34.690')] [2024-06-15 11:40:11,516][1651596] Saving new best policy, reward=34.690! [2024-06-15 11:40:12,056][1653645] Updated weights for policy 0, policy_version 11196 (0.0013) [2024-06-15 11:40:13,314][1651596] Signal inference workers to stop experience collection... (600 times) [2024-06-15 11:40:13,368][1653645] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-15 11:40:13,615][1651596] Signal inference workers to resume experience collection... (600 times) [2024-06-15 11:40:13,616][1653645] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-15 11:40:14,429][1653645] Updated weights for policy 0, policy_version 11264 (0.0013) [2024-06-15 11:40:15,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 44782.9, 300 sec: 46430.6). Total num frames: 23166976. Throughput: 0: 10979.6. Samples: 5847552. Policy #0 lag: (min: 15.0, avg: 91.8, max: 271.0) [2024-06-15 11:40:15,959][1648982] Avg episode reward: [(0, '34.430')] [2024-06-15 11:40:15,969][1653645] Updated weights for policy 0, policy_version 11322 (0.0013) [2024-06-15 11:40:17,685][1653645] Updated weights for policy 0, policy_version 11377 (0.0014) [2024-06-15 11:40:20,958][1648982] Fps is (10 sec: 52465.2, 60 sec: 43690.8, 300 sec: 46319.5). Total num frames: 23330816. Throughput: 0: 10934.0. Samples: 5880320. Policy #0 lag: (min: 15.0, avg: 91.8, max: 271.0) [2024-06-15 11:40:20,958][1648982] Avg episode reward: [(0, '34.590')] [2024-06-15 11:40:24,013][1653645] Updated weights for policy 0, policy_version 11444 (0.0012) [2024-06-15 11:40:25,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44782.9, 300 sec: 45986.3). Total num frames: 23527424. Throughput: 0: 11002.4. Samples: 5952512. 
Policy #0 lag: (min: 4.0, avg: 78.2, max: 260.0) [2024-06-15 11:40:25,958][1648982] Avg episode reward: [(0, '34.710')] [2024-06-15 11:40:26,046][1653645] Updated weights for policy 0, policy_version 11504 (0.0012) [2024-06-15 11:40:26,375][1651596] Saving new best policy, reward=34.710! [2024-06-15 11:40:27,555][1653645] Updated weights for policy 0, policy_version 11561 (0.0011) [2024-06-15 11:40:29,198][1653645] Updated weights for policy 0, policy_version 11616 (0.0013) [2024-06-15 11:40:30,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.6, 300 sec: 46652.7). Total num frames: 23855104. Throughput: 0: 10854.4. Samples: 6015488. Policy #0 lag: (min: 4.0, avg: 78.2, max: 260.0) [2024-06-15 11:40:30,958][1648982] Avg episode reward: [(0, '34.790')] [2024-06-15 11:40:30,970][1651596] Saving new best policy, reward=34.790! [2024-06-15 11:40:35,434][1653645] Updated weights for policy 0, policy_version 11705 (0.0016) [2024-06-15 11:40:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45329.0, 300 sec: 45764.1). Total num frames: 23986176. Throughput: 0: 10979.6. Samples: 6055936. Policy #0 lag: (min: 12.0, avg: 94.5, max: 268.0) [2024-06-15 11:40:35,959][1648982] Avg episode reward: [(0, '34.600')] [2024-06-15 11:40:36,896][1653645] Updated weights for policy 0, policy_version 11748 (0.0013) [2024-06-15 11:40:39,222][1653645] Updated weights for policy 0, policy_version 11808 (0.0014) [2024-06-15 11:40:40,673][1653645] Updated weights for policy 0, policy_version 11858 (0.0013) [2024-06-15 11:40:40,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.9, 300 sec: 46430.6). Total num frames: 24313856. Throughput: 0: 11059.2. Samples: 6117376. Policy #0 lag: (min: 12.0, avg: 94.5, max: 268.0) [2024-06-15 11:40:40,958][1648982] Avg episode reward: [(0, '35.030')] [2024-06-15 11:40:41,298][1651596] Saving new best policy, reward=35.030! 
[2024-06-15 11:40:41,686][1653645] Updated weights for policy 0, policy_version 11904 (0.0012) [2024-06-15 11:40:45,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 45653.0). Total num frames: 24379392. Throughput: 0: 11036.5. Samples: 6188544. Policy #0 lag: (min: 12.0, avg: 94.5, max: 268.0) [2024-06-15 11:40:45,959][1648982] Avg episode reward: [(0, '35.310')] [2024-06-15 11:40:45,959][1651596] Saving new best policy, reward=35.310! [2024-06-15 11:40:47,933][1653645] Updated weights for policy 0, policy_version 11956 (0.0072) [2024-06-15 11:40:49,155][1653645] Updated weights for policy 0, policy_version 12019 (0.0026) [2024-06-15 11:40:50,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 24707072. Throughput: 0: 11116.1. Samples: 6219776. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 11:40:50,958][1648982] Avg episode reward: [(0, '35.470')] [2024-06-15 11:40:50,984][1653645] Updated weights for policy 0, policy_version 12080 (0.0017) [2024-06-15 11:40:51,270][1651596] Saving new best policy, reward=35.470! [2024-06-15 11:40:52,288][1653645] Updated weights for policy 0, policy_version 12128 (0.0013) [2024-06-15 11:40:55,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 43690.4, 300 sec: 45877.7). Total num frames: 24903680. Throughput: 0: 10992.5. Samples: 6283264. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 11:40:55,959][1648982] Avg episode reward: [(0, '35.580')] [2024-06-15 11:40:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000012160_24903680.pth... [2024-06-15 11:40:56,014][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000006832_13991936.pth [2024-06-15 11:40:56,018][1651596] Saving new best policy, reward=35.580! 
[2024-06-15 11:40:58,742][1653645] Updated weights for policy 0, policy_version 12176 (0.0027) [2024-06-15 11:40:59,551][1651596] Signal inference workers to stop experience collection... (650 times) [2024-06-15 11:40:59,592][1653645] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-15 11:40:59,847][1651596] Signal inference workers to resume experience collection... (650 times) [2024-06-15 11:40:59,848][1653645] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-15 11:41:00,319][1653645] Updated weights for policy 0, policy_version 12240 (0.0104) [2024-06-15 11:41:00,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44783.0, 300 sec: 45542.0). Total num frames: 25100288. Throughput: 0: 11207.1. Samples: 6351872. Policy #0 lag: (min: 15.0, avg: 76.6, max: 271.0) [2024-06-15 11:41:00,958][1648982] Avg episode reward: [(0, '35.680')] [2024-06-15 11:41:01,356][1651596] Saving new best policy, reward=35.680! [2024-06-15 11:41:02,297][1653645] Updated weights for policy 0, policy_version 12289 (0.0020) [2024-06-15 11:41:04,964][1653645] Updated weights for policy 0, policy_version 12386 (0.0013) [2024-06-15 11:41:05,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 44236.9, 300 sec: 45986.3). Total num frames: 25427968. Throughput: 0: 11252.6. Samples: 6386688. Policy #0 lag: (min: 15.0, avg: 76.6, max: 271.0) [2024-06-15 11:41:05,959][1648982] Avg episode reward: [(0, '35.960')] [2024-06-15 11:41:05,960][1651596] Saving new best policy, reward=35.960! [2024-06-15 11:41:10,959][1648982] Fps is (10 sec: 32765.4, 60 sec: 43695.1, 300 sec: 45097.5). Total num frames: 25427968. Throughput: 0: 10956.6. Samples: 6445568. 
Policy #0 lag: (min: 15.0, avg: 76.6, max: 271.0) [2024-06-15 11:41:10,959][1648982] Avg episode reward: [(0, '35.730')] [2024-06-15 11:41:11,619][1653645] Updated weights for policy 0, policy_version 12450 (0.0013) [2024-06-15 11:41:12,692][1653645] Updated weights for policy 0, policy_version 12496 (0.0118) [2024-06-15 11:41:13,839][1653645] Updated weights for policy 0, policy_version 12543 (0.0012) [2024-06-15 11:41:15,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 43144.4, 300 sec: 45653.0). Total num frames: 25755648. Throughput: 0: 11127.4. Samples: 6516224. Policy #0 lag: (min: 15.0, avg: 88.3, max: 271.0) [2024-06-15 11:41:15,959][1648982] Avg episode reward: [(0, '35.860')] [2024-06-15 11:41:16,575][1653645] Updated weights for policy 0, policy_version 12609 (0.0013) [2024-06-15 11:41:17,981][1653645] Updated weights for policy 0, policy_version 12664 (0.0026) [2024-06-15 11:41:20,958][1648982] Fps is (10 sec: 52433.5, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 25952256. Throughput: 0: 10729.3. Samples: 6538752. Policy #0 lag: (min: 15.0, avg: 88.3, max: 271.0) [2024-06-15 11:41:20,958][1648982] Avg episode reward: [(0, '35.630')] [2024-06-15 11:41:23,132][1653645] Updated weights for policy 0, policy_version 12704 (0.0014) [2024-06-15 11:41:24,245][1653645] Updated weights for policy 0, policy_version 12752 (0.0013) [2024-06-15 11:41:25,960][1648982] Fps is (10 sec: 45867.4, 60 sec: 44781.5, 300 sec: 45653.2). Total num frames: 26214400. Throughput: 0: 10956.4. Samples: 6610432. 
Policy #0 lag: (min: 15.0, avg: 97.5, max: 271.0) [2024-06-15 11:41:25,961][1648982] Avg episode reward: [(0, '35.960')] [2024-06-15 11:41:26,837][1653645] Updated weights for policy 0, policy_version 12801 (0.0018) [2024-06-15 11:41:28,643][1653645] Updated weights for policy 0, policy_version 12866 (0.0013) [2024-06-15 11:41:30,302][1653645] Updated weights for policy 0, policy_version 12927 (0.0016) [2024-06-15 11:41:30,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 26476544. Throughput: 0: 10717.9. Samples: 6670848. Policy #0 lag: (min: 15.0, avg: 97.5, max: 271.0) [2024-06-15 11:41:30,958][1648982] Avg episode reward: [(0, '36.220')] [2024-06-15 11:41:30,959][1651596] Saving new best policy, reward=36.220! [2024-06-15 11:41:35,958][1648982] Fps is (10 sec: 36052.2, 60 sec: 43144.6, 300 sec: 45208.7). Total num frames: 26574848. Throughput: 0: 10831.7. Samples: 6707200. Policy #0 lag: (min: 15.0, avg: 100.8, max: 271.0) [2024-06-15 11:41:35,958][1648982] Avg episode reward: [(0, '35.790')] [2024-06-15 11:41:36,109][1653645] Updated weights for policy 0, policy_version 12992 (0.0012) [2024-06-15 11:41:37,425][1653645] Updated weights for policy 0, policy_version 13051 (0.0013) [2024-06-15 11:41:40,372][1653645] Updated weights for policy 0, policy_version 13092 (0.0013) [2024-06-15 11:41:40,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 42052.3, 300 sec: 45208.7). Total num frames: 26836992. Throughput: 0: 10911.4. Samples: 6774272. Policy #0 lag: (min: 15.0, avg: 100.8, max: 271.0) [2024-06-15 11:41:40,959][1648982] Avg episode reward: [(0, '36.320')] [2024-06-15 11:41:41,183][1651596] Signal inference workers to stop experience collection... (700 times) [2024-06-15 11:41:41,259][1653645] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-15 11:41:41,508][1651596] Saving new best policy, reward=36.320! 
[2024-06-15 11:41:41,509][1651596] Signal inference workers to resume experience collection... (700 times) [2024-06-15 11:41:41,522][1653645] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-15 11:41:42,762][1653645] Updated weights for policy 0, policy_version 13182 (0.0162) [2024-06-15 11:41:45,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44878.6). Total num frames: 27000832. Throughput: 0: 10615.5. Samples: 6829568. Policy #0 lag: (min: 15.0, avg: 100.8, max: 271.0) [2024-06-15 11:41:45,958][1648982] Avg episode reward: [(0, '36.320')] [2024-06-15 11:41:47,824][1653645] Updated weights for policy 0, policy_version 13236 (0.0014) [2024-06-15 11:41:49,796][1653645] Updated weights for policy 0, policy_version 13303 (0.0013) [2024-06-15 11:41:50,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 45208.8). Total num frames: 27262976. Throughput: 0: 10717.9. Samples: 6868992. Policy #0 lag: (min: 15.0, avg: 104.0, max: 271.0) [2024-06-15 11:41:50,958][1648982] Avg episode reward: [(0, '36.370')] [2024-06-15 11:41:50,959][1651596] Saving new best policy, reward=36.370! [2024-06-15 11:41:54,520][1653645] Updated weights for policy 0, policy_version 13398 (0.0104) [2024-06-15 11:41:55,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43691.0, 300 sec: 45208.7). Total num frames: 27525120. Throughput: 0: 10581.5. Samples: 6921728. Policy #0 lag: (min: 15.0, avg: 104.0, max: 271.0) [2024-06-15 11:41:55,958][1648982] Avg episode reward: [(0, '36.440')] [2024-06-15 11:41:55,967][1651596] Saving new best policy, reward=36.440! [2024-06-15 11:41:58,966][1653645] Updated weights for policy 0, policy_version 13462 (0.0036) [2024-06-15 11:41:59,846][1653645] Updated weights for policy 0, policy_version 13503 (0.0021) [2024-06-15 11:42:00,958][1648982] Fps is (10 sec: 39319.5, 60 sec: 42598.1, 300 sec: 44875.4). Total num frames: 27656192. Throughput: 0: 10569.9. Samples: 6991872. 
Policy #0 lag: (min: 15.0, avg: 116.2, max: 271.0) [2024-06-15 11:42:00,960][1648982] Avg episode reward: [(0, '36.420')] [2024-06-15 11:42:01,882][1653645] Updated weights for policy 0, policy_version 13552 (0.0012) [2024-06-15 11:42:05,004][1653645] Updated weights for policy 0, policy_version 13600 (0.0013) [2024-06-15 11:42:05,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 44875.5). Total num frames: 27918336. Throughput: 0: 10934.0. Samples: 7030784. Policy #0 lag: (min: 15.0, avg: 116.2, max: 271.0) [2024-06-15 11:42:05,958][1648982] Avg episode reward: [(0, '36.400')] [2024-06-15 11:42:06,501][1653645] Updated weights for policy 0, policy_version 13664 (0.0014) [2024-06-15 11:42:10,963][1648982] Fps is (10 sec: 42580.1, 60 sec: 44233.9, 300 sec: 44541.5). Total num frames: 28082176. Throughput: 0: 10671.7. Samples: 7090688. Policy #0 lag: (min: 15.0, avg: 118.1, max: 271.0) [2024-06-15 11:42:10,963][1648982] Avg episode reward: [(0, '36.530')] [2024-06-15 11:42:11,012][1653645] Updated weights for policy 0, policy_version 13714 (0.0013) [2024-06-15 11:42:11,635][1651596] Saving new best policy, reward=36.530! [2024-06-15 11:42:11,940][1653645] Updated weights for policy 0, policy_version 13756 (0.0013) [2024-06-15 11:42:13,038][1653645] Updated weights for policy 0, policy_version 13793 (0.0012) [2024-06-15 11:42:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42598.6, 300 sec: 44431.2). Total num frames: 28311552. Throughput: 0: 10865.8. Samples: 7159808. Policy #0 lag: (min: 15.0, avg: 118.1, max: 271.0) [2024-06-15 11:42:15,958][1648982] Avg episode reward: [(0, '36.660')] [2024-06-15 11:42:15,959][1651596] Saving new best policy, reward=36.660! [2024-06-15 11:42:16,949][1653645] Updated weights for policy 0, policy_version 13860 (0.0013) [2024-06-15 11:42:18,603][1653645] Updated weights for policy 0, policy_version 13909 (0.0015) [2024-06-15 11:42:20,958][1648982] Fps is (10 sec: 49173.8, 60 sec: 43690.4, 300 sec: 44542.2). 
Total num frames: 28573696. Throughput: 0: 10740.5. Samples: 7190528. Policy #0 lag: (min: 15.0, avg: 118.1, max: 271.0) [2024-06-15 11:42:20,959][1648982] Avg episode reward: [(0, '36.510')] [2024-06-15 11:42:22,688][1653645] Updated weights for policy 0, policy_version 13968 (0.0013) [2024-06-15 11:42:24,231][1653645] Updated weights for policy 0, policy_version 14032 (0.0010) [2024-06-15 11:42:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43692.1, 300 sec: 44764.4). Total num frames: 28835840. Throughput: 0: 10740.7. Samples: 7257600. Policy #0 lag: (min: 2.0, avg: 97.8, max: 258.0) [2024-06-15 11:42:25,958][1648982] Avg episode reward: [(0, '36.510')] [2024-06-15 11:42:27,908][1651596] Signal inference workers to stop experience collection... (750 times) [2024-06-15 11:42:27,949][1653645] Updated weights for policy 0, policy_version 14081 (0.0011) [2024-06-15 11:42:27,985][1653645] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-15 11:42:28,159][1651596] Signal inference workers to resume experience collection... (750 times) [2024-06-15 11:42:28,159][1653645] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-15 11:42:29,640][1653645] Updated weights for policy 0, policy_version 14160 (0.0029) [2024-06-15 11:42:30,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 29097984. Throughput: 0: 10899.9. Samples: 7320064. Policy #0 lag: (min: 2.0, avg: 97.8, max: 258.0) [2024-06-15 11:42:30,958][1648982] Avg episode reward: [(0, '36.620')] [2024-06-15 11:42:35,460][1653645] Updated weights for policy 0, policy_version 14240 (0.0105) [2024-06-15 11:42:35,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.4, 300 sec: 44653.3). Total num frames: 29196288. Throughput: 0: 10854.3. Samples: 7357440. 
Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 11:42:35,959][1648982] Avg episode reward: [(0, '36.670')] [2024-06-15 11:42:36,263][1651596] Saving new best policy, reward=36.670! [2024-06-15 11:42:37,255][1653645] Updated weights for policy 0, policy_version 14307 (0.0012) [2024-06-15 11:42:40,152][1653645] Updated weights for policy 0, policy_version 14352 (0.0012) [2024-06-15 11:42:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.7, 300 sec: 44542.2). Total num frames: 29458432. Throughput: 0: 11081.9. Samples: 7420416. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 11:42:40,958][1648982] Avg episode reward: [(0, '36.710')] [2024-06-15 11:42:41,113][1653645] Updated weights for policy 0, policy_version 14398 (0.0020) [2024-06-15 11:42:41,162][1651596] Saving new best policy, reward=36.710! [2024-06-15 11:42:42,562][1653645] Updated weights for policy 0, policy_version 14457 (0.0015) [2024-06-15 11:42:45,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 29622272. Throughput: 0: 11082.1. Samples: 7490560. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 11:42:45,958][1648982] Avg episode reward: [(0, '36.710')] [2024-06-15 11:42:48,412][1653645] Updated weights for policy 0, policy_version 14529 (0.0013) [2024-06-15 11:42:49,430][1653645] Updated weights for policy 0, policy_version 14586 (0.0014) [2024-06-15 11:42:50,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 29884416. Throughput: 0: 10865.8. Samples: 7519744. Policy #0 lag: (min: 15.0, avg: 94.6, max: 271.0) [2024-06-15 11:42:50,958][1648982] Avg episode reward: [(0, '36.620')] [2024-06-15 11:42:52,419][1653645] Updated weights for policy 0, policy_version 14649 (0.0014) [2024-06-15 11:42:54,582][1653645] Updated weights for policy 0, policy_version 14704 (0.0013) [2024-06-15 11:42:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44431.2). 
Total num frames: 30146560. Throughput: 0: 11026.2. Samples: 7586816. Policy #0 lag: (min: 15.0, avg: 94.6, max: 271.0) [2024-06-15 11:42:55,959][1648982] Avg episode reward: [(0, '36.390')] [2024-06-15 11:42:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000014720_30146560.pth... [2024-06-15 11:42:56,043][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000009600_19660800.pth [2024-06-15 11:42:59,717][1653645] Updated weights for policy 0, policy_version 14784 (0.0013) [2024-06-15 11:43:00,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.3, 300 sec: 44209.0). Total num frames: 30343168. Throughput: 0: 11036.4. Samples: 7656448. Policy #0 lag: (min: 15.0, avg: 92.2, max: 255.0) [2024-06-15 11:43:00,958][1648982] Avg episode reward: [(0, '36.720')] [2024-06-15 11:43:01,106][1653645] Updated weights for policy 0, policy_version 14836 (0.0012) [2024-06-15 11:43:01,339][1651596] Saving new best policy, reward=36.720! [2024-06-15 11:43:03,897][1653645] Updated weights for policy 0, policy_version 14896 (0.0030) [2024-06-15 11:43:05,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 30605312. Throughput: 0: 11138.9. Samples: 7691776. Policy #0 lag: (min: 15.0, avg: 92.2, max: 255.0) [2024-06-15 11:43:05,958][1648982] Avg episode reward: [(0, '36.640')] [2024-06-15 11:43:06,334][1653645] Updated weights for policy 0, policy_version 14964 (0.0017) [2024-06-15 11:43:10,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43694.1, 300 sec: 43764.7). Total num frames: 30703616. Throughput: 0: 11173.0. Samples: 7760384. Policy #0 lag: (min: 9.0, avg: 78.7, max: 233.0) [2024-06-15 11:43:10,960][1648982] Avg episode reward: [(0, '36.800')] [2024-06-15 11:43:11,507][1651596] Saving new best policy, reward=36.800! 
[2024-06-15 11:43:11,508][1653645] Updated weights for policy 0, policy_version 15024 (0.0023) [2024-06-15 11:43:12,017][1651596] Signal inference workers to stop experience collection... (800 times) [2024-06-15 11:43:12,083][1653645] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-15 11:43:12,272][1651596] Signal inference workers to resume experience collection... (800 times) [2024-06-15 11:43:12,274][1653645] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-15 11:43:13,131][1653645] Updated weights for policy 0, policy_version 15103 (0.0032) [2024-06-15 11:43:15,454][1653645] Updated weights for policy 0, policy_version 15165 (0.0016) [2024-06-15 11:43:15,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 44098.0). Total num frames: 31064064. Throughput: 0: 11104.7. Samples: 7819776. Policy #0 lag: (min: 9.0, avg: 78.7, max: 233.0) [2024-06-15 11:43:15,958][1648982] Avg episode reward: [(0, '36.590')] [2024-06-15 11:43:18,349][1653645] Updated weights for policy 0, policy_version 15223 (0.0010) [2024-06-15 11:43:20,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 31195136. Throughput: 0: 11036.4. Samples: 7854080. Policy #0 lag: (min: 9.0, avg: 78.7, max: 233.0) [2024-06-15 11:43:20,959][1648982] Avg episode reward: [(0, '36.670')] [2024-06-15 11:43:23,195][1653645] Updated weights for policy 0, policy_version 15266 (0.0010) [2024-06-15 11:43:24,509][1653645] Updated weights for policy 0, policy_version 15328 (0.0011) [2024-06-15 11:43:25,842][1653645] Updated weights for policy 0, policy_version 15376 (0.0013) [2024-06-15 11:43:25,970][1648982] Fps is (10 sec: 42597.9, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 31490048. Throughput: 0: 11218.5. Samples: 7925248. 
Policy #0 lag: (min: 7.0, avg: 80.8, max: 263.0) [2024-06-15 11:43:25,970][1648982] Avg episode reward: [(0, '36.700')] [2024-06-15 11:43:29,948][1653645] Updated weights for policy 0, policy_version 15459 (0.0011) [2024-06-15 11:43:30,959][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 31719424. Throughput: 0: 11070.5. Samples: 7988736. Policy #0 lag: (min: 7.0, avg: 80.8, max: 263.0) [2024-06-15 11:43:30,960][1648982] Avg episode reward: [(0, '36.730')] [2024-06-15 11:43:34,994][1653645] Updated weights for policy 0, policy_version 15522 (0.0013) [2024-06-15 11:43:35,958][1648982] Fps is (10 sec: 36045.7, 60 sec: 44237.0, 300 sec: 43986.9). Total num frames: 31850496. Throughput: 0: 11309.5. Samples: 8028672. Policy #0 lag: (min: 9.0, avg: 74.6, max: 265.0) [2024-06-15 11:43:35,958][1648982] Avg episode reward: [(0, '36.690')] [2024-06-15 11:43:37,016][1653645] Updated weights for policy 0, policy_version 15600 (0.0012) [2024-06-15 11:43:38,877][1653645] Updated weights for policy 0, policy_version 15676 (0.0011) [2024-06-15 11:43:40,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 44236.6, 300 sec: 43986.8). Total num frames: 32112640. Throughput: 0: 10899.8. Samples: 8077312. Policy #0 lag: (min: 9.0, avg: 74.6, max: 265.0) [2024-06-15 11:43:40,959][1648982] Avg episode reward: [(0, '36.780')] [2024-06-15 11:43:43,557][1653645] Updated weights for policy 0, policy_version 15736 (0.0016) [2024-06-15 11:43:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 32243712. Throughput: 0: 10945.4. Samples: 8148992. Policy #0 lag: (min: 9.0, avg: 74.6, max: 265.0) [2024-06-15 11:43:45,958][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 11:43:45,959][1651596] Saving new best policy, reward=36.830! 
[2024-06-15 11:43:48,380][1653645] Updated weights for policy 0, policy_version 15811 (0.0198) [2024-06-15 11:43:50,336][1653645] Updated weights for policy 0, policy_version 15896 (0.0011) [2024-06-15 11:43:50,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 32604160. Throughput: 0: 10831.7. Samples: 8179200. Policy #0 lag: (min: 15.0, avg: 86.9, max: 271.0) [2024-06-15 11:43:50,961][1648982] Avg episode reward: [(0, '36.720')] [2024-06-15 11:43:55,427][1651596] Signal inference workers to stop experience collection... (850 times) [2024-06-15 11:43:55,493][1653645] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-15 11:43:55,717][1651596] Signal inference workers to resume experience collection... (850 times) [2024-06-15 11:43:55,718][1653645] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-15 11:43:55,721][1653645] Updated weights for policy 0, policy_version 15984 (0.0013) [2024-06-15 11:43:55,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43144.4, 300 sec: 44209.0). Total num frames: 32735232. Throughput: 0: 10877.1. Samples: 8249856. Policy #0 lag: (min: 15.0, avg: 86.9, max: 271.0) [2024-06-15 11:43:55,958][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 11:43:58,625][1653645] Updated weights for policy 0, policy_version 16008 (0.0012) [2024-06-15 11:44:00,722][1653645] Updated weights for policy 0, policy_version 16085 (0.0121) [2024-06-15 11:44:00,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 43690.5, 300 sec: 43653.6). Total num frames: 32964608. Throughput: 0: 10990.9. Samples: 8314368. Policy #0 lag: (min: 3.0, avg: 79.2, max: 259.0) [2024-06-15 11:44:00,959][1648982] Avg episode reward: [(0, '36.790')] [2024-06-15 11:44:02,380][1653645] Updated weights for policy 0, policy_version 16160 (0.0014) [2024-06-15 11:44:05,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 33161216. Throughput: 0: 10865.8. 
Samples: 8343040. Policy #0 lag: (min: 3.0, avg: 79.2, max: 259.0) [2024-06-15 11:44:05,958][1648982] Avg episode reward: [(0, '36.810')] [2024-06-15 11:44:06,526][1653645] Updated weights for policy 0, policy_version 16193 (0.0015) [2024-06-15 11:44:07,944][1653645] Updated weights for policy 0, policy_version 16252 (0.0011) [2024-06-15 11:44:10,961][1648982] Fps is (10 sec: 39311.6, 60 sec: 44234.8, 300 sec: 43653.2). Total num frames: 33357824. Throughput: 0: 10978.9. Samples: 8419328. Policy #0 lag: (min: 15.0, avg: 96.7, max: 271.0) [2024-06-15 11:44:10,962][1648982] Avg episode reward: [(0, '36.800')] [2024-06-15 11:44:11,687][1653645] Updated weights for policy 0, policy_version 16323 (0.0011) [2024-06-15 11:44:14,207][1653645] Updated weights for policy 0, policy_version 16432 (0.0171) [2024-06-15 11:44:15,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 33685504. Throughput: 0: 10729.3. Samples: 8471552. Policy #0 lag: (min: 15.0, avg: 96.7, max: 271.0) [2024-06-15 11:44:15,958][1648982] Avg episode reward: [(0, '36.690')] [2024-06-15 11:44:19,855][1653645] Updated weights for policy 0, policy_version 16487 (0.0014) [2024-06-15 11:44:20,958][1648982] Fps is (10 sec: 45888.2, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 33816576. Throughput: 0: 10786.1. Samples: 8514048. Policy #0 lag: (min: 15.0, avg: 96.7, max: 271.0) [2024-06-15 11:44:20,958][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 11:44:20,959][1651596] Saving new best policy, reward=36.920! [2024-06-15 11:44:22,563][1653645] Updated weights for policy 0, policy_version 16528 (0.0012) [2024-06-15 11:44:24,012][1653645] Updated weights for policy 0, policy_version 16596 (0.0012) [2024-06-15 11:44:25,296][1653645] Updated weights for policy 0, policy_version 16661 (0.0013) [2024-06-15 11:44:25,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 34177024. Throughput: 0: 11161.6. Samples: 8579584. 
Policy #0 lag: (min: 4.0, avg: 83.5, max: 260.0) [2024-06-15 11:44:25,959][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 11:44:26,066][1651596] Saving new best policy, reward=36.950! [2024-06-15 11:44:30,053][1653645] Updated weights for policy 0, policy_version 16707 (0.0012) [2024-06-15 11:44:30,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 42598.7, 300 sec: 44097.9). Total num frames: 34275328. Throughput: 0: 11298.1. Samples: 8657408. Policy #0 lag: (min: 4.0, avg: 83.5, max: 260.0) [2024-06-15 11:44:30,958][1648982] Avg episode reward: [(0, '36.820')] [2024-06-15 11:44:31,595][1653645] Updated weights for policy 0, policy_version 16768 (0.0013) [2024-06-15 11:44:34,584][1653645] Updated weights for policy 0, policy_version 16848 (0.0014) [2024-06-15 11:44:35,662][1651596] Signal inference workers to stop experience collection... (900 times) [2024-06-15 11:44:35,735][1653645] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-15 11:44:35,930][1651596] Signal inference workers to resume experience collection... (900 times) [2024-06-15 11:44:35,934][1653645] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-15 11:44:35,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 46421.2, 300 sec: 43986.9). Total num frames: 34635776. Throughput: 0: 11309.5. Samples: 8688128. Policy #0 lag: (min: 31.0, avg: 142.4, max: 287.0) [2024-06-15 11:44:35,959][1648982] Avg episode reward: [(0, '36.820')] [2024-06-15 11:44:36,200][1653645] Updated weights for policy 0, policy_version 16920 (0.0013) [2024-06-15 11:44:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 34734080. Throughput: 0: 11241.3. Samples: 8755712. 
Policy #0 lag: (min: 31.0, avg: 142.4, max: 287.0) [2024-06-15 11:44:40,958][1648982] Avg episode reward: [(0, '36.630')] [2024-06-15 11:44:42,041][1653645] Updated weights for policy 0, policy_version 16976 (0.0014) [2024-06-15 11:44:43,156][1653645] Updated weights for policy 0, policy_version 17022 (0.0013) [2024-06-15 11:44:45,724][1653645] Updated weights for policy 0, policy_version 17092 (0.0012) [2024-06-15 11:44:45,963][1648982] Fps is (10 sec: 39303.3, 60 sec: 46417.7, 300 sec: 43875.1). Total num frames: 35028992. Throughput: 0: 11342.5. Samples: 8824832. Policy #0 lag: (min: 31.0, avg: 142.4, max: 287.0) [2024-06-15 11:44:45,964][1648982] Avg episode reward: [(0, '36.890')] [2024-06-15 11:44:47,596][1653645] Updated weights for policy 0, policy_version 17184 (0.0013) [2024-06-15 11:44:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 35258368. Throughput: 0: 11275.4. Samples: 8850432. Policy #0 lag: (min: 79.0, avg: 192.1, max: 322.0) [2024-06-15 11:44:50,958][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 11:44:55,321][1653645] Updated weights for policy 0, policy_version 17270 (0.0013) [2024-06-15 11:44:55,958][1648982] Fps is (10 sec: 36060.8, 60 sec: 44236.7, 300 sec: 43986.8). Total num frames: 35389440. Throughput: 0: 11310.1. Samples: 8928256. Policy #0 lag: (min: 79.0, avg: 192.1, max: 322.0) [2024-06-15 11:44:55,959][1648982] Avg episode reward: [(0, '36.780')] [2024-06-15 11:44:56,265][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000017296_35422208.pth... 
[2024-06-15 11:44:56,411][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000012160_24903680.pth [2024-06-15 11:44:57,159][1653645] Updated weights for policy 0, policy_version 17337 (0.0031) [2024-06-15 11:44:58,972][1653645] Updated weights for policy 0, policy_version 17404 (0.0012) [2024-06-15 11:45:00,643][1653645] Updated weights for policy 0, policy_version 17445 (0.0012) [2024-06-15 11:45:00,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 46421.5, 300 sec: 43986.9). Total num frames: 35749888. Throughput: 0: 11309.6. Samples: 8980480. Policy #0 lag: (min: 15.0, avg: 177.5, max: 271.0) [2024-06-15 11:45:00,958][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 11:45:01,123][1651596] Saving new best policy, reward=36.970! [2024-06-15 11:45:05,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 44236.8, 300 sec: 44099.0). Total num frames: 35815424. Throughput: 0: 11218.5. Samples: 9018880. Policy #0 lag: (min: 15.0, avg: 177.5, max: 271.0) [2024-06-15 11:45:05,959][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 11:45:06,562][1653645] Updated weights for policy 0, policy_version 17506 (0.0018) [2024-06-15 11:45:08,354][1653645] Updated weights for policy 0, policy_version 17572 (0.0019) [2024-06-15 11:45:09,515][1653645] Updated weights for policy 0, policy_version 17616 (0.0013) [2024-06-15 11:45:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 46969.6, 300 sec: 44097.9). Total num frames: 36175872. Throughput: 0: 11411.9. Samples: 9093120. Policy #0 lag: (min: 15.0, avg: 177.5, max: 271.0) [2024-06-15 11:45:10,958][1648982] Avg episode reward: [(0, '36.860')] [2024-06-15 11:45:11,526][1653645] Updated weights for policy 0, policy_version 17683 (0.0012) [2024-06-15 11:45:15,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 36306944. Throughput: 0: 11082.0. Samples: 9156096. 
Policy #0 lag: (min: 15.0, avg: 177.5, max: 271.0) [2024-06-15 11:45:15,958][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 11:45:17,509][1653645] Updated weights for policy 0, policy_version 17744 (0.0013) [2024-06-15 11:45:18,518][1653645] Updated weights for policy 0, policy_version 17792 (0.0014) [2024-06-15 11:45:20,605][1653645] Updated weights for policy 0, policy_version 17849 (0.0014) [2024-06-15 11:45:20,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 45875.2, 300 sec: 44209.0). Total num frames: 36569088. Throughput: 0: 11173.0. Samples: 9190912. Policy #0 lag: (min: 15.0, avg: 87.7, max: 271.0) [2024-06-15 11:45:20,958][1648982] Avg episode reward: [(0, '36.980')] [2024-06-15 11:45:20,959][1651596] Saving new best policy, reward=36.980! [2024-06-15 11:45:22,559][1651596] Signal inference workers to stop experience collection... (950 times) [2024-06-15 11:45:22,584][1653645] Updated weights for policy 0, policy_version 17905 (0.0013) [2024-06-15 11:45:22,639][1653645] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-15 11:45:22,759][1651596] Signal inference workers to resume experience collection... (950 times) [2024-06-15 11:45:22,761][1653645] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-15 11:45:23,861][1653645] Updated weights for policy 0, policy_version 17976 (0.0252) [2024-06-15 11:45:25,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 44236.7, 300 sec: 43986.8). Total num frames: 36831232. Throughput: 0: 11070.5. Samples: 9253888. Policy #0 lag: (min: 15.0, avg: 87.7, max: 271.0) [2024-06-15 11:45:25,959][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 11:45:29,996][1653645] Updated weights for policy 0, policy_version 18043 (0.0020) [2024-06-15 11:45:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 36962304. Throughput: 0: 11105.9. Samples: 9324544. 
Policy #0 lag: (min: 15.0, avg: 87.7, max: 271.0) [2024-06-15 11:45:30,958][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 11:45:30,959][1651596] Saving new best policy, reward=37.030! [2024-06-15 11:45:33,045][1653645] Updated weights for policy 0, policy_version 18100 (0.0013) [2024-06-15 11:45:34,555][1653645] Updated weights for policy 0, policy_version 18148 (0.0012) [2024-06-15 11:45:35,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 37289984. Throughput: 0: 11172.9. Samples: 9353216. Policy #0 lag: (min: 9.0, avg: 97.2, max: 265.0) [2024-06-15 11:45:35,959][1648982] Avg episode reward: [(0, '36.790')] [2024-06-15 11:45:36,017][1653645] Updated weights for policy 0, policy_version 18209 (0.0013) [2024-06-15 11:45:40,442][1653645] Updated weights for policy 0, policy_version 18242 (0.0015) [2024-06-15 11:45:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 37388288. Throughput: 0: 11047.9. Samples: 9425408. Policy #0 lag: (min: 9.0, avg: 97.2, max: 265.0) [2024-06-15 11:45:40,958][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 11:45:43,715][1653645] Updated weights for policy 0, policy_version 18320 (0.0014) [2024-06-15 11:45:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44240.1, 300 sec: 43986.8). Total num frames: 37683200. Throughput: 0: 11366.3. Samples: 9491968. Policy #0 lag: (min: 0.0, avg: 115.1, max: 256.0) [2024-06-15 11:45:45,959][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 11:45:46,227][1653645] Updated weights for policy 0, policy_version 18416 (0.0012) [2024-06-15 11:45:46,231][1651596] Saving new best policy, reward=37.040! [2024-06-15 11:45:47,985][1653645] Updated weights for policy 0, policy_version 18480 (0.0012) [2024-06-15 11:45:50,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 37879808. Throughput: 0: 11025.1. Samples: 9515008. 
Policy #0 lag: (min: 0.0, avg: 115.1, max: 256.0) [2024-06-15 11:45:50,958][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 11:45:52,280][1653645] Updated weights for policy 0, policy_version 18512 (0.0016) [2024-06-15 11:45:55,432][1653645] Updated weights for policy 0, policy_version 18576 (0.0027) [2024-06-15 11:45:55,959][1648982] Fps is (10 sec: 39320.1, 60 sec: 44782.7, 300 sec: 43986.8). Total num frames: 38076416. Throughput: 0: 11081.8. Samples: 9591808. Policy #0 lag: (min: 0.0, avg: 115.1, max: 256.0) [2024-06-15 11:45:55,961][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 11:45:56,450][1653645] Updated weights for policy 0, policy_version 18622 (0.0048) [2024-06-15 11:45:58,025][1653645] Updated weights for policy 0, policy_version 18679 (0.0013) [2024-06-15 11:45:59,688][1653645] Updated weights for policy 0, policy_version 18751 (0.0027) [2024-06-15 11:46:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 38404096. Throughput: 0: 11104.7. Samples: 9655808. Policy #0 lag: (min: 74.0, avg: 187.0, max: 330.0) [2024-06-15 11:46:00,958][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 11:46:04,153][1653645] Updated weights for policy 0, policy_version 18810 (0.0014) [2024-06-15 11:46:05,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 45329.0, 300 sec: 44431.3). Total num frames: 38535168. Throughput: 0: 11263.9. Samples: 9697792. Policy #0 lag: (min: 74.0, avg: 187.0, max: 330.0) [2024-06-15 11:46:05,959][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 11:46:05,960][1651596] Saving new best policy, reward=37.050! [2024-06-15 11:46:07,868][1653645] Updated weights for policy 0, policy_version 18864 (0.0012) [2024-06-15 11:46:08,387][1651596] Signal inference workers to stop experience collection... 
(1000 times) [2024-06-15 11:46:08,403][1653645] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-15 11:46:08,566][1651596] Signal inference workers to resume experience collection... (1000 times) [2024-06-15 11:46:08,567][1653645] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-15 11:46:08,721][1653645] Updated weights for policy 0, policy_version 18899 (0.0012) [2024-06-15 11:46:10,458][1653645] Updated weights for policy 0, policy_version 18965 (0.0228) [2024-06-15 11:46:10,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 38862848. Throughput: 0: 11321.0. Samples: 9763328. Policy #0 lag: (min: 74.0, avg: 187.0, max: 330.0) [2024-06-15 11:46:10,958][1648982] Avg episode reward: [(0, '36.850')] [2024-06-15 11:46:14,679][1653645] Updated weights for policy 0, policy_version 19011 (0.0025) [2024-06-15 11:46:15,649][1653645] Updated weights for policy 0, policy_version 19058 (0.0015) [2024-06-15 11:46:15,959][1648982] Fps is (10 sec: 52427.8, 60 sec: 45874.9, 300 sec: 44431.1). Total num frames: 39059456. Throughput: 0: 11161.5. Samples: 9826816. Policy #0 lag: (min: 15.0, avg: 127.4, max: 271.0) [2024-06-15 11:46:15,960][1648982] Avg episode reward: [(0, '36.940')] [2024-06-15 11:46:19,417][1653645] Updated weights for policy 0, policy_version 19136 (0.0013) [2024-06-15 11:46:20,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 44209.3). Total num frames: 39256064. Throughput: 0: 11412.0. Samples: 9866752. Policy #0 lag: (min: 15.0, avg: 127.4, max: 271.0) [2024-06-15 11:46:20,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 11:46:21,708][1651596] Saving new best policy, reward=37.080! [2024-06-15 11:46:22,781][1653645] Updated weights for policy 0, policy_version 19234 (0.0015) [2024-06-15 11:46:25,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 43690.8, 300 sec: 43986.8). Total num frames: 39452672. Throughput: 0: 11036.4. Samples: 9922048. 
Policy #0 lag: (min: 15.0, avg: 127.4, max: 271.0) [2024-06-15 11:46:25,959][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 11:46:26,936][1653645] Updated weights for policy 0, policy_version 19283 (0.0022) [2024-06-15 11:46:30,317][1653645] Updated weights for policy 0, policy_version 19331 (0.0037) [2024-06-15 11:46:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 39649280. Throughput: 0: 11184.4. Samples: 9995264. Policy #0 lag: (min: 11.0, avg: 97.7, max: 267.0) [2024-06-15 11:46:30,958][1648982] Avg episode reward: [(0, '36.910')] [2024-06-15 11:46:31,554][1653645] Updated weights for policy 0, policy_version 19392 (0.0019) [2024-06-15 11:46:33,977][1653645] Updated weights for policy 0, policy_version 19472 (0.0013) [2024-06-15 11:46:35,961][1648982] Fps is (10 sec: 52429.4, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 39976960. Throughput: 0: 11343.6. Samples: 10025472. Policy #0 lag: (min: 11.0, avg: 97.7, max: 267.0) [2024-06-15 11:46:35,962][1648982] Avg episode reward: [(0, '37.060')] [2024-06-15 11:46:38,592][1653645] Updated weights for policy 0, policy_version 19521 (0.0012) [2024-06-15 11:46:40,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 40108032. Throughput: 0: 11127.6. Samples: 10092544. Policy #0 lag: (min: 11.0, avg: 97.7, max: 267.0) [2024-06-15 11:46:40,958][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 11:46:42,246][1653645] Updated weights for policy 0, policy_version 19600 (0.0013) [2024-06-15 11:46:44,226][1653645] Updated weights for policy 0, policy_version 19664 (0.0106) [2024-06-15 11:46:45,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 45329.2, 300 sec: 44542.3). Total num frames: 40402944. Throughput: 0: 11127.5. Samples: 10156544. 
Policy #0 lag: (min: 28.0, avg: 140.0, max: 284.0) [2024-06-15 11:46:45,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 11:46:46,298][1653645] Updated weights for policy 0, policy_version 19744 (0.0126) [2024-06-15 11:46:47,172][1653645] Updated weights for policy 0, policy_version 19776 (0.0011) [2024-06-15 11:46:50,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 40566784. Throughput: 0: 10934.1. Samples: 10189824. Policy #0 lag: (min: 28.0, avg: 140.0, max: 284.0) [2024-06-15 11:46:50,959][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 11:46:51,281][1653645] Updated weights for policy 0, policy_version 19824 (0.0012) [2024-06-15 11:46:54,766][1651596] Signal inference workers to stop experience collection... (1050 times) [2024-06-15 11:46:54,807][1653645] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-15 11:46:54,810][1653645] Updated weights for policy 0, policy_version 19874 (0.0013) [2024-06-15 11:46:55,080][1651596] Signal inference workers to resume experience collection... (1050 times) [2024-06-15 11:46:55,081][1653645] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-15 11:46:55,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 44783.2, 300 sec: 44431.2). Total num frames: 40763392. Throughput: 0: 11104.7. Samples: 10263040. Policy #0 lag: (min: 28.0, avg: 140.0, max: 284.0) [2024-06-15 11:46:55,958][1648982] Avg episode reward: [(0, '36.580')] [2024-06-15 11:46:56,253][1653645] Updated weights for policy 0, policy_version 19921 (0.0012) [2024-06-15 11:46:56,576][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000019936_40828928.pth... 
[2024-06-15 11:46:56,744][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000014720_30146560.pth [2024-06-15 11:46:58,448][1653645] Updated weights for policy 0, policy_version 20000 (0.0014) [2024-06-15 11:47:00,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 41025536. Throughput: 0: 10956.9. Samples: 10319872. Policy #0 lag: (min: 95.0, avg: 207.6, max: 351.0) [2024-06-15 11:47:00,959][1648982] Avg episode reward: [(0, '36.940')] [2024-06-15 11:47:03,302][1653645] Updated weights for policy 0, policy_version 20064 (0.0012) [2024-06-15 11:47:05,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 43690.9, 300 sec: 44320.8). Total num frames: 41156608. Throughput: 0: 10888.6. Samples: 10356736. Policy #0 lag: (min: 95.0, avg: 207.6, max: 351.0) [2024-06-15 11:47:05,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 11:47:05,988][1653645] Updated weights for policy 0, policy_version 20098 (0.0011) [2024-06-15 11:47:06,252][1651596] Saving new best policy, reward=37.100! [2024-06-15 11:47:07,206][1653645] Updated weights for policy 0, policy_version 20158 (0.0013) [2024-06-15 11:47:08,642][1653645] Updated weights for policy 0, policy_version 20196 (0.0010) [2024-06-15 11:47:10,344][1653645] Updated weights for policy 0, policy_version 20258 (0.0014) [2024-06-15 11:47:10,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44782.7, 300 sec: 44875.5). Total num frames: 41549824. Throughput: 0: 11161.6. Samples: 10424320. Policy #0 lag: (min: 95.0, avg: 207.6, max: 351.0) [2024-06-15 11:47:10,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 11:47:14,390][1653645] Updated weights for policy 0, policy_version 20307 (0.0020) [2024-06-15 11:47:15,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 41680896. Throughput: 0: 11036.4. Samples: 10491904. 
Policy #0 lag: (min: 5.0, avg: 116.5, max: 261.0) [2024-06-15 11:47:15,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 11:47:17,938][1653645] Updated weights for policy 0, policy_version 20368 (0.0013) [2024-06-15 11:47:19,846][1653645] Updated weights for policy 0, policy_version 20435 (0.0012) [2024-06-15 11:47:20,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 41910272. Throughput: 0: 11127.5. Samples: 10526208. Policy #0 lag: (min: 5.0, avg: 116.5, max: 261.0) [2024-06-15 11:47:20,959][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 11:47:21,117][1653645] Updated weights for policy 0, policy_version 20480 (0.0011) [2024-06-15 11:47:22,684][1653645] Updated weights for policy 0, policy_version 20542 (0.0013) [2024-06-15 11:47:25,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 42074112. Throughput: 0: 10956.8. Samples: 10585600. Policy #0 lag: (min: 5.0, avg: 116.5, max: 261.0) [2024-06-15 11:47:25,958][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 11:47:27,448][1653645] Updated weights for policy 0, policy_version 20592 (0.0011) [2024-06-15 11:47:30,267][1653645] Updated weights for policy 0, policy_version 20610 (0.0014) [2024-06-15 11:47:30,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 43144.5, 300 sec: 44209.1). Total num frames: 42237952. Throughput: 0: 11184.4. Samples: 10659840. Policy #0 lag: (min: 12.0, avg: 95.2, max: 268.0) [2024-06-15 11:47:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 11:47:31,343][1651596] Saving new best policy, reward=37.230! [2024-06-15 11:47:32,552][1653645] Updated weights for policy 0, policy_version 20692 (0.0012) [2024-06-15 11:47:34,297][1653645] Updated weights for policy 0, policy_version 20772 (0.0014) [2024-06-15 11:47:35,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 42598400. Throughput: 0: 10899.9. Samples: 10680320. 
Policy #0 lag: (min: 12.0, avg: 95.2, max: 268.0) [2024-06-15 11:47:35,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 11:47:38,606][1651596] Signal inference workers to stop experience collection... (1100 times) [2024-06-15 11:47:38,677][1653645] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-15 11:47:38,856][1651596] Signal inference workers to resume experience collection... (1100 times) [2024-06-15 11:47:38,858][1653645] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-15 11:47:38,989][1653645] Updated weights for policy 0, policy_version 20832 (0.0140) [2024-06-15 11:47:40,958][1648982] Fps is (10 sec: 49150.1, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 42729472. Throughput: 0: 10888.5. Samples: 10753024. Policy #0 lag: (min: 12.0, avg: 95.2, max: 268.0) [2024-06-15 11:47:40,959][1648982] Avg episode reward: [(0, '36.980')] [2024-06-15 11:47:42,445][1653645] Updated weights for policy 0, policy_version 20896 (0.0012) [2024-06-15 11:47:44,434][1653645] Updated weights for policy 0, policy_version 20946 (0.0018) [2024-06-15 11:47:45,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 44236.6, 300 sec: 44653.3). Total num frames: 43057152. Throughput: 0: 11013.7. Samples: 10815488. Policy #0 lag: (min: 47.0, avg: 163.2, max: 335.0) [2024-06-15 11:47:45,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 11:47:46,403][1653645] Updated weights for policy 0, policy_version 21047 (0.0083) [2024-06-15 11:47:50,958][1648982] Fps is (10 sec: 42600.2, 60 sec: 43144.7, 300 sec: 44098.0). Total num frames: 43155456. Throughput: 0: 10956.8. Samples: 10849792. 
Policy #0 lag: (min: 47.0, avg: 163.2, max: 335.0) [2024-06-15 11:47:50,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 11:47:51,393][1653645] Updated weights for policy 0, policy_version 21090 (0.0013) [2024-06-15 11:47:54,612][1653645] Updated weights for policy 0, policy_version 21137 (0.0046) [2024-06-15 11:47:55,958][1648982] Fps is (10 sec: 32768.9, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 43384832. Throughput: 0: 11002.4. Samples: 10919424. Policy #0 lag: (min: 47.0, avg: 163.2, max: 335.0) [2024-06-15 11:47:55,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 11:47:56,930][1653645] Updated weights for policy 0, policy_version 21234 (0.0012) [2024-06-15 11:47:58,426][1653645] Updated weights for policy 0, policy_version 21305 (0.0016) [2024-06-15 11:48:00,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 43646976. Throughput: 0: 10717.8. Samples: 10974208. Policy #0 lag: (min: 47.0, avg: 163.2, max: 335.0) [2024-06-15 11:48:00,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 11:48:03,591][1653645] Updated weights for policy 0, policy_version 21346 (0.0012) [2024-06-15 11:48:05,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 43778048. Throughput: 0: 10854.4. Samples: 11014656. Policy #0 lag: (min: 15.0, avg: 104.8, max: 271.0) [2024-06-15 11:48:05,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 11:48:07,445][1653645] Updated weights for policy 0, policy_version 21433 (0.0012) [2024-06-15 11:48:09,069][1653645] Updated weights for policy 0, policy_version 21490 (0.0014) [2024-06-15 11:48:10,362][1653645] Updated weights for policy 0, policy_version 21557 (0.0046) [2024-06-15 11:48:10,959][1648982] Fps is (10 sec: 52424.2, 60 sec: 43690.0, 300 sec: 44431.0). Total num frames: 44171264. Throughput: 0: 10865.5. Samples: 11074560. 
Policy #0 lag: (min: 15.0, avg: 104.8, max: 271.0) [2024-06-15 11:48:10,960][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:48:15,050][1653645] Updated weights for policy 0, policy_version 21600 (0.0049) [2024-06-15 11:48:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 44302336. Throughput: 0: 10877.1. Samples: 11149312. Policy #0 lag: (min: 15.0, avg: 104.8, max: 271.0) [2024-06-15 11:48:15,958][1648982] Avg episode reward: [(0, '36.820')] [2024-06-15 11:48:17,677][1653645] Updated weights for policy 0, policy_version 21648 (0.0124) [2024-06-15 11:48:19,592][1653645] Updated weights for policy 0, policy_version 21712 (0.0014) [2024-06-15 11:48:20,796][1653645] Updated weights for policy 0, policy_version 21776 (0.0017) [2024-06-15 11:48:20,943][1651596] Signal inference workers to stop experience collection... (1150 times) [2024-06-15 11:48:20,958][1648982] Fps is (10 sec: 42600.2, 60 sec: 44782.5, 300 sec: 44431.1). Total num frames: 44597248. Throughput: 0: 11172.8. Samples: 11183104. Policy #0 lag: (min: 29.0, avg: 112.6, max: 285.0) [2024-06-15 11:48:20,959][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 11:48:20,977][1653645] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-15 11:48:21,163][1651596] Signal inference workers to resume experience collection... (1150 times) [2024-06-15 11:48:21,164][1653645] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-15 11:48:21,636][1653645] Updated weights for policy 0, policy_version 21824 (0.0014) [2024-06-15 11:48:25,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 44209.1). Total num frames: 44761088. Throughput: 0: 11264.1. Samples: 11259904. 
Policy #0 lag: (min: 29.0, avg: 112.6, max: 285.0) [2024-06-15 11:48:25,959][1648982] Avg episode reward: [(0, '36.880')] [2024-06-15 11:48:26,523][1653645] Updated weights for policy 0, policy_version 21884 (0.0012) [2024-06-15 11:48:29,575][1653645] Updated weights for policy 0, policy_version 21947 (0.0013) [2024-06-15 11:48:30,958][1648982] Fps is (10 sec: 39324.4, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 44990464. Throughput: 0: 11298.2. Samples: 11323904. Policy #0 lag: (min: 29.0, avg: 112.6, max: 285.0) [2024-06-15 11:48:30,958][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 11:48:31,394][1653645] Updated weights for policy 0, policy_version 22000 (0.0029) [2024-06-15 11:48:32,573][1653645] Updated weights for policy 0, policy_version 22054 (0.0051) [2024-06-15 11:48:35,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 45219840. Throughput: 0: 11229.8. Samples: 11355136. Policy #0 lag: (min: 29.0, avg: 112.6, max: 285.0) [2024-06-15 11:48:35,958][1648982] Avg episode reward: [(0, '37.000')] [2024-06-15 11:48:37,200][1653645] Updated weights for policy 0, policy_version 22099 (0.0013) [2024-06-15 11:48:40,411][1653645] Updated weights for policy 0, policy_version 22161 (0.0013) [2024-06-15 11:48:40,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 45416448. Throughput: 0: 11332.2. Samples: 11429376. Policy #0 lag: (min: 15.0, avg: 114.8, max: 271.0) [2024-06-15 11:48:40,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 11:48:42,258][1653645] Updated weights for policy 0, policy_version 22224 (0.0013) [2024-06-15 11:48:43,632][1653645] Updated weights for policy 0, policy_version 22290 (0.0107) [2024-06-15 11:48:45,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44783.1, 300 sec: 44542.3). Total num frames: 45744128. Throughput: 0: 11480.2. Samples: 11490816. 
Policy #0 lag: (min: 15.0, avg: 114.8, max: 271.0) [2024-06-15 11:48:45,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 11:48:48,609][1653645] Updated weights for policy 0, policy_version 22352 (0.0012) [2024-06-15 11:48:49,785][1653645] Updated weights for policy 0, policy_version 22398 (0.0012) [2024-06-15 11:48:50,958][1648982] Fps is (10 sec: 45876.7, 60 sec: 45329.0, 300 sec: 44542.3). Total num frames: 45875200. Throughput: 0: 11594.0. Samples: 11536384. Policy #0 lag: (min: 15.0, avg: 114.8, max: 271.0) [2024-06-15 11:48:50,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:48:52,701][1653645] Updated weights for policy 0, policy_version 22448 (0.0012) [2024-06-15 11:48:55,040][1653645] Updated weights for policy 0, policy_version 22546 (0.0012) [2024-06-15 11:48:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 48059.7, 300 sec: 45097.7). Total num frames: 46268416. Throughput: 0: 11526.0. Samples: 11593216. Policy #0 lag: (min: 41.0, avg: 127.5, max: 297.0) [2024-06-15 11:48:55,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 11:48:56,024][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000022592_46268416.pth... [2024-06-15 11:48:56,118][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000017296_35422208.pth [2024-06-15 11:48:56,123][1651596] Saving new best policy, reward=37.250! [2024-06-15 11:49:00,609][1653645] Updated weights for policy 0, policy_version 22608 (0.0013) [2024-06-15 11:49:00,986][1648982] Fps is (10 sec: 42477.4, 60 sec: 44216.0, 300 sec: 44538.0). Total num frames: 46301184. Throughput: 0: 11518.4. Samples: 11667968. 
Policy #0 lag: (min: 41.0, avg: 127.5, max: 297.0) [2024-06-15 11:49:00,987][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 11:49:03,637][1653645] Updated weights for policy 0, policy_version 22676 (0.0013) [2024-06-15 11:49:04,576][1653645] Updated weights for policy 0, policy_version 22719 (0.0013) [2024-06-15 11:49:05,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 47513.4, 300 sec: 44987.0). Total num frames: 46628864. Throughput: 0: 11423.4. Samples: 11697152. Policy #0 lag: (min: 41.0, avg: 127.5, max: 297.0) [2024-06-15 11:49:05,959][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 11:49:06,040][1651596] Signal inference workers to stop experience collection... (1200 times) [2024-06-15 11:49:06,091][1653645] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-15 11:49:06,201][1651596] Signal inference workers to resume experience collection... (1200 times) [2024-06-15 11:49:06,202][1653645] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-15 11:49:06,385][1653645] Updated weights for policy 0, policy_version 22789 (0.0012) [2024-06-15 11:49:10,958][1648982] Fps is (10 sec: 49292.5, 60 sec: 43691.5, 300 sec: 44431.2). Total num frames: 46792704. Throughput: 0: 11252.6. Samples: 11766272. Policy #0 lag: (min: 41.0, avg: 127.5, max: 297.0) [2024-06-15 11:49:10,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 11:49:11,540][1653645] Updated weights for policy 0, policy_version 22849 (0.0013) [2024-06-15 11:49:12,985][1653645] Updated weights for policy 0, policy_version 22912 (0.0011) [2024-06-15 11:49:15,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 47054848. Throughput: 0: 11389.1. Samples: 11836416. Policy #0 lag: (min: 15.0, avg: 100.4, max: 271.0) [2024-06-15 11:49:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 11:49:15,959][1651596] Saving new best policy, reward=37.280! 
[2024-06-15 11:49:16,582][1653645] Updated weights for policy 0, policy_version 22981 (0.0014) [2024-06-15 11:49:18,044][1653645] Updated weights for policy 0, policy_version 23042 (0.0017) [2024-06-15 11:49:19,366][1653645] Updated weights for policy 0, policy_version 23097 (0.0015) [2024-06-15 11:49:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.6, 300 sec: 44542.3). Total num frames: 47316992. Throughput: 0: 11366.4. Samples: 11866624. Policy #0 lag: (min: 15.0, avg: 100.4, max: 271.0) [2024-06-15 11:49:20,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 11:49:24,119][1653645] Updated weights for policy 0, policy_version 23156 (0.0016) [2024-06-15 11:49:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 47448064. Throughput: 0: 11377.8. Samples: 11941376. Policy #0 lag: (min: 15.0, avg: 100.4, max: 271.0) [2024-06-15 11:49:25,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 11:49:27,302][1653645] Updated weights for policy 0, policy_version 23229 (0.0113) [2024-06-15 11:49:28,654][1653645] Updated weights for policy 0, policy_version 23269 (0.0013) [2024-06-15 11:49:30,287][1653645] Updated weights for policy 0, policy_version 23344 (0.0014) [2024-06-15 11:49:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 47513.6, 300 sec: 44764.4). Total num frames: 47841280. Throughput: 0: 11298.1. Samples: 11999232. Policy #0 lag: (min: 18.0, avg: 140.7, max: 271.0) [2024-06-15 11:49:30,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 11:49:35,614][1653645] Updated weights for policy 0, policy_version 23394 (0.0025) [2024-06-15 11:49:36,012][1648982] Fps is (10 sec: 48885.3, 60 sec: 45287.9, 300 sec: 44756.1). Total num frames: 47939584. Throughput: 0: 11170.8. Samples: 12039680. 
Policy #0 lag: (min: 18.0, avg: 140.7, max: 271.0) [2024-06-15 11:49:36,013][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 11:49:38,649][1653645] Updated weights for policy 0, policy_version 23472 (0.0029) [2024-06-15 11:49:40,193][1653645] Updated weights for policy 0, policy_version 23524 (0.0014) [2024-06-15 11:49:40,960][1648982] Fps is (10 sec: 39321.4, 60 sec: 46967.6, 300 sec: 44765.1). Total num frames: 48234496. Throughput: 0: 11446.0. Samples: 12108288. Policy #0 lag: (min: 18.0, avg: 140.7, max: 271.0) [2024-06-15 11:49:40,961][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 11:49:42,144][1653645] Updated weights for policy 0, policy_version 23609 (0.0012) [2024-06-15 11:49:45,958][1648982] Fps is (10 sec: 42832.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 48365568. Throughput: 0: 11271.1. Samples: 12174848. Policy #0 lag: (min: 18.0, avg: 140.7, max: 271.0) [2024-06-15 11:49:45,958][1648982] Avg episode reward: [(0, '36.980')] [2024-06-15 11:49:48,534][1653645] Updated weights for policy 0, policy_version 23680 (0.0014) [2024-06-15 11:49:50,490][1651596] Signal inference workers to stop experience collection... (1250 times) [2024-06-15 11:49:50,543][1653645] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-15 11:49:50,743][1651596] Signal inference workers to resume experience collection... (1250 times) [2024-06-15 11:49:50,744][1653645] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-15 11:49:50,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45875.0, 300 sec: 44875.5). Total num frames: 48627712. Throughput: 0: 11434.7. Samples: 12211712. 
Policy #0 lag: (min: 15.0, avg: 90.3, max: 271.0) [2024-06-15 11:49:50,959][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 11:49:51,697][1653645] Updated weights for policy 0, policy_version 23777 (0.0013) [2024-06-15 11:49:52,998][1653645] Updated weights for policy 0, policy_version 23840 (0.0013) [2024-06-15 11:49:55,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 48889856. Throughput: 0: 11104.7. Samples: 12265984. Policy #0 lag: (min: 15.0, avg: 90.3, max: 271.0) [2024-06-15 11:49:55,958][1648982] Avg episode reward: [(0, '36.860')] [2024-06-15 11:50:00,629][1653645] Updated weights for policy 0, policy_version 23907 (0.0041) [2024-06-15 11:50:00,960][1648982] Fps is (10 sec: 36045.3, 60 sec: 44804.2, 300 sec: 44653.3). Total num frames: 48988160. Throughput: 0: 11309.5. Samples: 12345344. Policy #0 lag: (min: 15.0, avg: 90.3, max: 271.0) [2024-06-15 11:50:00,962][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 11:50:02,724][1653645] Updated weights for policy 0, policy_version 23993 (0.0014) [2024-06-15 11:50:04,709][1653645] Updated weights for policy 0, policy_version 24080 (0.0100) [2024-06-15 11:50:05,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 46421.5, 300 sec: 44875.5). Total num frames: 49414144. Throughput: 0: 11150.2. Samples: 12368384. Policy #0 lag: (min: 47.0, avg: 169.5, max: 317.0) [2024-06-15 11:50:05,958][1648982] Avg episode reward: [(0, '36.940')] [2024-06-15 11:50:10,958][1648982] Fps is (10 sec: 42595.9, 60 sec: 43690.2, 300 sec: 44431.1). Total num frames: 49414144. Throughput: 0: 11150.1. Samples: 12443136. 
Policy #0 lag: (min: 47.0, avg: 169.5, max: 317.0) [2024-06-15 11:50:10,959][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 11:50:11,794][1653645] Updated weights for policy 0, policy_version 24149 (0.0022) [2024-06-15 11:50:13,729][1653645] Updated weights for policy 0, policy_version 24230 (0.0011) [2024-06-15 11:50:15,250][1653645] Updated weights for policy 0, policy_version 24304 (0.0016) [2024-06-15 11:50:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 49840128. Throughput: 0: 11229.9. Samples: 12504576. Policy #0 lag: (min: 47.0, avg: 169.5, max: 317.0) [2024-06-15 11:50:15,958][1648982] Avg episode reward: [(0, '36.700')] [2024-06-15 11:50:16,750][1653645] Updated weights for policy 0, policy_version 24377 (0.0014) [2024-06-15 11:50:20,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.4, 300 sec: 44431.2). Total num frames: 49938432. Throughput: 0: 11129.5. Samples: 12539904. Policy #0 lag: (min: 47.0, avg: 169.5, max: 317.0) [2024-06-15 11:50:20,960][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 11:50:23,199][1653645] Updated weights for policy 0, policy_version 24440 (0.0014) [2024-06-15 11:50:24,293][1653645] Updated weights for policy 0, policy_version 24480 (0.0015) [2024-06-15 11:50:25,856][1653645] Updated weights for policy 0, policy_version 24552 (0.0011) [2024-06-15 11:50:25,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 45097.6). Total num frames: 50266112. Throughput: 0: 11400.5. Samples: 12621312. Policy #0 lag: (min: 15.0, avg: 76.5, max: 271.0) [2024-06-15 11:50:25,958][1648982] Avg episode reward: [(0, '36.860')] [2024-06-15 11:50:26,819][1651596] Signal inference workers to stop experience collection... (1300 times) [2024-06-15 11:50:26,879][1653645] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-15 11:50:27,047][1651596] Signal inference workers to resume experience collection... 
(1300 times) [2024-06-15 11:50:27,047][1653645] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-15 11:50:27,660][1653645] Updated weights for policy 0, policy_version 24635 (0.0030) [2024-06-15 11:50:30,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.5, 300 sec: 44653.3). Total num frames: 50462720. Throughput: 0: 11309.5. Samples: 12683776. Policy #0 lag: (min: 15.0, avg: 76.5, max: 271.0) [2024-06-15 11:50:30,959][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 11:50:34,494][1653645] Updated weights for policy 0, policy_version 24695 (0.0011) [2024-06-15 11:50:35,692][1653645] Updated weights for policy 0, policy_version 24736 (0.0011) [2024-06-15 11:50:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45370.4, 300 sec: 44986.6). Total num frames: 50659328. Throughput: 0: 11537.1. Samples: 12730880. Policy #0 lag: (min: 15.0, avg: 76.5, max: 271.0) [2024-06-15 11:50:35,958][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 11:50:37,480][1653645] Updated weights for policy 0, policy_version 24816 (0.0013) [2024-06-15 11:50:38,584][1653645] Updated weights for policy 0, policy_version 24865 (0.0028) [2024-06-15 11:50:40,959][1648982] Fps is (10 sec: 52430.2, 60 sec: 45875.3, 300 sec: 45097.7). Total num frames: 50987008. Throughput: 0: 11605.3. Samples: 12788224. Policy #0 lag: (min: 104.0, avg: 236.7, max: 328.0) [2024-06-15 11:50:40,960][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 11:50:45,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 51052544. Throughput: 0: 11582.6. Samples: 12866560. 
Policy #0 lag: (min: 104.0, avg: 236.7, max: 328.0) [2024-06-15 11:50:45,958][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 11:50:46,247][1653645] Updated weights for policy 0, policy_version 24944 (0.0135) [2024-06-15 11:50:48,398][1653645] Updated weights for policy 0, policy_version 25040 (0.0079) [2024-06-15 11:50:49,363][1653645] Updated weights for policy 0, policy_version 25088 (0.0012) [2024-06-15 11:50:50,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 47513.7, 300 sec: 45431.0). Total num frames: 51478528. Throughput: 0: 11514.3. Samples: 12886528. Policy #0 lag: (min: 104.0, avg: 236.7, max: 328.0) [2024-06-15 11:50:50,959][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 11:50:50,978][1653645] Updated weights for policy 0, policy_version 25148 (0.0014) [2024-06-15 11:50:55,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 51511296. Throughput: 0: 11366.5. Samples: 12954624. Policy #0 lag: (min: 104.0, avg: 236.7, max: 328.0) [2024-06-15 11:50:55,959][1648982] Avg episode reward: [(0, '36.670')] [2024-06-15 11:50:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000025152_51511296.pth... [2024-06-15 11:50:56,016][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000019936_40828928.pth [2024-06-15 11:50:56,021][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000025152_51511296.pth [2024-06-15 11:50:59,254][1653645] Updated weights for policy 0, policy_version 25216 (0.0014) [2024-06-15 11:51:00,470][1653645] Updated weights for policy 0, policy_version 25268 (0.0012) [2024-06-15 11:51:00,958][1648982] Fps is (10 sec: 29490.3, 60 sec: 46421.1, 300 sec: 44875.5). Total num frames: 51773440. Throughput: 0: 11616.6. Samples: 13027328. 
Policy #0 lag: (min: 15.0, avg: 56.1, max: 271.0) [2024-06-15 11:51:00,959][1648982] Avg episode reward: [(0, '36.770')] [2024-06-15 11:51:01,736][1653645] Updated weights for policy 0, policy_version 25328 (0.0013) [2024-06-15 11:51:03,308][1653645] Updated weights for policy 0, policy_version 25399 (0.0012) [2024-06-15 11:51:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 52035584. Throughput: 0: 11412.0. Samples: 13053440. Policy #0 lag: (min: 15.0, avg: 56.1, max: 271.0) [2024-06-15 11:51:05,959][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 11:51:10,005][1653645] Updated weights for policy 0, policy_version 25440 (0.0011) [2024-06-15 11:51:10,961][1648982] Fps is (10 sec: 39322.9, 60 sec: 45875.6, 300 sec: 44431.2). Total num frames: 52166656. Throughput: 0: 11434.7. Samples: 13135872. Policy #0 lag: (min: 15.0, avg: 56.1, max: 271.0) [2024-06-15 11:51:10,963][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 11:51:11,238][1651596] Signal inference workers to stop experience collection... (1350 times) [2024-06-15 11:51:11,288][1653645] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-15 11:51:11,521][1651596] Signal inference workers to resume experience collection... (1350 times) [2024-06-15 11:51:11,522][1653645] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-15 11:51:11,938][1653645] Updated weights for policy 0, policy_version 25520 (0.0013) [2024-06-15 11:51:13,706][1653645] Updated weights for policy 0, policy_version 25616 (0.0013) [2024-06-15 11:51:15,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 45328.9, 300 sec: 45097.6). Total num frames: 52559872. Throughput: 0: 11161.6. Samples: 13186048. Policy #0 lag: (min: 15.0, avg: 56.1, max: 271.0) [2024-06-15 11:51:15,958][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 11:51:20,960][1648982] Fps is (10 sec: 42598.5, 60 sec: 44237.1, 300 sec: 44542.3). 
Total num frames: 52592640. Throughput: 0: 10956.8. Samples: 13223936. Policy #0 lag: (min: 6.0, avg: 62.8, max: 262.0) [2024-06-15 11:51:20,961][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 11:51:21,572][1653645] Updated weights for policy 0, policy_version 25699 (0.0057) [2024-06-15 11:51:23,505][1653645] Updated weights for policy 0, policy_version 25792 (0.0013) [2024-06-15 11:51:24,505][1653645] Updated weights for policy 0, policy_version 25848 (0.0102) [2024-06-15 11:51:25,824][1653645] Updated weights for policy 0, policy_version 25892 (0.0021) [2024-06-15 11:51:25,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 53018624. Throughput: 0: 11161.6. Samples: 13290496. Policy #0 lag: (min: 6.0, avg: 62.8, max: 262.0) [2024-06-15 11:51:25,958][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 11:51:30,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 53084160. Throughput: 0: 11059.2. Samples: 13364224. Policy #0 lag: (min: 6.0, avg: 62.8, max: 262.0) [2024-06-15 11:51:30,958][1648982] Avg episode reward: [(0, '36.890')] [2024-06-15 11:51:32,525][1653645] Updated weights for policy 0, policy_version 25939 (0.0013) [2024-06-15 11:51:33,626][1653645] Updated weights for policy 0, policy_version 25987 (0.0013) [2024-06-15 11:51:35,391][1653645] Updated weights for policy 0, policy_version 26080 (0.0123) [2024-06-15 11:51:35,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 45208.7). Total num frames: 53444608. Throughput: 0: 11377.8. Samples: 13398528. Policy #0 lag: (min: 6.0, avg: 62.8, max: 262.0) [2024-06-15 11:51:35,959][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 11:51:37,236][1653645] Updated weights for policy 0, policy_version 26144 (0.0011) [2024-06-15 11:51:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 53608448. Throughput: 0: 11195.7. Samples: 13458432. 
Policy #0 lag: (min: 6.0, avg: 62.8, max: 262.0) [2024-06-15 11:51:40,958][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 11:51:44,917][1653645] Updated weights for policy 0, policy_version 26197 (0.0012) [2024-06-15 11:51:45,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 53772288. Throughput: 0: 11264.1. Samples: 13534208. Policy #0 lag: (min: 15.0, avg: 68.6, max: 271.0) [2024-06-15 11:51:45,959][1648982] Avg episode reward: [(0, '36.980')] [2024-06-15 11:51:46,035][1653645] Updated weights for policy 0, policy_version 26256 (0.0118) [2024-06-15 11:51:47,170][1653645] Updated weights for policy 0, policy_version 26320 (0.0228) [2024-06-15 11:51:48,975][1651596] Signal inference workers to stop experience collection... (1400 times) [2024-06-15 11:51:49,049][1653645] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-15 11:51:49,176][1651596] Signal inference workers to resume experience collection... (1400 times) [2024-06-15 11:51:49,177][1653645] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-15 11:51:49,341][1653645] Updated weights for policy 0, policy_version 26401 (0.0013) [2024-06-15 11:51:50,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 44236.9, 300 sec: 45319.9). Total num frames: 54132736. Throughput: 0: 11264.0. Samples: 13560320. Policy #0 lag: (min: 15.0, avg: 68.6, max: 271.0) [2024-06-15 11:51:50,958][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 11:51:55,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 54132736. Throughput: 0: 10979.5. Samples: 13629952. 
Policy #0 lag: (min: 15.0, avg: 68.6, max: 271.0) [2024-06-15 11:51:55,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 11:51:56,729][1653645] Updated weights for policy 0, policy_version 26453 (0.0018) [2024-06-15 11:51:58,043][1653645] Updated weights for policy 0, policy_version 26515 (0.0012) [2024-06-15 11:51:59,715][1653645] Updated weights for policy 0, policy_version 26579 (0.0011) [2024-06-15 11:52:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45875.6, 300 sec: 45319.8). Total num frames: 54525952. Throughput: 0: 11161.7. Samples: 13688320. Policy #0 lag: (min: 15.0, avg: 68.6, max: 271.0) [2024-06-15 11:52:00,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 11:52:01,070][1653645] Updated weights for policy 0, policy_version 26629 (0.0051) [2024-06-15 11:52:02,282][1653645] Updated weights for policy 0, policy_version 26681 (0.0021) [2024-06-15 11:52:05,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 54657024. Throughput: 0: 11104.7. Samples: 13723648. Policy #0 lag: (min: 15.0, avg: 68.6, max: 271.0) [2024-06-15 11:52:05,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 11:52:08,926][1653645] Updated weights for policy 0, policy_version 26737 (0.0011) [2024-06-15 11:52:10,862][1653645] Updated weights for policy 0, policy_version 26818 (0.0133) [2024-06-15 11:52:10,958][1648982] Fps is (10 sec: 39318.3, 60 sec: 45874.7, 300 sec: 44875.4). Total num frames: 54919168. Throughput: 0: 11206.9. Samples: 13794816. Policy #0 lag: (min: 15.0, avg: 75.8, max: 271.0) [2024-06-15 11:52:10,959][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 11:52:12,109][1653645] Updated weights for policy 0, policy_version 26871 (0.0014) [2024-06-15 11:52:13,401][1653645] Updated weights for policy 0, policy_version 26915 (0.0013) [2024-06-15 11:52:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.9, 300 sec: 44986.6). Total num frames: 55181312. Throughput: 0: 11059.2. 
Samples: 13861888. Policy #0 lag: (min: 15.0, avg: 75.8, max: 271.0) [2024-06-15 11:52:15,958][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 11:52:19,645][1653645] Updated weights for policy 0, policy_version 26962 (0.0012) [2024-06-15 11:52:20,958][1648982] Fps is (10 sec: 42601.0, 60 sec: 45875.1, 300 sec: 44986.5). Total num frames: 55345152. Throughput: 0: 11150.2. Samples: 13900288. Policy #0 lag: (min: 15.0, avg: 75.8, max: 271.0) [2024-06-15 11:52:20,959][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 11:52:21,291][1653645] Updated weights for policy 0, policy_version 27040 (0.0101) [2024-06-15 11:52:23,241][1653645] Updated weights for policy 0, policy_version 27121 (0.0013) [2024-06-15 11:52:25,139][1653645] Updated weights for policy 0, policy_version 27154 (0.0014) [2024-06-15 11:52:25,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 44236.8, 300 sec: 45542.0). Total num frames: 55672832. Throughput: 0: 11150.2. Samples: 13960192. Policy #0 lag: (min: 15.0, avg: 75.8, max: 271.0) [2024-06-15 11:52:25,958][1648982] Avg episode reward: [(0, '36.860')] [2024-06-15 11:52:26,161][1653645] Updated weights for policy 0, policy_version 27200 (0.0014) [2024-06-15 11:52:30,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 55705600. Throughput: 0: 11104.7. Samples: 14033920. Policy #0 lag: (min: 15.0, avg: 75.8, max: 271.0) [2024-06-15 11:52:30,959][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 11:52:32,509][1651596] Signal inference workers to stop experience collection... (1450 times) [2024-06-15 11:52:32,564][1653645] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-15 11:52:32,837][1651596] Signal inference workers to resume experience collection... 
(1450 times) [2024-06-15 11:52:32,838][1653645] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-15 11:52:33,544][1653645] Updated weights for policy 0, policy_version 27325 (0.0159) [2024-06-15 11:52:35,175][1653645] Updated weights for policy 0, policy_version 27388 (0.0012) [2024-06-15 11:52:35,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44236.9, 300 sec: 45319.9). Total num frames: 56098816. Throughput: 0: 10990.9. Samples: 14054912. Policy #0 lag: (min: 15.0, avg: 78.0, max: 271.0) [2024-06-15 11:52:35,958][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 11:52:38,079][1653645] Updated weights for policy 0, policy_version 27445 (0.0014) [2024-06-15 11:52:40,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 44653.3). Total num frames: 56229888. Throughput: 0: 10899.9. Samples: 14120448. Policy #0 lag: (min: 15.0, avg: 78.0, max: 271.0) [2024-06-15 11:52:40,959][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 11:52:43,824][1653645] Updated weights for policy 0, policy_version 27475 (0.0011) [2024-06-15 11:52:45,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 56426496. Throughput: 0: 10968.1. Samples: 14181888. Policy #0 lag: (min: 15.0, avg: 78.0, max: 271.0) [2024-06-15 11:52:45,958][1648982] Avg episode reward: [(0, '36.760')] [2024-06-15 11:52:46,053][1653645] Updated weights for policy 0, policy_version 27568 (0.0013) [2024-06-15 11:52:47,743][1653645] Updated weights for policy 0, policy_version 27639 (0.0025) [2024-06-15 11:52:50,510][1653645] Updated weights for policy 0, policy_version 27681 (0.0013) [2024-06-15 11:52:50,958][1648982] Fps is (10 sec: 49153.3, 60 sec: 43144.5, 300 sec: 45208.7). Total num frames: 56721408. Throughput: 0: 10831.6. Samples: 14211072. 
Policy #0 lag: (min: 15.0, avg: 154.3, max: 271.0) [2024-06-15 11:52:50,959][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 11:52:55,927][1653645] Updated weights for policy 0, policy_version 27713 (0.0024) [2024-06-15 11:52:55,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 56754176. Throughput: 0: 10877.3. Samples: 14284288. Policy #0 lag: (min: 15.0, avg: 154.3, max: 271.0) [2024-06-15 11:52:55,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 11:52:56,553][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000027744_56819712.pth... [2024-06-15 11:52:56,715][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000022592_46268416.pth [2024-06-15 11:52:57,822][1653645] Updated weights for policy 0, policy_version 27792 (0.0027) [2024-06-15 11:52:59,775][1653645] Updated weights for policy 0, policy_version 27858 (0.0010) [2024-06-15 11:53:00,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 57147392. Throughput: 0: 10547.2. Samples: 14336512. Policy #0 lag: (min: 15.0, avg: 154.3, max: 271.0) [2024-06-15 11:53:00,960][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 11:53:02,739][1653645] Updated weights for policy 0, policy_version 27924 (0.0012) [2024-06-15 11:53:03,551][1653645] Updated weights for policy 0, policy_version 27967 (0.0013) [2024-06-15 11:53:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.6, 300 sec: 44431.4). Total num frames: 57278464. Throughput: 0: 10433.5. Samples: 14369792. Policy #0 lag: (min: 15.0, avg: 154.3, max: 271.0) [2024-06-15 11:53:05,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 11:53:09,704][1653645] Updated weights for policy 0, policy_version 28034 (0.0011) [2024-06-15 11:53:10,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 43145.1, 300 sec: 44764.4). Total num frames: 57507840. Throughput: 0: 10740.6. Samples: 14443520. 
Policy #0 lag: (min: 10.0, avg: 72.8, max: 266.0) [2024-06-15 11:53:10,958][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 11:53:11,308][1653645] Updated weights for policy 0, policy_version 28096 (0.0010) [2024-06-15 11:53:12,807][1653645] Updated weights for policy 0, policy_version 28160 (0.0013) [2024-06-15 11:53:14,905][1651596] Signal inference workers to stop experience collection... (1500 times) [2024-06-15 11:53:14,942][1653645] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-15 11:53:15,168][1651596] Signal inference workers to resume experience collection... (1500 times) [2024-06-15 11:53:15,170][1653645] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-15 11:53:15,420][1653645] Updated weights for policy 0, policy_version 28223 (0.0014) [2024-06-15 11:53:15,960][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44764.5). Total num frames: 57802752. Throughput: 0: 10399.3. Samples: 14501888. Policy #0 lag: (min: 10.0, avg: 72.8, max: 266.0) [2024-06-15 11:53:15,962][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 11:53:20,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 41506.3, 300 sec: 44320.1). Total num frames: 57835520. Throughput: 0: 10763.4. Samples: 14539264. Policy #0 lag: (min: 10.0, avg: 72.8, max: 266.0) [2024-06-15 11:53:20,958][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 11:53:22,485][1653645] Updated weights for policy 0, policy_version 28306 (0.0017) [2024-06-15 11:53:24,705][1653645] Updated weights for policy 0, policy_version 28387 (0.0015) [2024-06-15 11:53:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 44764.4). Total num frames: 58195968. Throughput: 0: 10626.9. Samples: 14598656. 
Policy #0 lag: (min: 10.0, avg: 72.8, max: 266.0) [2024-06-15 11:53:25,958][1648982] Avg episode reward: [(0, '36.910')] [2024-06-15 11:53:27,275][1653645] Updated weights for policy 0, policy_version 28448 (0.0013) [2024-06-15 11:53:30,958][1648982] Fps is (10 sec: 49150.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 58327040. Throughput: 0: 10786.1. Samples: 14667264. Policy #0 lag: (min: 31.0, avg: 184.8, max: 287.0) [2024-06-15 11:53:30,959][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 11:53:32,416][1653645] Updated weights for policy 0, policy_version 28486 (0.0017) [2024-06-15 11:53:34,699][1653645] Updated weights for policy 0, policy_version 28576 (0.0018) [2024-06-15 11:53:35,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 42052.1, 300 sec: 44764.5). Total num frames: 58621952. Throughput: 0: 10854.4. Samples: 14699520. Policy #0 lag: (min: 31.0, avg: 184.8, max: 287.0) [2024-06-15 11:53:35,959][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 11:53:36,736][1653645] Updated weights for policy 0, policy_version 28662 (0.0013) [2024-06-15 11:53:39,542][1653645] Updated weights for policy 0, policy_version 28706 (0.0012) [2024-06-15 11:53:40,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 58851328. Throughput: 0: 10626.9. Samples: 14762496. Policy #0 lag: (min: 31.0, avg: 184.8, max: 287.0) [2024-06-15 11:53:40,959][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 11:53:45,002][1653645] Updated weights for policy 0, policy_version 28771 (0.0014) [2024-06-15 11:53:45,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 58982400. Throughput: 0: 10979.5. Samples: 14830592. 
Policy #0 lag: (min: 31.0, avg: 184.8, max: 287.0) [2024-06-15 11:53:45,958][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 11:53:46,888][1653645] Updated weights for policy 0, policy_version 28848 (0.0192) [2024-06-15 11:53:48,894][1653645] Updated weights for policy 0, policy_version 28919 (0.0013) [2024-06-15 11:53:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42052.1, 300 sec: 43986.9). Total num frames: 59244544. Throughput: 0: 10717.8. Samples: 14852096. Policy #0 lag: (min: 63.0, avg: 148.8, max: 335.0) [2024-06-15 11:53:50,959][1648982] Avg episode reward: [(0, '36.940')] [2024-06-15 11:53:52,420][1653645] Updated weights for policy 0, policy_version 28988 (0.0013) [2024-06-15 11:53:55,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.8, 300 sec: 44324.4). Total num frames: 59375616. Throughput: 0: 10740.6. Samples: 14926848. Policy #0 lag: (min: 63.0, avg: 148.8, max: 335.0) [2024-06-15 11:53:55,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 11:53:57,447][1653645] Updated weights for policy 0, policy_version 29032 (0.0069) [2024-06-15 11:53:59,300][1653645] Updated weights for policy 0, policy_version 29112 (0.0027) [2024-06-15 11:54:00,109][1651596] Signal inference workers to stop experience collection... (1550 times) [2024-06-15 11:54:00,156][1653645] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-15 11:54:00,296][1651596] Signal inference workers to resume experience collection... (1550 times) [2024-06-15 11:54:00,297][1653645] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-15 11:54:00,788][1653645] Updated weights for policy 0, policy_version 29176 (0.0016) [2024-06-15 11:54:00,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.4, 300 sec: 44542.3). Total num frames: 59768832. Throughput: 0: 10752.0. Samples: 14985728. 
Policy #0 lag: (min: 63.0, avg: 148.8, max: 335.0) [2024-06-15 11:54:00,958][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 11:54:04,153][1653645] Updated weights for policy 0, policy_version 29242 (0.0043) [2024-06-15 11:54:05,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 59899904. Throughput: 0: 10888.5. Samples: 15029248. Policy #0 lag: (min: 63.0, avg: 148.8, max: 335.0) [2024-06-15 11:54:05,959][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 11:54:08,554][1653645] Updated weights for policy 0, policy_version 29281 (0.0015) [2024-06-15 11:54:10,564][1653645] Updated weights for policy 0, policy_version 29362 (0.0141) [2024-06-15 11:54:10,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 60162048. Throughput: 0: 10979.6. Samples: 15092736. Policy #0 lag: (min: 15.0, avg: 95.1, max: 271.0) [2024-06-15 11:54:10,958][1648982] Avg episode reward: [(0, '37.060')] [2024-06-15 11:54:12,186][1653645] Updated weights for policy 0, policy_version 29413 (0.0014) [2024-06-15 11:54:15,722][1653645] Updated weights for policy 0, policy_version 29472 (0.0013) [2024-06-15 11:54:15,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 60358656. Throughput: 0: 11025.1. Samples: 15163392. Policy #0 lag: (min: 15.0, avg: 95.1, max: 271.0) [2024-06-15 11:54:15,959][1648982] Avg episode reward: [(0, '37.010')] [2024-06-15 11:54:19,670][1653645] Updated weights for policy 0, policy_version 29520 (0.0112) [2024-06-15 11:54:20,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 60555264. Throughput: 0: 11093.3. Samples: 15198720. 
Policy #0 lag: (min: 15.0, avg: 95.1, max: 271.0) [2024-06-15 11:54:20,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 11:54:21,762][1653645] Updated weights for policy 0, policy_version 29603 (0.0013) [2024-06-15 11:54:23,452][1653645] Updated weights for policy 0, policy_version 29653 (0.0013) [2024-06-15 11:54:25,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 60817408. Throughput: 0: 11070.5. Samples: 15260672. Policy #0 lag: (min: 15.0, avg: 95.1, max: 271.0) [2024-06-15 11:54:25,959][1648982] Avg episode reward: [(0, '37.000')] [2024-06-15 11:54:26,357][1653645] Updated weights for policy 0, policy_version 29698 (0.0020) [2024-06-15 11:54:27,650][1653645] Updated weights for policy 0, policy_version 29754 (0.0012) [2024-06-15 11:54:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44217.2). Total num frames: 60981248. Throughput: 0: 11320.9. Samples: 15340032. Policy #0 lag: (min: 50.0, avg: 167.1, max: 306.0) [2024-06-15 11:54:30,960][1648982] Avg episode reward: [(0, '36.910')] [2024-06-15 11:54:32,075][1653645] Updated weights for policy 0, policy_version 29824 (0.0107) [2024-06-15 11:54:33,368][1653645] Updated weights for policy 0, policy_version 29883 (0.0058) [2024-06-15 11:54:35,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 61308928. Throughput: 0: 11309.5. Samples: 15361024. Policy #0 lag: (min: 50.0, avg: 167.1, max: 306.0) [2024-06-15 11:54:35,959][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 11:54:36,165][1653645] Updated weights for policy 0, policy_version 29951 (0.0012) [2024-06-15 11:54:39,988][1653645] Updated weights for policy 0, policy_version 30009 (0.0046) [2024-06-15 11:54:40,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 61472768. Throughput: 0: 11161.6. Samples: 15429120. 
Policy #0 lag: (min: 50.0, avg: 167.1, max: 306.0) [2024-06-15 11:54:40,958][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 11:54:43,591][1653645] Updated weights for policy 0, policy_version 30066 (0.0141) [2024-06-15 11:54:44,555][1651596] Signal inference workers to stop experience collection... (1600 times) [2024-06-15 11:54:44,630][1653645] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-15 11:54:44,665][1651596] Signal inference workers to resume experience collection... (1600 times) [2024-06-15 11:54:44,666][1653645] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-15 11:54:44,836][1653645] Updated weights for policy 0, policy_version 30138 (0.0013) [2024-06-15 11:54:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 61734912. Throughput: 0: 11377.8. Samples: 15497728. Policy #0 lag: (min: 50.0, avg: 167.1, max: 306.0) [2024-06-15 11:54:45,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 11:54:47,601][1653645] Updated weights for policy 0, policy_version 30203 (0.0013) [2024-06-15 11:54:50,958][1648982] Fps is (10 sec: 49150.4, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 61964288. Throughput: 0: 11138.8. Samples: 15530496. Policy #0 lag: (min: 47.0, avg: 185.3, max: 303.0) [2024-06-15 11:54:50,960][1648982] Avg episode reward: [(0, '36.940')] [2024-06-15 11:54:51,174][1653645] Updated weights for policy 0, policy_version 30267 (0.0020) [2024-06-15 11:54:54,892][1653645] Updated weights for policy 0, policy_version 30320 (0.0019) [2024-06-15 11:54:55,961][1648982] Fps is (10 sec: 45862.8, 60 sec: 46965.2, 300 sec: 44764.0). Total num frames: 62193664. Throughput: 0: 11479.4. Samples: 15609344. 
Policy #0 lag: (min: 47.0, avg: 185.3, max: 303.0) [2024-06-15 11:54:55,962][1648982] Avg episode reward: [(0, '37.010')] [2024-06-15 11:54:56,419][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000030384_62226432.pth... [2024-06-15 11:54:56,462][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000025152_51511296.pth [2024-06-15 11:54:56,697][1653645] Updated weights for policy 0, policy_version 30393 (0.0179) [2024-06-15 11:54:58,524][1653645] Updated weights for policy 0, policy_version 30457 (0.0016) [2024-06-15 11:55:00,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 62390272. Throughput: 0: 11320.9. Samples: 15672832. Policy #0 lag: (min: 47.0, avg: 185.3, max: 303.0) [2024-06-15 11:55:00,958][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 11:55:02,570][1653645] Updated weights for policy 0, policy_version 30517 (0.0105) [2024-06-15 11:55:05,958][1648982] Fps is (10 sec: 36054.2, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 62554112. Throughput: 0: 11195.7. Samples: 15702528. Policy #0 lag: (min: 47.0, avg: 185.3, max: 303.0) [2024-06-15 11:55:05,959][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 11:55:06,083][1653645] Updated weights for policy 0, policy_version 30560 (0.0015) [2024-06-15 11:55:08,138][1653645] Updated weights for policy 0, policy_version 30645 (0.0013) [2024-06-15 11:55:08,905][1653645] Updated weights for policy 0, policy_version 30658 (0.0012) [2024-06-15 11:55:10,490][1653645] Updated weights for policy 0, policy_version 30711 (0.0199) [2024-06-15 11:55:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.1, 300 sec: 44320.1). Total num frames: 62914560. Throughput: 0: 11400.6. Samples: 15773696. 
Policy #0 lag: (min: 86.0, avg: 163.8, max: 342.0) [2024-06-15 11:55:10,958][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 11:55:13,842][1653645] Updated weights for policy 0, policy_version 30755 (0.0014) [2024-06-15 11:55:15,958][1648982] Fps is (10 sec: 49153.2, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 63045632. Throughput: 0: 11047.9. Samples: 15837184. Policy #0 lag: (min: 86.0, avg: 163.8, max: 342.0) [2024-06-15 11:55:15,958][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 11:55:17,479][1653645] Updated weights for policy 0, policy_version 30801 (0.0012) [2024-06-15 11:55:19,572][1653645] Updated weights for policy 0, policy_version 30880 (0.0163) [2024-06-15 11:55:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 46421.4, 300 sec: 44320.1). Total num frames: 63340544. Throughput: 0: 11457.5. Samples: 15876608. Policy #0 lag: (min: 86.0, avg: 163.8, max: 342.0) [2024-06-15 11:55:20,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:55:21,511][1653645] Updated weights for policy 0, policy_version 30960 (0.0016) [2024-06-15 11:55:25,966][1648982] Fps is (10 sec: 45837.0, 60 sec: 44776.9, 300 sec: 44207.8). Total num frames: 63504384. Throughput: 0: 11261.9. Samples: 15936000. Policy #0 lag: (min: 86.0, avg: 163.8, max: 342.0) [2024-06-15 11:55:25,966][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 11:55:26,027][1653645] Updated weights for policy 0, policy_version 31014 (0.0012) [2024-06-15 11:55:29,504][1653645] Updated weights for policy 0, policy_version 31060 (0.0014) [2024-06-15 11:55:30,671][1651596] Signal inference workers to stop experience collection... (1650 times) [2024-06-15 11:55:30,755][1653645] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-15 11:55:30,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 63700992. Throughput: 0: 11411.9. Samples: 16011264. 
Policy #0 lag: (min: 11.0, avg: 83.3, max: 267.0) [2024-06-15 11:55:30,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:55:31,030][1651596] Signal inference workers to resume experience collection... (1650 times) [2024-06-15 11:55:31,031][1653645] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-15 11:55:31,492][1653645] Updated weights for policy 0, policy_version 31136 (0.0020) [2024-06-15 11:55:32,676][1653645] Updated weights for policy 0, policy_version 31184 (0.0012) [2024-06-15 11:55:33,856][1653645] Updated weights for policy 0, policy_version 31232 (0.0010) [2024-06-15 11:55:35,958][1648982] Fps is (10 sec: 45913.5, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 63963136. Throughput: 0: 11332.3. Samples: 16040448. Policy #0 lag: (min: 11.0, avg: 83.3, max: 267.0) [2024-06-15 11:55:35,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 11:55:37,507][1653645] Updated weights for policy 0, policy_version 31291 (0.0013) [2024-06-15 11:55:40,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 64126976. Throughput: 0: 11196.5. Samples: 16113152. Policy #0 lag: (min: 11.0, avg: 83.3, max: 267.0) [2024-06-15 11:55:40,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 11:55:41,551][1653645] Updated weights for policy 0, policy_version 31344 (0.0098) [2024-06-15 11:55:43,276][1653645] Updated weights for policy 0, policy_version 31415 (0.0014) [2024-06-15 11:55:44,675][1653645] Updated weights for policy 0, policy_version 31456 (0.0012) [2024-06-15 11:55:45,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.4, 300 sec: 44098.0). Total num frames: 64487424. Throughput: 0: 11127.5. Samples: 16173568. 
Policy #0 lag: (min: 11.0, avg: 83.3, max: 267.0) [2024-06-15 11:55:45,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 11:55:48,736][1653645] Updated weights for policy 0, policy_version 31508 (0.0011) [2024-06-15 11:55:50,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44237.0, 300 sec: 44431.2). Total num frames: 64618496. Throughput: 0: 11332.3. Samples: 16212480. Policy #0 lag: (min: 11.0, avg: 83.3, max: 267.0) [2024-06-15 11:55:50,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 11:55:52,287][1653645] Updated weights for policy 0, policy_version 31553 (0.0013) [2024-06-15 11:55:54,064][1653645] Updated weights for policy 0, policy_version 31616 (0.0142) [2024-06-15 11:55:55,579][1653645] Updated weights for policy 0, policy_version 31677 (0.0012) [2024-06-15 11:55:55,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 44784.9, 300 sec: 44431.2). Total num frames: 64880640. Throughput: 0: 11127.4. Samples: 16274432. Policy #0 lag: (min: 15.0, avg: 101.0, max: 271.0) [2024-06-15 11:55:55,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 11:55:57,194][1653645] Updated weights for policy 0, policy_version 31744 (0.0018) [2024-06-15 11:56:00,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 65044480. Throughput: 0: 11252.6. Samples: 16343552. Policy #0 lag: (min: 15.0, avg: 101.0, max: 271.0) [2024-06-15 11:56:00,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 11:56:01,591][1653645] Updated weights for policy 0, policy_version 31805 (0.0013) [2024-06-15 11:56:05,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 45329.3, 300 sec: 44431.2). Total num frames: 65273856. Throughput: 0: 11070.6. Samples: 16374784. 
Policy #0 lag: (min: 15.0, avg: 101.0, max: 271.0) [2024-06-15 11:56:05,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 11:56:06,121][1653645] Updated weights for policy 0, policy_version 31873 (0.0013) [2024-06-15 11:56:08,293][1653645] Updated weights for policy 0, policy_version 31952 (0.0014) [2024-06-15 11:56:09,405][1653645] Updated weights for policy 0, policy_version 31996 (0.0014) [2024-06-15 11:56:10,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 65536000. Throughput: 0: 11095.4. Samples: 16435200. Policy #0 lag: (min: 15.0, avg: 101.0, max: 271.0) [2024-06-15 11:56:10,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 11:56:12,956][1653645] Updated weights for policy 0, policy_version 32036 (0.0013) [2024-06-15 11:56:15,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 65667072. Throughput: 0: 11025.1. Samples: 16507392. Policy #0 lag: (min: 50.0, avg: 161.4, max: 306.0) [2024-06-15 11:56:15,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 11:56:16,729][1651596] Signal inference workers to stop experience collection... (1700 times) [2024-06-15 11:56:16,789][1653645] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-15 11:56:16,974][1651596] Signal inference workers to resume experience collection... (1700 times) [2024-06-15 11:56:16,975][1653645] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-15 11:56:16,977][1653645] Updated weights for policy 0, policy_version 32096 (0.0013) [2024-06-15 11:56:18,288][1653645] Updated weights for policy 0, policy_version 32146 (0.0023) [2024-06-15 11:56:19,172][1653645] Updated weights for policy 0, policy_version 32192 (0.0015) [2024-06-15 11:56:20,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 66027520. Throughput: 0: 11070.6. Samples: 16538624. 
Policy #0 lag: (min: 50.0, avg: 161.4, max: 306.0) [2024-06-15 11:56:20,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 11:56:21,014][1653645] Updated weights for policy 0, policy_version 32249 (0.0028) [2024-06-15 11:56:25,241][1653645] Updated weights for policy 0, policy_version 32308 (0.0030) [2024-06-15 11:56:25,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44789.0, 300 sec: 44431.2). Total num frames: 66191360. Throughput: 0: 10968.1. Samples: 16606720. Policy #0 lag: (min: 50.0, avg: 161.4, max: 306.0) [2024-06-15 11:56:25,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 11:56:29,024][1653645] Updated weights for policy 0, policy_version 32354 (0.0011) [2024-06-15 11:56:30,824][1653645] Updated weights for policy 0, policy_version 32432 (0.0015) [2024-06-15 11:56:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 66420736. Throughput: 0: 11002.3. Samples: 16668672. Policy #0 lag: (min: 50.0, avg: 161.4, max: 306.0) [2024-06-15 11:56:30,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 11:56:32,982][1653645] Updated weights for policy 0, policy_version 32500 (0.0013) [2024-06-15 11:56:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 66584576. Throughput: 0: 10797.5. Samples: 16698368. Policy #0 lag: (min: 50.0, avg: 161.4, max: 306.0) [2024-06-15 11:56:35,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 11:56:35,960][1651596] Saving new best policy, reward=37.310! [2024-06-15 11:56:36,797][1653645] Updated weights for policy 0, policy_version 32532 (0.0014) [2024-06-15 11:56:40,152][1653645] Updated weights for policy 0, policy_version 32592 (0.0012) [2024-06-15 11:56:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 66781184. Throughput: 0: 11002.3. Samples: 16769536. 
Policy #0 lag: (min: 13.0, avg: 124.8, max: 269.0) [2024-06-15 11:56:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 11:56:41,934][1653645] Updated weights for policy 0, policy_version 32656 (0.0103) [2024-06-15 11:56:42,961][1653645] Updated weights for policy 0, policy_version 32704 (0.0015) [2024-06-15 11:56:44,721][1653645] Updated weights for policy 0, policy_version 32755 (0.0014) [2024-06-15 11:56:45,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 67108864. Throughput: 0: 10934.0. Samples: 16835584. Policy #0 lag: (min: 13.0, avg: 124.8, max: 269.0) [2024-06-15 11:56:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 11:56:48,228][1653645] Updated weights for policy 0, policy_version 32800 (0.0017) [2024-06-15 11:56:50,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 67239936. Throughput: 0: 11013.7. Samples: 16870400. Policy #0 lag: (min: 13.0, avg: 124.8, max: 269.0) [2024-06-15 11:56:50,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 11:56:51,562][1653645] Updated weights for policy 0, policy_version 32848 (0.0026) [2024-06-15 11:56:52,684][1653645] Updated weights for policy 0, policy_version 32893 (0.0012) [2024-06-15 11:56:54,068][1653645] Updated weights for policy 0, policy_version 32952 (0.0013) [2024-06-15 11:56:55,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 67567616. Throughput: 0: 11127.4. Samples: 16935936. Policy #0 lag: (min: 13.0, avg: 124.8, max: 269.0) [2024-06-15 11:56:55,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 11:56:56,296][1653645] Updated weights for policy 0, policy_version 33008 (0.0013) [2024-06-15 11:56:56,303][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000033008_67600384.pth... 
[2024-06-15 11:56:56,370][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000027744_56819712.pth [2024-06-15 11:56:59,697][1653645] Updated weights for policy 0, policy_version 33056 (0.0011) [2024-06-15 11:57:00,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 67764224. Throughput: 0: 11275.4. Samples: 17014784. Policy #0 lag: (min: 15.0, avg: 124.5, max: 271.0) [2024-06-15 11:57:00,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 11:57:01,678][1651596] Signal inference workers to stop experience collection... (1750 times) [2024-06-15 11:57:01,719][1653645] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-15 11:57:01,775][1653645] Updated weights for policy 0, policy_version 33092 (0.0013) [2024-06-15 11:57:01,886][1651596] Signal inference workers to resume experience collection... (1750 times) [2024-06-15 11:57:01,887][1653645] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-15 11:57:03,936][1653645] Updated weights for policy 0, policy_version 33168 (0.0013) [2024-06-15 11:57:04,855][1653645] Updated weights for policy 0, policy_version 33213 (0.0015) [2024-06-15 11:57:05,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 45875.1, 300 sec: 44431.3). Total num frames: 68026368. Throughput: 0: 11309.5. Samples: 17047552. Policy #0 lag: (min: 15.0, avg: 124.5, max: 271.0) [2024-06-15 11:57:05,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 11:57:07,666][1653645] Updated weights for policy 0, policy_version 33264 (0.0013) [2024-06-15 11:57:10,517][1653645] Updated weights for policy 0, policy_version 33298 (0.0028) [2024-06-15 11:57:10,958][1648982] Fps is (10 sec: 45873.9, 60 sec: 44782.6, 300 sec: 44209.0). Total num frames: 68222976. Throughput: 0: 11366.4. Samples: 17118208. 
Policy #0 lag: (min: 15.0, avg: 124.5, max: 271.0) [2024-06-15 11:57:10,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 11:57:11,376][1653645] Updated weights for policy 0, policy_version 33344 (0.0011) [2024-06-15 11:57:13,928][1653645] Updated weights for policy 0, policy_version 33400 (0.0080) [2024-06-15 11:57:15,482][1653645] Updated weights for policy 0, policy_version 33468 (0.0011) [2024-06-15 11:57:15,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 48059.8, 300 sec: 44764.5). Total num frames: 68550656. Throughput: 0: 11491.6. Samples: 17185792. Policy #0 lag: (min: 15.0, avg: 124.5, max: 271.0) [2024-06-15 11:57:15,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 11:57:19,109][1653645] Updated weights for policy 0, policy_version 33520 (0.0012) [2024-06-15 11:57:20,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 68681728. Throughput: 0: 11605.3. Samples: 17220608. Policy #0 lag: (min: 15.0, avg: 124.5, max: 271.0) [2024-06-15 11:57:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 11:57:20,960][1651596] Saving new best policy, reward=37.360! [2024-06-15 11:57:21,992][1653645] Updated weights for policy 0, policy_version 33568 (0.0013) [2024-06-15 11:57:25,121][1653645] Updated weights for policy 0, policy_version 33635 (0.0012) [2024-06-15 11:57:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 68943872. Throughput: 0: 11741.9. Samples: 17297920. Policy #0 lag: (min: 15.0, avg: 116.6, max: 271.0) [2024-06-15 11:57:25,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 11:57:26,284][1653645] Updated weights for policy 0, policy_version 33680 (0.0014) [2024-06-15 11:57:27,393][1653645] Updated weights for policy 0, policy_version 33725 (0.0014) [2024-06-15 11:57:30,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 45328.8, 300 sec: 44209.0). Total num frames: 69140480. Throughput: 0: 11582.5. Samples: 17356800. 
Policy #0 lag: (min: 15.0, avg: 116.6, max: 271.0) [2024-06-15 11:57:30,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 11:57:31,632][1653645] Updated weights for policy 0, policy_version 33792 (0.0012) [2024-06-15 11:57:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 69337088. Throughput: 0: 11594.0. Samples: 17392128. Policy #0 lag: (min: 15.0, avg: 116.6, max: 271.0) [2024-06-15 11:57:35,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:57:36,338][1653645] Updated weights for policy 0, policy_version 33872 (0.0015) [2024-06-15 11:57:38,267][1653645] Updated weights for policy 0, policy_version 33952 (0.0188) [2024-06-15 11:57:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 46967.1, 300 sec: 44653.3). Total num frames: 69599232. Throughput: 0: 11582.5. Samples: 17457152. Policy #0 lag: (min: 15.0, avg: 116.6, max: 271.0) [2024-06-15 11:57:40,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 11:57:42,374][1653645] Updated weights for policy 0, policy_version 34006 (0.0013) [2024-06-15 11:57:45,570][1653645] Updated weights for policy 0, policy_version 34065 (0.0015) [2024-06-15 11:57:45,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 69795840. Throughput: 0: 11389.2. Samples: 17527296. Policy #0 lag: (min: 15.0, avg: 118.9, max: 271.0) [2024-06-15 11:57:45,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 11:57:46,505][1653645] Updated weights for policy 0, policy_version 34110 (0.0011) [2024-06-15 11:57:48,667][1651596] Signal inference workers to stop experience collection... (1800 times) [2024-06-15 11:57:48,691][1653645] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-15 11:57:48,910][1651596] Signal inference workers to resume experience collection... 
(1800 times) [2024-06-15 11:57:48,911][1653645] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-15 11:57:49,885][1653645] Updated weights for policy 0, policy_version 34177 (0.0133) [2024-06-15 11:57:50,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 47513.5, 300 sec: 45208.7). Total num frames: 70090752. Throughput: 0: 11502.9. Samples: 17565184. Policy #0 lag: (min: 15.0, avg: 118.9, max: 271.0) [2024-06-15 11:57:50,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 11:57:51,222][1653645] Updated weights for policy 0, policy_version 34236 (0.0013) [2024-06-15 11:57:54,988][1653645] Updated weights for policy 0, policy_version 34299 (0.0018) [2024-06-15 11:57:55,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 70254592. Throughput: 0: 11195.8. Samples: 17622016. Policy #0 lag: (min: 15.0, avg: 118.9, max: 271.0) [2024-06-15 11:57:55,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 11:57:57,896][1653645] Updated weights for policy 0, policy_version 34352 (0.0033) [2024-06-15 11:58:00,939][1653645] Updated weights for policy 0, policy_version 34400 (0.0011) [2024-06-15 11:58:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 70451200. Throughput: 0: 11343.6. Samples: 17696256. Policy #0 lag: (min: 15.0, avg: 118.9, max: 271.0) [2024-06-15 11:58:00,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 11:58:02,842][1653645] Updated weights for policy 0, policy_version 34465 (0.0138) [2024-06-15 11:58:03,449][1653645] Updated weights for policy 0, policy_version 34495 (0.0012) [2024-06-15 11:58:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 70680576. Throughput: 0: 11127.5. Samples: 17721344. 
Policy #0 lag: (min: 15.0, avg: 118.9, max: 271.0) [2024-06-15 11:58:05,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:58:09,073][1653645] Updated weights for policy 0, policy_version 34576 (0.0054) [2024-06-15 11:58:10,118][1653645] Updated weights for policy 0, policy_version 34624 (0.0013) [2024-06-15 11:58:10,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 70909952. Throughput: 0: 10831.6. Samples: 17785344. Policy #0 lag: (min: 15.0, avg: 129.4, max: 271.0) [2024-06-15 11:58:10,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 11:58:14,302][1653645] Updated weights for policy 0, policy_version 34704 (0.0255) [2024-06-15 11:58:15,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 43690.6, 300 sec: 45208.7). Total num frames: 71172096. Throughput: 0: 10854.5. Samples: 17845248. Policy #0 lag: (min: 15.0, avg: 129.4, max: 271.0) [2024-06-15 11:58:15,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 11:58:17,936][1653645] Updated weights for policy 0, policy_version 34755 (0.0021) [2024-06-15 11:58:20,724][1653645] Updated weights for policy 0, policy_version 34818 (0.0028) [2024-06-15 11:58:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44237.0, 300 sec: 44542.3). Total num frames: 71335936. Throughput: 0: 10865.8. Samples: 17881088. Policy #0 lag: (min: 15.0, avg: 129.4, max: 271.0) [2024-06-15 11:58:20,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 11:58:24,983][1653645] Updated weights for policy 0, policy_version 34912 (0.0014) [2024-06-15 11:58:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 71565312. Throughput: 0: 11002.4. Samples: 17952256. 
Policy #0 lag: (min: 15.0, avg: 129.4, max: 271.0) [2024-06-15 11:58:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 11:58:26,757][1653645] Updated weights for policy 0, policy_version 34962 (0.0011) [2024-06-15 11:58:30,144][1653645] Updated weights for policy 0, policy_version 35014 (0.0093) [2024-06-15 11:58:30,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 44237.0, 300 sec: 44653.3). Total num frames: 71794688. Throughput: 0: 10774.7. Samples: 18012160. Policy #0 lag: (min: 15.0, avg: 129.4, max: 271.0) [2024-06-15 11:58:30,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 11:58:30,965][1653645] Updated weights for policy 0, policy_version 35062 (0.0018) [2024-06-15 11:58:34,156][1653645] Updated weights for policy 0, policy_version 35120 (0.0013) [2024-06-15 11:58:35,909][1651596] Signal inference workers to stop experience collection... (1850 times) [2024-06-15 11:58:35,946][1653645] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-15 11:58:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 71958528. Throughput: 0: 10797.5. Samples: 18051072. Policy #0 lag: (min: 2.0, avg: 108.6, max: 258.0) [2024-06-15 11:58:35,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 11:58:36,130][1651596] Signal inference workers to resume experience collection... (1850 times) [2024-06-15 11:58:36,132][1653645] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-15 11:58:37,190][1653645] Updated weights for policy 0, policy_version 35190 (0.0013) [2024-06-15 11:58:38,669][1653645] Updated weights for policy 0, policy_version 35233 (0.0011) [2024-06-15 11:58:40,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 43691.0, 300 sec: 44875.5). Total num frames: 72220672. Throughput: 0: 10843.0. Samples: 18109952. 
Policy #0 lag: (min: 2.0, avg: 108.6, max: 258.0) [2024-06-15 11:58:40,961][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 11:58:42,950][1653645] Updated weights for policy 0, policy_version 35325 (0.0014) [2024-06-15 11:58:45,641][1653645] Updated weights for policy 0, policy_version 35360 (0.0019) [2024-06-15 11:58:45,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 72417280. Throughput: 0: 10899.9. Samples: 18186752. Policy #0 lag: (min: 2.0, avg: 108.6, max: 258.0) [2024-06-15 11:58:45,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:58:47,644][1653645] Updated weights for policy 0, policy_version 35408 (0.0013) [2024-06-15 11:58:49,773][1653645] Updated weights for policy 0, policy_version 35488 (0.0015) [2024-06-15 11:58:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44237.0, 300 sec: 45319.8). Total num frames: 72744960. Throughput: 0: 11059.2. Samples: 18219008. Policy #0 lag: (min: 2.0, avg: 108.6, max: 258.0) [2024-06-15 11:58:50,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 11:58:55,049][1653645] Updated weights for policy 0, policy_version 35557 (0.0011) [2024-06-15 11:58:55,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 72876032. Throughput: 0: 10945.4. Samples: 18277888. Policy #0 lag: (min: 2.0, avg: 108.6, max: 258.0) [2024-06-15 11:58:55,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 11:58:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000035584_72876032.pth... 
[2024-06-15 11:58:56,013][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000030384_62226432.pth [2024-06-15 11:58:57,970][1653645] Updated weights for policy 0, policy_version 35600 (0.0020) [2024-06-15 11:58:59,907][1653645] Updated weights for policy 0, policy_version 35668 (0.0012) [2024-06-15 11:59:00,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 73138176. Throughput: 0: 11025.0. Samples: 18341376. Policy #0 lag: (min: 15.0, avg: 105.3, max: 271.0) [2024-06-15 11:59:00,959][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 11:59:01,779][1653645] Updated weights for policy 0, policy_version 35744 (0.0090) [2024-06-15 11:59:05,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.4, 300 sec: 44431.1). Total num frames: 73269248. Throughput: 0: 10922.6. Samples: 18372608. Policy #0 lag: (min: 15.0, avg: 105.3, max: 271.0) [2024-06-15 11:59:05,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 11:59:07,605][1653645] Updated weights for policy 0, policy_version 35837 (0.0013) [2024-06-15 11:59:10,529][1653645] Updated weights for policy 0, policy_version 35873 (0.0015) [2024-06-15 11:59:10,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 43144.6, 300 sec: 44542.3). Total num frames: 73498624. Throughput: 0: 10922.7. Samples: 18443776. Policy #0 lag: (min: 15.0, avg: 105.3, max: 271.0) [2024-06-15 11:59:10,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 11:59:12,454][1653645] Updated weights for policy 0, policy_version 35952 (0.0026) [2024-06-15 11:59:14,204][1653645] Updated weights for policy 0, policy_version 36016 (0.0012) [2024-06-15 11:59:15,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 73793536. Throughput: 0: 10945.4. Samples: 18504704. 
Policy #0 lag: (min: 15.0, avg: 105.3, max: 271.0) [2024-06-15 11:59:15,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 11:59:19,446][1653645] Updated weights for policy 0, policy_version 36068 (0.0014) [2024-06-15 11:59:20,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 73924608. Throughput: 0: 10945.4. Samples: 18543616. Policy #0 lag: (min: 15.0, avg: 105.3, max: 271.0) [2024-06-15 11:59:20,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 11:59:21,978][1653645] Updated weights for policy 0, policy_version 36114 (0.0015) [2024-06-15 11:59:22,315][1651596] Signal inference workers to stop experience collection... (1900 times) [2024-06-15 11:59:22,380][1653645] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-15 11:59:22,500][1651596] Signal inference workers to resume experience collection... (1900 times) [2024-06-15 11:59:22,514][1653645] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-15 11:59:24,555][1653645] Updated weights for policy 0, policy_version 36193 (0.0012) [2024-06-15 11:59:25,252][1653645] Updated weights for policy 0, policy_version 36225 (0.0094) [2024-06-15 11:59:25,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 74252288. Throughput: 0: 10968.1. Samples: 18603520. Policy #0 lag: (min: 15.0, avg: 109.4, max: 271.0) [2024-06-15 11:59:25,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 11:59:26,275][1653645] Updated weights for policy 0, policy_version 36288 (0.0012) [2024-06-15 11:59:30,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 74416128. Throughput: 0: 10877.2. Samples: 18676224. 
Policy #0 lag: (min: 15.0, avg: 109.4, max: 271.0) [2024-06-15 11:59:30,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 11:59:31,079][1653645] Updated weights for policy 0, policy_version 36347 (0.0011) [2024-06-15 11:59:33,631][1653645] Updated weights for policy 0, policy_version 36387 (0.0012) [2024-06-15 11:59:35,352][1653645] Updated weights for policy 0, policy_version 36438 (0.0014) [2024-06-15 11:59:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 74678272. Throughput: 0: 10968.1. Samples: 18712576. Policy #0 lag: (min: 15.0, avg: 109.4, max: 271.0) [2024-06-15 11:59:35,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 11:59:37,231][1653645] Updated weights for policy 0, policy_version 36496 (0.0012) [2024-06-15 11:59:38,455][1653645] Updated weights for policy 0, policy_version 36544 (0.0011) [2024-06-15 11:59:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 74842112. Throughput: 0: 11093.4. Samples: 18777088. Policy #0 lag: (min: 15.0, avg: 109.4, max: 271.0) [2024-06-15 11:59:40,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:59:42,588][1653645] Updated weights for policy 0, policy_version 36598 (0.0013) [2024-06-15 11:59:45,318][1653645] Updated weights for policy 0, policy_version 36625 (0.0017) [2024-06-15 11:59:45,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 75071488. Throughput: 0: 11229.9. Samples: 18846720. 
Policy #0 lag: (min: 27.0, avg: 121.1, max: 283.0) [2024-06-15 11:59:45,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 11:59:46,188][1653645] Updated weights for policy 0, policy_version 36672 (0.0106) [2024-06-15 11:59:47,377][1653645] Updated weights for policy 0, policy_version 36721 (0.0013) [2024-06-15 11:59:49,104][1653645] Updated weights for policy 0, policy_version 36768 (0.0044) [2024-06-15 11:59:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 44653.8). Total num frames: 75366400. Throughput: 0: 11252.7. Samples: 18878976. Policy #0 lag: (min: 27.0, avg: 121.1, max: 283.0) [2024-06-15 11:59:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 11:59:53,836][1653645] Updated weights for policy 0, policy_version 36835 (0.0012) [2024-06-15 11:59:55,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 75497472. Throughput: 0: 11161.5. Samples: 18946048. Policy #0 lag: (min: 27.0, avg: 121.1, max: 283.0) [2024-06-15 11:59:55,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 11:59:56,946][1653645] Updated weights for policy 0, policy_version 36880 (0.0011) [2024-06-15 11:59:58,279][1653645] Updated weights for policy 0, policy_version 36944 (0.0056) [2024-06-15 12:00:00,736][1653645] Updated weights for policy 0, policy_version 37024 (0.0038) [2024-06-15 12:00:00,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 75825152. Throughput: 0: 11252.6. Samples: 19011072. Policy #0 lag: (min: 27.0, avg: 121.1, max: 283.0) [2024-06-15 12:00:00,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:00:04,397][1653645] Updated weights for policy 0, policy_version 37072 (0.0026) [2024-06-15 12:00:05,562][1653645] Updated weights for policy 0, policy_version 37120 (0.0042) [2024-06-15 12:00:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 76021760. Throughput: 0: 11229.8. 
Samples: 19048960. Policy #0 lag: (min: 27.0, avg: 121.1, max: 283.0) [2024-06-15 12:00:05,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:00:08,869][1651596] Signal inference workers to stop experience collection... (1950 times) [2024-06-15 12:00:08,993][1653645] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-15 12:00:09,105][1651596] Signal inference workers to resume experience collection... (1950 times) [2024-06-15 12:00:09,106][1653645] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-15 12:00:09,991][1653645] Updated weights for policy 0, policy_version 37184 (0.0011) [2024-06-15 12:00:10,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 76185600. Throughput: 0: 11446.1. Samples: 19118592. Policy #0 lag: (min: 47.0, avg: 124.5, max: 303.0) [2024-06-15 12:00:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:00:12,092][1653645] Updated weights for policy 0, policy_version 37253 (0.0012) [2024-06-15 12:00:15,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.4, 300 sec: 44320.0). Total num frames: 76414976. Throughput: 0: 11218.4. Samples: 19181056. Policy #0 lag: (min: 47.0, avg: 124.5, max: 303.0) [2024-06-15 12:00:15,959][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:00:16,297][1653645] Updated weights for policy 0, policy_version 37318 (0.0092) [2024-06-15 12:00:20,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 44210.3). Total num frames: 76546048. Throughput: 0: 11070.6. Samples: 19210752. 
Policy #0 lag: (min: 47.0, avg: 124.5, max: 303.0) [2024-06-15 12:00:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:00:21,039][1653645] Updated weights for policy 0, policy_version 37384 (0.0014) [2024-06-15 12:00:22,960][1653645] Updated weights for policy 0, policy_version 37456 (0.0118) [2024-06-15 12:00:24,913][1653645] Updated weights for policy 0, policy_version 37536 (0.0020) [2024-06-15 12:00:25,699][1653645] Updated weights for policy 0, policy_version 37567 (0.0012) [2024-06-15 12:00:25,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 76939264. Throughput: 0: 10968.1. Samples: 19270656. Policy #0 lag: (min: 47.0, avg: 124.5, max: 303.0) [2024-06-15 12:00:25,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:00:29,827][1653645] Updated weights for policy 0, policy_version 37616 (0.0042) [2024-06-15 12:00:30,960][1648982] Fps is (10 sec: 52418.3, 60 sec: 44235.4, 300 sec: 44430.9). Total num frames: 77070336. Throughput: 0: 10922.2. Samples: 19338240. Policy #0 lag: (min: 47.0, avg: 124.5, max: 303.0) [2024-06-15 12:00:30,960][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 12:00:35,672][1653645] Updated weights for policy 0, policy_version 37728 (0.0016) [2024-06-15 12:00:35,960][1648982] Fps is (10 sec: 32768.5, 60 sec: 43144.6, 300 sec: 44542.2). Total num frames: 77266944. Throughput: 0: 11104.7. Samples: 19378688. Policy #0 lag: (min: 0.0, avg: 72.6, max: 256.0) [2024-06-15 12:00:35,961][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:00:37,847][1653645] Updated weights for policy 0, policy_version 37817 (0.0012) [2024-06-15 12:00:40,958][1648982] Fps is (10 sec: 39329.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 77463552. Throughput: 0: 10797.6. Samples: 19431936. 
Policy #0 lag: (min: 0.0, avg: 72.6, max: 256.0) [2024-06-15 12:00:40,958][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 12:00:41,855][1653645] Updated weights for policy 0, policy_version 37872 (0.0037) [2024-06-15 12:00:45,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 44097.9). Total num frames: 77627392. Throughput: 0: 11036.4. Samples: 19507712. Policy #0 lag: (min: 0.0, avg: 72.6, max: 256.0) [2024-06-15 12:00:45,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:00:46,236][1653645] Updated weights for policy 0, policy_version 37920 (0.0014) [2024-06-15 12:00:48,389][1653645] Updated weights for policy 0, policy_version 38000 (0.0014) [2024-06-15 12:00:49,253][1651596] Signal inference workers to stop experience collection... (2000 times) [2024-06-15 12:00:49,313][1653645] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-15 12:00:49,492][1651596] Signal inference workers to resume experience collection... (2000 times) [2024-06-15 12:00:49,493][1653645] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-15 12:00:49,929][1653645] Updated weights for policy 0, policy_version 38068 (0.0089) [2024-06-15 12:00:50,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 77987840. Throughput: 0: 10672.4. Samples: 19529216. Policy #0 lag: (min: 0.0, avg: 72.6, max: 256.0) [2024-06-15 12:00:50,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:00:53,280][1653645] Updated weights for policy 0, policy_version 38128 (0.0119) [2024-06-15 12:00:55,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 78118912. Throughput: 0: 10672.3. Samples: 19598848. 
Policy #0 lag: (min: 0.0, avg: 72.6, max: 256.0) [2024-06-15 12:00:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:00:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000038144_78118912.pth... [2024-06-15 12:00:56,055][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000033008_67600384.pth [2024-06-15 12:00:56,060][1651596] Saving new best policy, reward=37.400! [2024-06-15 12:00:57,685][1653645] Updated weights for policy 0, policy_version 38162 (0.0018) [2024-06-15 12:00:59,758][1653645] Updated weights for policy 0, policy_version 38256 (0.0013) [2024-06-15 12:01:00,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 43690.8, 300 sec: 44653.3). Total num frames: 78446592. Throughput: 0: 10820.4. Samples: 19667968. Policy #0 lag: (min: 40.0, avg: 107.8, max: 296.0) [2024-06-15 12:01:00,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:01:01,827][1653645] Updated weights for policy 0, policy_version 38336 (0.0015) [2024-06-15 12:01:05,411][1653645] Updated weights for policy 0, policy_version 38395 (0.0012) [2024-06-15 12:01:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 78643200. Throughput: 0: 10854.4. Samples: 19699200. Policy #0 lag: (min: 40.0, avg: 107.8, max: 296.0) [2024-06-15 12:01:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:01:10,356][1653645] Updated weights for policy 0, policy_version 38451 (0.0014) [2024-06-15 12:01:10,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 43144.4, 300 sec: 44431.2). Total num frames: 78774272. Throughput: 0: 11127.5. Samples: 19771392. Policy #0 lag: (min: 40.0, avg: 107.8, max: 296.0) [2024-06-15 12:01:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:01:12,759][1653645] Updated weights for policy 0, policy_version 38544 (0.0012) [2024-06-15 12:01:15,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43691.0, 300 sec: 44098.0). 
Total num frames: 79036416. Throughput: 0: 10820.7. Samples: 19825152. Policy #0 lag: (min: 40.0, avg: 107.8, max: 296.0) [2024-06-15 12:01:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:01:17,428][1653645] Updated weights for policy 0, policy_version 38609 (0.0014) [2024-06-15 12:01:20,957][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 79167488. Throughput: 0: 10570.0. Samples: 19854336. Policy #0 lag: (min: 40.0, avg: 107.8, max: 296.0) [2024-06-15 12:01:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:01:22,294][1653645] Updated weights for policy 0, policy_version 38659 (0.0013) [2024-06-15 12:01:24,156][1653645] Updated weights for policy 0, policy_version 38736 (0.0019) [2024-06-15 12:01:25,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 42052.5, 300 sec: 44209.1). Total num frames: 79462400. Throughput: 0: 10831.7. Samples: 19919360. Policy #0 lag: (min: 31.0, avg: 94.9, max: 267.0) [2024-06-15 12:01:25,960][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:01:26,332][1653645] Updated weights for policy 0, policy_version 38818 (0.0012) [2024-06-15 12:01:30,309][1653645] Updated weights for policy 0, policy_version 38880 (0.0012) [2024-06-15 12:01:30,958][1648982] Fps is (10 sec: 52426.3, 60 sec: 43691.9, 300 sec: 44431.2). Total num frames: 79691776. Throughput: 0: 10410.6. Samples: 19976192. Policy #0 lag: (min: 31.0, avg: 94.9, max: 267.0) [2024-06-15 12:01:30,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:01:35,707][1651596] Signal inference workers to stop experience collection... (2050 times) [2024-06-15 12:01:35,741][1653645] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-15 12:01:35,743][1653645] Updated weights for policy 0, policy_version 38963 (0.0129) [2024-06-15 12:01:35,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 42052.3, 300 sec: 44098.0). Total num frames: 79790080. Throughput: 0: 10797.6. 
Samples: 20015104. Policy #0 lag: (min: 31.0, avg: 94.9, max: 267.0) [2024-06-15 12:01:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:01:35,987][1651596] Signal inference workers to resume experience collection... (2050 times) [2024-06-15 12:01:35,988][1653645] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-15 12:01:37,077][1653645] Updated weights for policy 0, policy_version 39024 (0.0015) [2024-06-15 12:01:38,772][1653645] Updated weights for policy 0, policy_version 39091 (0.0011) [2024-06-15 12:01:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 80084992. Throughput: 0: 10558.5. Samples: 20073984. Policy #0 lag: (min: 31.0, avg: 94.9, max: 267.0) [2024-06-15 12:01:40,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:01:42,572][1653645] Updated weights for policy 0, policy_version 39136 (0.0013) [2024-06-15 12:01:45,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 80216064. Throughput: 0: 10706.4. Samples: 20149760. Policy #0 lag: (min: 31.0, avg: 94.9, max: 267.0) [2024-06-15 12:01:45,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:01:46,363][1653645] Updated weights for policy 0, policy_version 39189 (0.0020) [2024-06-15 12:01:47,682][1653645] Updated weights for policy 0, policy_version 39236 (0.0017) [2024-06-15 12:01:48,895][1653645] Updated weights for policy 0, policy_version 39289 (0.0012) [2024-06-15 12:01:50,236][1653645] Updated weights for policy 0, policy_version 39330 (0.0011) [2024-06-15 12:01:50,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 80609280. Throughput: 0: 10638.2. Samples: 20177920. 
Policy #0 lag: (min: 47.0, avg: 154.4, max: 335.0) [2024-06-15 12:01:50,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:01:54,024][1653645] Updated weights for policy 0, policy_version 39393 (0.0014) [2024-06-15 12:01:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 80740352. Throughput: 0: 10524.4. Samples: 20244992. Policy #0 lag: (min: 47.0, avg: 154.4, max: 335.0) [2024-06-15 12:01:55,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:01:57,736][1653645] Updated weights for policy 0, policy_version 39456 (0.0012) [2024-06-15 12:01:59,279][1653645] Updated weights for policy 0, policy_version 39504 (0.0075) [2024-06-15 12:02:00,233][1653645] Updated weights for policy 0, policy_version 39545 (0.0020) [2024-06-15 12:02:00,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 81035264. Throughput: 0: 10945.5. Samples: 20317696. Policy #0 lag: (min: 47.0, avg: 154.4, max: 335.0) [2024-06-15 12:02:00,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:02:01,873][1653645] Updated weights for policy 0, policy_version 39607 (0.0012) [2024-06-15 12:02:05,465][1653645] Updated weights for policy 0, policy_version 39648 (0.0013) [2024-06-15 12:02:05,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 81231872. Throughput: 0: 11036.4. Samples: 20350976. Policy #0 lag: (min: 47.0, avg: 154.4, max: 335.0) [2024-06-15 12:02:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:02:09,464][1653645] Updated weights for policy 0, policy_version 39712 (0.0012) [2024-06-15 12:02:10,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 81428480. Throughput: 0: 11218.4. Samples: 20424192. 
Policy #0 lag: (min: 47.0, avg: 154.4, max: 335.0) [2024-06-15 12:02:10,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:02:11,237][1653645] Updated weights for policy 0, policy_version 39776 (0.0012) [2024-06-15 12:02:12,831][1653645] Updated weights for policy 0, policy_version 39824 (0.0128) [2024-06-15 12:02:13,953][1653645] Updated weights for policy 0, policy_version 39872 (0.0012) [2024-06-15 12:02:15,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 81657856. Throughput: 0: 11195.7. Samples: 20480000. Policy #0 lag: (min: 63.0, avg: 206.2, max: 319.0) [2024-06-15 12:02:15,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:02:18,240][1653645] Updated weights for policy 0, policy_version 39936 (0.0043) [2024-06-15 12:02:20,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 81788928. Throughput: 0: 11070.6. Samples: 20513280. Policy #0 lag: (min: 63.0, avg: 206.2, max: 319.0) [2024-06-15 12:02:20,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:02:21,408][1651596] Signal inference workers to stop experience collection... (2100 times) [2024-06-15 12:02:21,468][1653645] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-15 12:02:21,727][1651596] Signal inference workers to resume experience collection... (2100 times) [2024-06-15 12:02:21,728][1653645] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-15 12:02:22,731][1653645] Updated weights for policy 0, policy_version 40001 (0.0011) [2024-06-15 12:02:24,130][1653645] Updated weights for policy 0, policy_version 40062 (0.0013) [2024-06-15 12:02:25,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 44782.7, 300 sec: 44098.0). Total num frames: 82149376. Throughput: 0: 11207.1. Samples: 20578304. 
Policy #0 lag: (min: 63.0, avg: 206.2, max: 319.0) [2024-06-15 12:02:25,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:02:25,994][1653645] Updated weights for policy 0, policy_version 40119 (0.0028) [2024-06-15 12:02:29,480][1653645] Updated weights for policy 0, policy_version 40190 (0.0014) [2024-06-15 12:02:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 82313216. Throughput: 0: 11093.4. Samples: 20648960. Policy #0 lag: (min: 63.0, avg: 206.2, max: 319.0) [2024-06-15 12:02:30,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:02:34,169][1653645] Updated weights for policy 0, policy_version 40247 (0.0218) [2024-06-15 12:02:35,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 46421.3, 300 sec: 43987.0). Total num frames: 82575360. Throughput: 0: 11343.7. Samples: 20688384. Policy #0 lag: (min: 63.0, avg: 206.2, max: 319.0) [2024-06-15 12:02:35,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:02:37,022][1653645] Updated weights for policy 0, policy_version 40336 (0.0014) [2024-06-15 12:02:40,069][1653645] Updated weights for policy 0, policy_version 40400 (0.0093) [2024-06-15 12:02:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.2, 300 sec: 44098.0). Total num frames: 82804736. Throughput: 0: 11173.0. Samples: 20747776. Policy #0 lag: (min: 15.0, avg: 140.3, max: 271.0) [2024-06-15 12:02:40,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:02:45,232][1653645] Updated weights for policy 0, policy_version 40464 (0.0012) [2024-06-15 12:02:45,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 44782.9, 300 sec: 43431.5). Total num frames: 82903040. Throughput: 0: 11195.7. Samples: 20821504. 
Policy #0 lag: (min: 15.0, avg: 140.3, max: 271.0) [2024-06-15 12:02:45,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:02:47,300][1653645] Updated weights for policy 0, policy_version 40547 (0.0013) [2024-06-15 12:02:48,511][1653645] Updated weights for policy 0, policy_version 40579 (0.0013) [2024-06-15 12:02:50,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 83230720. Throughput: 0: 11059.2. Samples: 20848640. Policy #0 lag: (min: 15.0, avg: 140.3, max: 271.0) [2024-06-15 12:02:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:02:51,752][1653645] Updated weights for policy 0, policy_version 40672 (0.0015) [2024-06-15 12:02:55,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 83361792. Throughput: 0: 11036.5. Samples: 20920832. Policy #0 lag: (min: 15.0, avg: 140.3, max: 271.0) [2024-06-15 12:02:55,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:02:56,557][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000040736_83427328.pth... [2024-06-15 12:02:56,729][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000035584_72876032.pth [2024-06-15 12:02:57,330][1653645] Updated weights for policy 0, policy_version 40768 (0.0034) [2024-06-15 12:03:00,058][1653645] Updated weights for policy 0, policy_version 40833 (0.0100) [2024-06-15 12:03:00,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 44782.7, 300 sec: 44209.0). Total num frames: 83722240. Throughput: 0: 11264.0. Samples: 20986880. Policy #0 lag: (min: 15.0, avg: 140.3, max: 271.0) [2024-06-15 12:03:00,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:03:02,819][1653645] Updated weights for policy 0, policy_version 40901 (0.0013) [2024-06-15 12:03:03,412][1651596] Signal inference workers to stop experience collection... 
(2150 times) [2024-06-15 12:03:03,506][1653645] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-15 12:03:03,697][1651596] Signal inference workers to resume experience collection... (2150 times) [2024-06-15 12:03:03,697][1653645] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-15 12:03:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 83886080. Throughput: 0: 11309.5. Samples: 21022208. Policy #0 lag: (min: 15.0, avg: 140.3, max: 271.0) [2024-06-15 12:03:05,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:03:07,978][1653645] Updated weights for policy 0, policy_version 40976 (0.0105) [2024-06-15 12:03:09,586][1653645] Updated weights for policy 0, policy_version 41040 (0.0013) [2024-06-15 12:03:10,653][1653645] Updated weights for policy 0, policy_version 41085 (0.0016) [2024-06-15 12:03:10,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 84148224. Throughput: 0: 11389.2. Samples: 21090816. Policy #0 lag: (min: 15.0, avg: 88.2, max: 271.0) [2024-06-15 12:03:10,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:03:12,320][1653645] Updated weights for policy 0, policy_version 41144 (0.0015) [2024-06-15 12:03:15,761][1653645] Updated weights for policy 0, policy_version 41205 (0.0014) [2024-06-15 12:03:15,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.4, 300 sec: 44320.1). Total num frames: 84410368. Throughput: 0: 11195.7. Samples: 21152768. Policy #0 lag: (min: 15.0, avg: 88.2, max: 271.0) [2024-06-15 12:03:15,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:03:20,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 84508672. Throughput: 0: 11173.0. Samples: 21191168. 
Policy #0 lag: (min: 15.0, avg: 88.2, max: 271.0) [2024-06-15 12:03:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:03:21,262][1653645] Updated weights for policy 0, policy_version 41280 (0.0013) [2024-06-15 12:03:23,089][1653645] Updated weights for policy 0, policy_version 41337 (0.0013) [2024-06-15 12:03:24,905][1653645] Updated weights for policy 0, policy_version 41397 (0.0014) [2024-06-15 12:03:25,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 84803584. Throughput: 0: 11036.5. Samples: 21244416. Policy #0 lag: (min: 15.0, avg: 88.2, max: 271.0) [2024-06-15 12:03:25,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:03:27,319][1653645] Updated weights for policy 0, policy_version 41427 (0.0014) [2024-06-15 12:03:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 84934656. Throughput: 0: 11082.0. Samples: 21320192. Policy #0 lag: (min: 15.0, avg: 88.2, max: 271.0) [2024-06-15 12:03:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:03:31,958][1653645] Updated weights for policy 0, policy_version 41474 (0.0012) [2024-06-15 12:03:33,771][1653645] Updated weights for policy 0, policy_version 41539 (0.0160) [2024-06-15 12:03:35,022][1653645] Updated weights for policy 0, policy_version 41599 (0.0014) [2024-06-15 12:03:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 85196800. Throughput: 0: 11241.3. Samples: 21354496. 
Policy #0 lag: (min: 95.0, avg: 164.2, max: 351.0) [2024-06-15 12:03:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:03:37,017][1653645] Updated weights for policy 0, policy_version 41655 (0.0012) [2024-06-15 12:03:39,203][1653645] Updated weights for policy 0, policy_version 41696 (0.0012) [2024-06-15 12:03:39,857][1653645] Updated weights for policy 0, policy_version 41728 (0.0034) [2024-06-15 12:03:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 44209.1). Total num frames: 85458944. Throughput: 0: 10968.2. Samples: 21414400. Policy #0 lag: (min: 95.0, avg: 164.2, max: 351.0) [2024-06-15 12:03:40,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:03:45,958][1648982] Fps is (10 sec: 42596.8, 60 sec: 45328.9, 300 sec: 43653.6). Total num frames: 85622784. Throughput: 0: 11047.8. Samples: 21484032. Policy #0 lag: (min: 95.0, avg: 164.2, max: 351.0) [2024-06-15 12:03:45,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:03:46,049][1653645] Updated weights for policy 0, policy_version 41824 (0.0013) [2024-06-15 12:03:48,289][1653645] Updated weights for policy 0, policy_version 41888 (0.0033) [2024-06-15 12:03:50,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 85852160. Throughput: 0: 10763.4. Samples: 21506560. Policy #0 lag: (min: 95.0, avg: 164.2, max: 351.0) [2024-06-15 12:03:50,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:03:51,171][1651596] Signal inference workers to stop experience collection... (2200 times) [2024-06-15 12:03:51,210][1653645] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-15 12:03:51,212][1653645] Updated weights for policy 0, policy_version 41923 (0.0046) [2024-06-15 12:03:51,353][1651596] Signal inference workers to resume experience collection... 
(2200 times) [2024-06-15 12:03:51,355][1653645] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-15 12:03:55,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 43690.3, 300 sec: 43542.5). Total num frames: 85983232. Throughput: 0: 10831.5. Samples: 21578240. Policy #0 lag: (min: 95.0, avg: 164.2, max: 351.0) [2024-06-15 12:03:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:03:56,329][1653645] Updated weights for policy 0, policy_version 42000 (0.0015) [2024-06-15 12:03:58,718][1653645] Updated weights for policy 0, policy_version 42096 (0.0013) [2024-06-15 12:04:00,957][1648982] Fps is (10 sec: 45876.3, 60 sec: 43144.8, 300 sec: 44209.1). Total num frames: 86310912. Throughput: 0: 10854.4. Samples: 21641216. Policy #0 lag: (min: 95.0, avg: 164.2, max: 351.0) [2024-06-15 12:04:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:04:01,365][1653645] Updated weights for policy 0, policy_version 42176 (0.0012) [2024-06-15 12:04:03,894][1653645] Updated weights for policy 0, policy_version 42240 (0.0040) [2024-06-15 12:04:05,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 86507520. Throughput: 0: 10660.9. Samples: 21670912. Policy #0 lag: (min: 45.0, avg: 180.7, max: 301.0) [2024-06-15 12:04:05,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:04:09,576][1653645] Updated weights for policy 0, policy_version 42290 (0.0014) [2024-06-15 12:04:10,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 42598.3, 300 sec: 43764.7). Total num frames: 86704128. Throughput: 0: 11150.2. Samples: 21746176. 
Policy #0 lag: (min: 45.0, avg: 180.7, max: 301.0) [2024-06-15 12:04:10,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:04:11,488][1653645] Updated weights for policy 0, policy_version 42363 (0.0039) [2024-06-15 12:04:13,623][1653645] Updated weights for policy 0, policy_version 42427 (0.0012) [2024-06-15 12:04:15,228][1653645] Updated weights for policy 0, policy_version 42480 (0.0015) [2024-06-15 12:04:15,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 87031808. Throughput: 0: 10752.0. Samples: 21804032. Policy #0 lag: (min: 45.0, avg: 180.7, max: 301.0) [2024-06-15 12:04:15,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:04:20,567][1653645] Updated weights for policy 0, policy_version 42513 (0.0013) [2024-06-15 12:04:20,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 87097344. Throughput: 0: 10945.4. Samples: 21847040. Policy #0 lag: (min: 45.0, avg: 180.7, max: 301.0) [2024-06-15 12:04:20,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:04:23,019][1653645] Updated weights for policy 0, policy_version 42623 (0.0012) [2024-06-15 12:04:25,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 87425024. Throughput: 0: 10911.2. Samples: 21905408. Policy #0 lag: (min: 45.0, avg: 180.7, max: 301.0) [2024-06-15 12:04:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:04:26,161][1653645] Updated weights for policy 0, policy_version 42692 (0.0030) [2024-06-15 12:04:27,037][1653645] Updated weights for policy 0, policy_version 42744 (0.0016) [2024-06-15 12:04:30,958][1648982] Fps is (10 sec: 45873.4, 60 sec: 43690.4, 300 sec: 43653.6). Total num frames: 87556096. Throughput: 0: 11047.8. Samples: 21981184. 
Policy #0 lag: (min: 45.0, avg: 180.7, max: 301.0) [2024-06-15 12:04:30,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:04:32,856][1653645] Updated weights for policy 0, policy_version 42787 (0.0010) [2024-06-15 12:04:33,988][1653645] Updated weights for policy 0, policy_version 42836 (0.0013) [2024-06-15 12:04:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 87818240. Throughput: 0: 11241.2. Samples: 22012416. Policy #0 lag: (min: 2.0, avg: 69.2, max: 258.0) [2024-06-15 12:04:35,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:04:36,503][1651596] Signal inference workers to stop experience collection... (2250 times) [2024-06-15 12:04:36,549][1653645] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-15 12:04:36,679][1651596] Signal inference workers to resume experience collection... (2250 times) [2024-06-15 12:04:36,692][1653645] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-15 12:04:36,694][1653645] Updated weights for policy 0, policy_version 42912 (0.0122) [2024-06-15 12:04:38,085][1653645] Updated weights for policy 0, policy_version 42976 (0.0013) [2024-06-15 12:04:40,958][1648982] Fps is (10 sec: 52430.3, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 88080384. Throughput: 0: 11036.5. Samples: 22074880. Policy #0 lag: (min: 2.0, avg: 69.2, max: 258.0) [2024-06-15 12:04:40,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:04:43,579][1653645] Updated weights for policy 0, policy_version 43028 (0.0046) [2024-06-15 12:04:45,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44237.0, 300 sec: 43764.7). Total num frames: 88276992. Throughput: 0: 11093.3. Samples: 22140416. 
Policy #0 lag: (min: 2.0, avg: 69.2, max: 258.0) [2024-06-15 12:04:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:04:46,232][1653645] Updated weights for policy 0, policy_version 43136 (0.0017) [2024-06-15 12:04:50,099][1653645] Updated weights for policy 0, policy_version 43216 (0.0014) [2024-06-15 12:04:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 88571904. Throughput: 0: 11195.8. Samples: 22174720. Policy #0 lag: (min: 2.0, avg: 69.2, max: 258.0) [2024-06-15 12:04:50,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:04:51,302][1653645] Updated weights for policy 0, policy_version 43263 (0.0013) [2024-06-15 12:04:55,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44783.3, 300 sec: 43542.6). Total num frames: 88670208. Throughput: 0: 10956.9. Samples: 22239232. Policy #0 lag: (min: 2.0, avg: 69.2, max: 258.0) [2024-06-15 12:04:55,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:04:56,324][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000043312_88702976.pth... [2024-06-15 12:04:56,372][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000038144_78118912.pth [2024-06-15 12:04:56,546][1653645] Updated weights for policy 0, policy_version 43319 (0.0012) [2024-06-15 12:04:57,781][1653645] Updated weights for policy 0, policy_version 43376 (0.0103) [2024-06-15 12:05:00,971][1648982] Fps is (10 sec: 32723.3, 60 sec: 43134.6, 300 sec: 43651.7). Total num frames: 88899584. Throughput: 0: 11146.8. Samples: 22305792. 
Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:05:00,972][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 12:05:02,116][1653645] Updated weights for policy 0, policy_version 43442 (0.0013) [2024-06-15 12:05:03,512][1653645] Updated weights for policy 0, policy_version 43513 (0.0114) [2024-06-15 12:05:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 89128960. Throughput: 0: 10717.9. Samples: 22329344. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:05:05,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 12:05:08,337][1653645] Updated weights for policy 0, policy_version 43568 (0.0017) [2024-06-15 12:05:10,411][1653645] Updated weights for policy 0, policy_version 43639 (0.0013) [2024-06-15 12:05:10,958][1648982] Fps is (10 sec: 49217.1, 60 sec: 44782.8, 300 sec: 43986.9). Total num frames: 89391104. Throughput: 0: 11002.3. Samples: 22400512. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:05:10,959][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:05:13,398][1653645] Updated weights for policy 0, policy_version 43668 (0.0012) [2024-06-15 12:05:15,566][1653645] Updated weights for policy 0, policy_version 43760 (0.0115) [2024-06-15 12:05:15,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 89653248. Throughput: 0: 10649.7. Samples: 22460416. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:05:15,963][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:05:19,953][1653645] Updated weights for policy 0, policy_version 43824 (0.0011) [2024-06-15 12:05:20,681][1651596] Signal inference workers to stop experience collection... (2300 times) [2024-06-15 12:05:20,746][1653645] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-15 12:05:20,958][1648982] Fps is (10 sec: 39323.1, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 89784320. 
Throughput: 0: 10774.8. Samples: 22497280. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:05:20,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:05:20,959][1651596] Signal inference workers to resume experience collection... (2300 times) [2024-06-15 12:05:20,960][1653645] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-15 12:05:22,046][1653645] Updated weights for policy 0, policy_version 43900 (0.0013) [2024-06-15 12:05:25,970][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.7, 300 sec: 43876.1). Total num frames: 90013696. Throughput: 0: 10865.8. Samples: 22563840. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:05:25,970][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:05:26,265][1653645] Updated weights for policy 0, policy_version 43968 (0.0015) [2024-06-15 12:05:27,761][1653645] Updated weights for policy 0, policy_version 44027 (0.0013) [2024-06-15 12:05:30,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 90177536. Throughput: 0: 10865.7. Samples: 22629376. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:05:30,959][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:05:31,646][1653645] Updated weights for policy 0, policy_version 44067 (0.0096) [2024-06-15 12:05:33,623][1653645] Updated weights for policy 0, policy_version 44128 (0.0018) [2024-06-15 12:05:35,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 90439680. Throughput: 0: 10808.9. Samples: 22661120. 
Policy #0 lag: (min: 4.0, avg: 95.8, max: 260.0) [2024-06-15 12:05:35,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:05:35,983][1653645] Updated weights for policy 0, policy_version 44163 (0.0017) [2024-06-15 12:05:37,236][1653645] Updated weights for policy 0, policy_version 44220 (0.0043) [2024-06-15 12:05:39,200][1653645] Updated weights for policy 0, policy_version 44281 (0.0021) [2024-06-15 12:05:40,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 90701824. Throughput: 0: 10899.8. Samples: 22729728. Policy #0 lag: (min: 4.0, avg: 95.8, max: 260.0) [2024-06-15 12:05:40,959][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 12:05:44,271][1653645] Updated weights for policy 0, policy_version 44342 (0.0018) [2024-06-15 12:05:45,598][1653645] Updated weights for policy 0, policy_version 44400 (0.0012) [2024-06-15 12:05:45,978][1648982] Fps is (10 sec: 49151.5, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 90931200. Throughput: 0: 10891.8. Samples: 22795776. Policy #0 lag: (min: 4.0, avg: 95.8, max: 260.0) [2024-06-15 12:05:45,979][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 12:05:48,256][1653645] Updated weights for policy 0, policy_version 44436 (0.0015) [2024-06-15 12:05:49,557][1653645] Updated weights for policy 0, policy_version 44482 (0.0013) [2024-06-15 12:05:50,962][1648982] Fps is (10 sec: 49134.5, 60 sec: 43687.9, 300 sec: 44319.6). Total num frames: 91193344. Throughput: 0: 11149.3. Samples: 22831104. Policy #0 lag: (min: 4.0, avg: 95.8, max: 260.0) [2024-06-15 12:05:50,962][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 12:05:51,073][1653645] Updated weights for policy 0, policy_version 44543 (0.0012) [2024-06-15 12:05:55,445][1653645] Updated weights for policy 0, policy_version 44601 (0.0013) [2024-06-15 12:05:55,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 91357184. Throughput: 0: 11309.6. 
Samples: 22909440. Policy #0 lag: (min: 4.0, avg: 95.8, max: 260.0) [2024-06-15 12:05:55,958][1648982] Avg episode reward: [(0, '37.010')] [2024-06-15 12:05:57,285][1653645] Updated weights for policy 0, policy_version 44668 (0.0012) [2024-06-15 12:06:00,612][1653645] Updated weights for policy 0, policy_version 44733 (0.0014) [2024-06-15 12:06:00,958][1648982] Fps is (10 sec: 42613.5, 60 sec: 45339.2, 300 sec: 43986.8). Total num frames: 91619328. Throughput: 0: 11150.2. Samples: 22962176. Policy #0 lag: (min: 15.0, avg: 128.0, max: 271.0) [2024-06-15 12:06:00,959][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 12:06:02,451][1653645] Updated weights for policy 0, policy_version 44798 (0.0013) [2024-06-15 12:06:05,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 91750400. Throughput: 0: 11036.5. Samples: 22993920. Policy #0 lag: (min: 15.0, avg: 128.0, max: 271.0) [2024-06-15 12:06:05,958][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 12:06:08,038][1653645] Updated weights for policy 0, policy_version 44864 (0.0019) [2024-06-15 12:06:08,372][1651596] Signal inference workers to stop experience collection... (2350 times) [2024-06-15 12:06:08,435][1653645] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-15 12:06:08,651][1651596] Signal inference workers to resume experience collection... (2350 times) [2024-06-15 12:06:08,657][1653645] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-15 12:06:10,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 92012544. Throughput: 0: 11047.8. Samples: 23060992. 
Policy #0 lag: (min: 15.0, avg: 128.0, max: 271.0) [2024-06-15 12:06:10,959][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 12:06:11,106][1653645] Updated weights for policy 0, policy_version 44929 (0.0012) [2024-06-15 12:06:13,472][1653645] Updated weights for policy 0, policy_version 44993 (0.0014) [2024-06-15 12:06:15,957][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 92274688. Throughput: 0: 11093.5. Samples: 23128576. Policy #0 lag: (min: 15.0, avg: 128.0, max: 271.0) [2024-06-15 12:06:15,958][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 12:06:18,679][1653645] Updated weights for policy 0, policy_version 45063 (0.0012) [2024-06-15 12:06:20,096][1653645] Updated weights for policy 0, policy_version 45121 (0.0034) [2024-06-15 12:06:20,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 44783.0, 300 sec: 44097.9). Total num frames: 92471296. Throughput: 0: 11275.4. Samples: 23168512. Policy #0 lag: (min: 15.0, avg: 128.0, max: 271.0) [2024-06-15 12:06:20,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:06:21,624][1653645] Updated weights for policy 0, policy_version 45182 (0.0011) [2024-06-15 12:06:24,764][1653645] Updated weights for policy 0, policy_version 45238 (0.0014) [2024-06-15 12:06:25,002][1653645] Updated weights for policy 0, policy_version 45248 (0.0009) [2024-06-15 12:06:25,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 44783.0, 300 sec: 44098.0). Total num frames: 92700672. Throughput: 0: 11013.7. Samples: 23225344. Policy #0 lag: (min: 15.0, avg: 128.0, max: 271.0) [2024-06-15 12:06:25,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:06:26,536][1653645] Updated weights for policy 0, policy_version 45308 (0.0035) [2024-06-15 12:06:30,728][1653645] Updated weights for policy 0, policy_version 45360 (0.0014) [2024-06-15 12:06:30,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 45329.1, 300 sec: 44431.1). Total num frames: 92897280. Throughput: 0: 11298.1. 
Samples: 23304192. Policy #0 lag: (min: 2.0, avg: 89.0, max: 258.0) [2024-06-15 12:06:30,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:06:32,163][1653645] Updated weights for policy 0, policy_version 45411 (0.0028) [2024-06-15 12:06:34,575][1653645] Updated weights for policy 0, policy_version 45442 (0.0013) [2024-06-15 12:06:35,815][1653645] Updated weights for policy 0, policy_version 45498 (0.0040) [2024-06-15 12:06:35,958][1648982] Fps is (10 sec: 49150.2, 60 sec: 45874.9, 300 sec: 44431.2). Total num frames: 93192192. Throughput: 0: 11264.9. Samples: 23337984. Policy #0 lag: (min: 2.0, avg: 89.0, max: 258.0) [2024-06-15 12:06:35,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:06:37,487][1653645] Updated weights for policy 0, policy_version 45567 (0.0094) [2024-06-15 12:06:40,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 93323264. Throughput: 0: 10990.9. Samples: 23404032. Policy #0 lag: (min: 2.0, avg: 89.0, max: 258.0) [2024-06-15 12:06:40,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:06:42,236][1653645] Updated weights for policy 0, policy_version 45603 (0.0012) [2024-06-15 12:06:43,473][1653645] Updated weights for policy 0, policy_version 45650 (0.0012) [2024-06-15 12:06:45,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 93585408. Throughput: 0: 11286.8. Samples: 23470080. Policy #0 lag: (min: 2.0, avg: 89.0, max: 258.0) [2024-06-15 12:06:45,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:06:46,510][1653645] Updated weights for policy 0, policy_version 45700 (0.0012) [2024-06-15 12:06:48,499][1653645] Updated weights for policy 0, policy_version 45776 (0.0047) [2024-06-15 12:06:50,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 44239.5, 300 sec: 44431.2). Total num frames: 93847552. Throughput: 0: 11241.2. Samples: 23499776. 
Policy #0 lag: (min: 2.0, avg: 89.0, max: 258.0) [2024-06-15 12:06:50,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:06:53,451][1653645] Updated weights for policy 0, policy_version 45826 (0.0014) [2024-06-15 12:06:54,221][1651596] Signal inference workers to stop experience collection... (2400 times) [2024-06-15 12:06:54,260][1653645] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-15 12:06:54,533][1651596] Signal inference workers to resume experience collection... (2400 times) [2024-06-15 12:06:54,534][1653645] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-15 12:06:55,008][1653645] Updated weights for policy 0, policy_version 45888 (0.0014) [2024-06-15 12:06:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44782.8, 300 sec: 44097.9). Total num frames: 94044160. Throughput: 0: 11377.8. Samples: 23572992. Policy #0 lag: (min: 2.0, avg: 89.0, max: 258.0) [2024-06-15 12:06:55,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:06:56,276][1653645] Updated weights for policy 0, policy_version 45947 (0.0013) [2024-06-15 12:06:56,321][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000045952_94109696.pth... [2024-06-15 12:06:56,392][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000040736_83427328.pth [2024-06-15 12:07:00,530][1653645] Updated weights for policy 0, policy_version 46037 (0.0022) [2024-06-15 12:07:00,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44783.1, 300 sec: 44320.1). Total num frames: 94306304. Throughput: 0: 11264.0. Samples: 23635456. Policy #0 lag: (min: 61.0, avg: 176.2, max: 317.0) [2024-06-15 12:07:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:07:05,887][1653645] Updated weights for policy 0, policy_version 46128 (0.0085) [2024-06-15 12:07:05,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 45329.0, 300 sec: 44209.0). Total num frames: 94470144. Throughput: 0: 11173.0. 
Samples: 23671296. Policy #0 lag: (min: 61.0, avg: 176.2, max: 317.0) [2024-06-15 12:07:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:07:07,251][1653645] Updated weights for policy 0, policy_version 46178 (0.0015) [2024-06-15 12:07:09,568][1653645] Updated weights for policy 0, policy_version 46224 (0.0025) [2024-06-15 12:07:10,552][1653645] Updated weights for policy 0, policy_version 46272 (0.0020) [2024-06-15 12:07:10,958][1648982] Fps is (10 sec: 45872.5, 60 sec: 45874.9, 300 sec: 44431.2). Total num frames: 94765056. Throughput: 0: 11514.2. Samples: 23743488. Policy #0 lag: (min: 61.0, avg: 176.2, max: 317.0) [2024-06-15 12:07:10,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:07:12,027][1653645] Updated weights for policy 0, policy_version 46326 (0.0052) [2024-06-15 12:07:15,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 94896128. Throughput: 0: 11252.7. Samples: 23810560. Policy #0 lag: (min: 61.0, avg: 176.2, max: 317.0) [2024-06-15 12:07:15,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:07:17,254][1653645] Updated weights for policy 0, policy_version 46368 (0.0011) [2024-06-15 12:07:18,553][1653645] Updated weights for policy 0, policy_version 46416 (0.0010) [2024-06-15 12:07:20,853][1653645] Updated weights for policy 0, policy_version 46480 (0.0012) [2024-06-15 12:07:20,958][1648982] Fps is (10 sec: 42600.8, 60 sec: 45329.0, 300 sec: 44209.1). Total num frames: 95191040. Throughput: 0: 11343.7. Samples: 23848448. Policy #0 lag: (min: 61.0, avg: 176.2, max: 317.0) [2024-06-15 12:07:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:07:23,100][1653645] Updated weights for policy 0, policy_version 46544 (0.0012) [2024-06-15 12:07:25,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 95420416. Throughput: 0: 11173.1. Samples: 23906816. 
Policy #0 lag: (min: 61.0, avg: 176.2, max: 317.0) [2024-06-15 12:07:25,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:07:28,993][1653645] Updated weights for policy 0, policy_version 46610 (0.0014) [2024-06-15 12:07:30,393][1653645] Updated weights for policy 0, policy_version 46672 (0.0015) [2024-06-15 12:07:30,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 95617024. Throughput: 0: 11218.5. Samples: 23974912. Policy #0 lag: (min: 15.0, avg: 94.2, max: 271.0) [2024-06-15 12:07:30,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:07:32,879][1653645] Updated weights for policy 0, policy_version 46740 (0.0013) [2024-06-15 12:07:35,341][1653645] Updated weights for policy 0, policy_version 46800 (0.0043) [2024-06-15 12:07:35,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.2, 300 sec: 44320.1). Total num frames: 95879168. Throughput: 0: 11264.0. Samples: 24006656. Policy #0 lag: (min: 15.0, avg: 94.2, max: 271.0) [2024-06-15 12:07:35,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:07:40,749][1653645] Updated weights for policy 0, policy_version 46849 (0.0014) [2024-06-15 12:07:40,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 95944704. Throughput: 0: 11036.4. Samples: 24069632. Policy #0 lag: (min: 15.0, avg: 94.2, max: 271.0) [2024-06-15 12:07:40,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:07:41,087][1651596] Signal inference workers to stop experience collection... (2450 times) [2024-06-15 12:07:41,156][1653645] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-15 12:07:41,312][1651596] Signal inference workers to resume experience collection... 
(2450 times) [2024-06-15 12:07:41,312][1653645] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-15 12:07:42,564][1653645] Updated weights for policy 0, policy_version 46913 (0.0012) [2024-06-15 12:07:44,119][1653645] Updated weights for policy 0, policy_version 46977 (0.0103) [2024-06-15 12:07:45,470][1653645] Updated weights for policy 0, policy_version 47032 (0.0013) [2024-06-15 12:07:45,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 96337920. Throughput: 0: 11161.6. Samples: 24137728. Policy #0 lag: (min: 15.0, avg: 94.2, max: 271.0) [2024-06-15 12:07:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:07:47,888][1653645] Updated weights for policy 0, policy_version 47080 (0.0014) [2024-06-15 12:07:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 96468992. Throughput: 0: 11172.9. Samples: 24174080. Policy #0 lag: (min: 15.0, avg: 94.2, max: 271.0) [2024-06-15 12:07:50,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:07:52,511][1653645] Updated weights for policy 0, policy_version 47141 (0.0014) [2024-06-15 12:07:54,105][1653645] Updated weights for policy 0, policy_version 47170 (0.0012) [2024-06-15 12:07:55,444][1653645] Updated weights for policy 0, policy_version 47230 (0.0013) [2024-06-15 12:07:55,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 44782.8, 300 sec: 44097.9). Total num frames: 96731136. Throughput: 0: 11127.5. Samples: 24244224. Policy #0 lag: (min: 15.0, avg: 94.2, max: 271.0) [2024-06-15 12:07:55,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:07:57,035][1653645] Updated weights for policy 0, policy_version 47280 (0.0012) [2024-06-15 12:07:58,268][1653645] Updated weights for policy 0, policy_version 47312 (0.0012) [2024-06-15 12:08:00,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 96993280. Throughput: 0: 11127.5. 
Samples: 24311296. Policy #0 lag: (min: 31.0, avg: 173.1, max: 287.0) [2024-06-15 12:08:00,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:08:03,900][1653645] Updated weights for policy 0, policy_version 47392 (0.0044) [2024-06-15 12:08:05,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 97124352. Throughput: 0: 11093.3. Samples: 24347648. Policy #0 lag: (min: 31.0, avg: 173.1, max: 287.0) [2024-06-15 12:08:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:08:06,011][1653645] Updated weights for policy 0, policy_version 47428 (0.0011) [2024-06-15 12:08:07,566][1653645] Updated weights for policy 0, policy_version 47489 (0.0013) [2024-06-15 12:08:09,006][1653645] Updated weights for policy 0, policy_version 47547 (0.0012) [2024-06-15 12:08:10,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44237.1, 300 sec: 44097.9). Total num frames: 97419264. Throughput: 0: 11047.8. Samples: 24403968. Policy #0 lag: (min: 31.0, avg: 173.1, max: 287.0) [2024-06-15 12:08:10,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:08:11,606][1653645] Updated weights for policy 0, policy_version 47605 (0.0094) [2024-06-15 12:08:15,766][1653645] Updated weights for policy 0, policy_version 47648 (0.0016) [2024-06-15 12:08:15,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 44782.8, 300 sec: 44320.1). Total num frames: 97583104. Throughput: 0: 11116.1. Samples: 24475136. Policy #0 lag: (min: 31.0, avg: 173.1, max: 287.0) [2024-06-15 12:08:15,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:08:16,508][1653645] Updated weights for policy 0, policy_version 47678 (0.0012) [2024-06-15 12:08:19,373][1653645] Updated weights for policy 0, policy_version 47730 (0.0012) [2024-06-15 12:08:20,857][1653645] Updated weights for policy 0, policy_version 47808 (0.0012) [2024-06-15 12:08:20,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 97910784. 
Throughput: 0: 11184.3. Samples: 24509952. Policy #0 lag: (min: 31.0, avg: 173.1, max: 287.0) [2024-06-15 12:08:20,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:08:25,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 98041856. Throughput: 0: 11150.2. Samples: 24571392. Policy #0 lag: (min: 31.0, avg: 173.1, max: 287.0) [2024-06-15 12:08:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:08:26,976][1653645] Updated weights for policy 0, policy_version 47873 (0.0025) [2024-06-15 12:08:27,348][1651596] Signal inference workers to stop experience collection... (2500 times) [2024-06-15 12:08:27,441][1653645] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-15 12:08:27,582][1651596] Signal inference workers to resume experience collection... (2500 times) [2024-06-15 12:08:27,583][1653645] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-15 12:08:30,098][1653645] Updated weights for policy 0, policy_version 47938 (0.0012) [2024-06-15 12:08:30,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43690.9, 300 sec: 44209.0). Total num frames: 98238464. Throughput: 0: 11138.9. Samples: 24638976. Policy #0 lag: (min: 4.0, avg: 102.8, max: 260.0) [2024-06-15 12:08:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:08:31,545][1653645] Updated weights for policy 0, policy_version 48003 (0.0101) [2024-06-15 12:08:32,715][1653645] Updated weights for policy 0, policy_version 48057 (0.0013) [2024-06-15 12:08:35,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 98500608. Throughput: 0: 11093.3. Samples: 24673280. 
Policy #0 lag: (min: 4.0, avg: 102.8, max: 260.0) [2024-06-15 12:08:35,959][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 12:08:36,330][1653645] Updated weights for policy 0, policy_version 48117 (0.0020) [2024-06-15 12:08:39,219][1653645] Updated weights for policy 0, policy_version 48181 (0.0013) [2024-06-15 12:08:40,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 45875.4, 300 sec: 44320.1). Total num frames: 98697216. Throughput: 0: 10934.1. Samples: 24736256. Policy #0 lag: (min: 4.0, avg: 102.8, max: 260.0) [2024-06-15 12:08:40,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:08:41,880][1653645] Updated weights for policy 0, policy_version 48208 (0.0012) [2024-06-15 12:08:43,341][1653645] Updated weights for policy 0, policy_version 48259 (0.0098) [2024-06-15 12:08:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 98959360. Throughput: 0: 10990.9. Samples: 24805888. Policy #0 lag: (min: 4.0, avg: 102.8, max: 260.0) [2024-06-15 12:08:45,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:08:47,107][1653645] Updated weights for policy 0, policy_version 48322 (0.0118) [2024-06-15 12:08:50,309][1653645] Updated weights for policy 0, policy_version 48400 (0.0019) [2024-06-15 12:08:50,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.1, 300 sec: 44653.4). Total num frames: 99155968. Throughput: 0: 10922.7. Samples: 24839168. Policy #0 lag: (min: 4.0, avg: 102.8, max: 260.0) [2024-06-15 12:08:50,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:08:51,274][1653645] Updated weights for policy 0, policy_version 48444 (0.0012) [2024-06-15 12:08:53,541][1653645] Updated weights for policy 0, policy_version 48496 (0.0014) [2024-06-15 12:08:55,280][1653645] Updated weights for policy 0, policy_version 48568 (0.0018) [2024-06-15 12:08:55,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 45875.1, 300 sec: 44653.2). Total num frames: 99483648. Throughput: 0: 11252.5. 
Samples: 24910336. Policy #0 lag: (min: 4.0, avg: 102.8, max: 260.0) [2024-06-15 12:08:55,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:08:55,967][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000048576_99483648.pth... [2024-06-15 12:08:56,024][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000043312_88702976.pth [2024-06-15 12:08:58,629][1653645] Updated weights for policy 0, policy_version 48598 (0.0013) [2024-06-15 12:08:59,339][1653645] Updated weights for policy 0, policy_version 48636 (0.0027) [2024-06-15 12:09:00,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 99614720. Throughput: 0: 11264.0. Samples: 24982016. Policy #0 lag: (min: 4.0, avg: 102.8, max: 260.0) [2024-06-15 12:09:00,959][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 12:09:01,663][1653645] Updated weights for policy 0, policy_version 48661 (0.0012) [2024-06-15 12:09:04,671][1653645] Updated weights for policy 0, policy_version 48725 (0.0012) [2024-06-15 12:09:05,958][1648982] Fps is (10 sec: 39323.7, 60 sec: 45875.2, 300 sec: 44653.4). Total num frames: 99876864. Throughput: 0: 11286.8. Samples: 25017856. Policy #0 lag: (min: 7.0, avg: 107.0, max: 263.0) [2024-06-15 12:09:05,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:09:06,858][1653645] Updated weights for policy 0, policy_version 48816 (0.0013) [2024-06-15 12:09:10,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 100040704. Throughput: 0: 11321.0. Samples: 25080832. 
Policy #0 lag: (min: 7.0, avg: 107.0, max: 263.0) [2024-06-15 12:09:10,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:09:11,014][1653645] Updated weights for policy 0, policy_version 48852 (0.0012) [2024-06-15 12:09:13,596][1653645] Updated weights for policy 0, policy_version 48912 (0.0067) [2024-06-15 12:09:13,612][1651596] Signal inference workers to stop experience collection... (2550 times) [2024-06-15 12:09:13,684][1653645] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-15 12:09:13,843][1651596] Signal inference workers to resume experience collection... (2550 times) [2024-06-15 12:09:13,849][1653645] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-15 12:09:15,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 44782.8, 300 sec: 44653.3). Total num frames: 100270080. Throughput: 0: 11332.2. Samples: 25148928. Policy #0 lag: (min: 7.0, avg: 107.0, max: 263.0) [2024-06-15 12:09:15,959][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:09:16,397][1653645] Updated weights for policy 0, policy_version 48976 (0.0012) [2024-06-15 12:09:18,346][1653645] Updated weights for policy 0, policy_version 49056 (0.0014) [2024-06-15 12:09:20,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 100532224. Throughput: 0: 11173.0. Samples: 25176064. Policy #0 lag: (min: 7.0, avg: 107.0, max: 263.0) [2024-06-15 12:09:20,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:09:22,274][1653645] Updated weights for policy 0, policy_version 49093 (0.0011) [2024-06-15 12:09:25,604][1653645] Updated weights for policy 0, policy_version 49156 (0.0013) [2024-06-15 12:09:25,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 100696064. Throughput: 0: 11320.8. Samples: 25245696. 
Policy #0 lag: (min: 7.0, avg: 107.0, max: 263.0) [2024-06-15 12:09:25,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:09:26,736][1653645] Updated weights for policy 0, policy_version 49216 (0.0012) [2024-06-15 12:09:30,251][1653645] Updated weights for policy 0, policy_version 49312 (0.0187) [2024-06-15 12:09:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 101056512. Throughput: 0: 11116.2. Samples: 25306112. Policy #0 lag: (min: 7.0, avg: 107.0, max: 263.0) [2024-06-15 12:09:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:09:34,072][1653645] Updated weights for policy 0, policy_version 49346 (0.0011) [2024-06-15 12:09:35,537][1653645] Updated weights for policy 0, policy_version 49407 (0.0011) [2024-06-15 12:09:35,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 44782.8, 300 sec: 44431.1). Total num frames: 101187584. Throughput: 0: 11218.4. Samples: 25344000. Policy #0 lag: (min: 15.0, avg: 123.6, max: 271.0) [2024-06-15 12:09:35,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:09:39,962][1653645] Updated weights for policy 0, policy_version 49473 (0.0013) [2024-06-15 12:09:40,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 45329.1, 300 sec: 44542.3). Total num frames: 101416960. Throughput: 0: 11241.4. Samples: 25416192. Policy #0 lag: (min: 15.0, avg: 123.6, max: 271.0) [2024-06-15 12:09:40,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:09:41,364][1653645] Updated weights for policy 0, policy_version 49539 (0.0081) [2024-06-15 12:09:45,958][1648982] Fps is (10 sec: 39323.4, 60 sec: 43690.9, 300 sec: 44098.0). Total num frames: 101580800. Throughput: 0: 11036.5. Samples: 25478656. 
Policy #0 lag: (min: 15.0, avg: 123.6, max: 271.0) [2024-06-15 12:09:45,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:09:46,259][1653645] Updated weights for policy 0, policy_version 49619 (0.0017) [2024-06-15 12:09:49,843][1653645] Updated weights for policy 0, policy_version 49700 (0.0013) [2024-06-15 12:09:50,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 101842944. Throughput: 0: 11002.3. Samples: 25512960. Policy #0 lag: (min: 15.0, avg: 123.6, max: 271.0) [2024-06-15 12:09:50,959][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:09:52,182][1653645] Updated weights for policy 0, policy_version 49760 (0.0011) [2024-06-15 12:09:53,489][1653645] Updated weights for policy 0, policy_version 49817 (0.0116) [2024-06-15 12:09:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43691.0, 300 sec: 44766.5). Total num frames: 102105088. Throughput: 0: 11047.8. Samples: 25577984. Policy #0 lag: (min: 15.0, avg: 123.6, max: 271.0) [2024-06-15 12:09:55,982][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:09:57,295][1653645] Updated weights for policy 0, policy_version 49872 (0.0013) [2024-06-15 12:09:57,807][1651596] Signal inference workers to stop experience collection... (2600 times) [2024-06-15 12:09:57,887][1653645] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-15 12:09:58,037][1651596] Signal inference workers to resume experience collection... (2600 times) [2024-06-15 12:09:58,070][1653645] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-15 12:09:58,346][1653645] Updated weights for policy 0, policy_version 49920 (0.0028) [2024-06-15 12:10:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.8, 300 sec: 44542.2). Total num frames: 102268928. Throughput: 0: 11252.6. Samples: 25655296. 
Policy #0 lag: (min: 15.0, avg: 123.6, max: 271.0) [2024-06-15 12:10:00,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:10:01,781][1653645] Updated weights for policy 0, policy_version 49981 (0.0010) [2024-06-15 12:10:04,210][1653645] Updated weights for policy 0, policy_version 50032 (0.0012) [2024-06-15 12:10:05,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45329.1, 300 sec: 44764.5). Total num frames: 102596608. Throughput: 0: 11332.3. Samples: 25686016. Policy #0 lag: (min: 111.0, avg: 215.7, max: 367.0) [2024-06-15 12:10:05,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:10:06,208][1653645] Updated weights for policy 0, policy_version 50106 (0.0019) [2024-06-15 12:10:09,624][1653645] Updated weights for policy 0, policy_version 50144 (0.0012) [2024-06-15 12:10:10,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 102760448. Throughput: 0: 11218.5. Samples: 25750528. Policy #0 lag: (min: 111.0, avg: 215.7, max: 367.0) [2024-06-15 12:10:10,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:10:12,370][1653645] Updated weights for policy 0, policy_version 50192 (0.0013) [2024-06-15 12:10:13,461][1653645] Updated weights for policy 0, policy_version 50236 (0.0015) [2024-06-15 12:10:15,959][1648982] Fps is (10 sec: 42598.0, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 103022592. Throughput: 0: 11446.0. Samples: 25821184. Policy #0 lag: (min: 111.0, avg: 215.7, max: 367.0) [2024-06-15 12:10:15,960][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:10:16,287][1653645] Updated weights for policy 0, policy_version 50327 (0.0170) [2024-06-15 12:10:20,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.5, 300 sec: 44542.2). Total num frames: 103153664. Throughput: 0: 11150.3. Samples: 25845760. 
Policy #0 lag: (min: 111.0, avg: 215.7, max: 367.0) [2024-06-15 12:10:20,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:10:21,356][1653645] Updated weights for policy 0, policy_version 50384 (0.0013) [2024-06-15 12:10:24,023][1653645] Updated weights for policy 0, policy_version 50436 (0.0065) [2024-06-15 12:10:25,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45329.2, 300 sec: 44875.6). Total num frames: 103415808. Throughput: 0: 11229.9. Samples: 25921536. Policy #0 lag: (min: 111.0, avg: 215.7, max: 367.0) [2024-06-15 12:10:25,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:10:26,438][1653645] Updated weights for policy 0, policy_version 50499 (0.0013) [2024-06-15 12:10:28,575][1653645] Updated weights for policy 0, policy_version 50579 (0.0017) [2024-06-15 12:10:30,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 103677952. Throughput: 0: 11093.3. Samples: 25977856. Policy #0 lag: (min: 111.0, avg: 215.7, max: 367.0) [2024-06-15 12:10:30,958][1648982] Avg episode reward: [(0, '36.850')] [2024-06-15 12:10:33,292][1653645] Updated weights for policy 0, policy_version 50643 (0.0014) [2024-06-15 12:10:35,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 103809024. Throughput: 0: 11184.3. Samples: 26016256. Policy #0 lag: (min: 111.0, avg: 215.7, max: 367.0) [2024-06-15 12:10:35,959][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 12:10:36,235][1653645] Updated weights for policy 0, policy_version 50707 (0.0028) [2024-06-15 12:10:37,045][1653645] Updated weights for policy 0, policy_version 50743 (0.0011) [2024-06-15 12:10:39,257][1653645] Updated weights for policy 0, policy_version 50800 (0.0011) [2024-06-15 12:10:40,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 45328.9, 300 sec: 44764.4). Total num frames: 104136704. Throughput: 0: 11252.6. Samples: 26084352. 
Policy #0 lag: (min: 15.0, avg: 119.3, max: 271.0) [2024-06-15 12:10:40,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 12:10:41,372][1653645] Updated weights for policy 0, policy_version 50876 (0.0012) [2024-06-15 12:10:45,115][1651596] Signal inference workers to stop experience collection... (2650 times) [2024-06-15 12:10:45,174][1653645] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-15 12:10:45,378][1651596] Signal inference workers to resume experience collection... (2650 times) [2024-06-15 12:10:45,379][1653645] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-15 12:10:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44782.8, 300 sec: 44320.7). Total num frames: 104267776. Throughput: 0: 11002.3. Samples: 26150400. Policy #0 lag: (min: 15.0, avg: 119.3, max: 271.0) [2024-06-15 12:10:45,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:10:46,232][1653645] Updated weights for policy 0, policy_version 50928 (0.0012) [2024-06-15 12:10:48,507][1653645] Updated weights for policy 0, policy_version 50976 (0.0028) [2024-06-15 12:10:50,855][1653645] Updated weights for policy 0, policy_version 51024 (0.0014) [2024-06-15 12:10:50,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 104497152. Throughput: 0: 10968.2. Samples: 26179584. Policy #0 lag: (min: 15.0, avg: 119.3, max: 271.0) [2024-06-15 12:10:50,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:10:52,490][1653645] Updated weights for policy 0, policy_version 51088 (0.0015) [2024-06-15 12:10:55,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 43690.4, 300 sec: 44431.2). Total num frames: 104726528. Throughput: 0: 10922.6. Samples: 26242048. 
Policy #0 lag: (min: 15.0, avg: 119.3, max: 271.0) [2024-06-15 12:10:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:10:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000051136_104726528.pth... [2024-06-15 12:10:56,044][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000045952_94109696.pth [2024-06-15 12:10:56,049][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000051136_104726528.pth [2024-06-15 12:10:57,206][1653645] Updated weights for policy 0, policy_version 51139 (0.0012) [2024-06-15 12:10:58,408][1653645] Updated weights for policy 0, policy_version 51197 (0.0013) [2024-06-15 12:11:00,627][1653645] Updated weights for policy 0, policy_version 51248 (0.0014) [2024-06-15 12:11:00,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 104988672. Throughput: 0: 10956.8. Samples: 26314240. Policy #0 lag: (min: 15.0, avg: 119.3, max: 271.0) [2024-06-15 12:11:00,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:11:02,569][1653645] Updated weights for policy 0, policy_version 51300 (0.0013) [2024-06-15 12:11:03,520][1653645] Updated weights for policy 0, policy_version 51346 (0.0011) [2024-06-15 12:11:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 44236.6, 300 sec: 44875.5). Total num frames: 105250816. Throughput: 0: 11286.8. Samples: 26353664. Policy #0 lag: (min: 15.0, avg: 119.3, max: 271.0) [2024-06-15 12:11:05,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:11:08,997][1653645] Updated weights for policy 0, policy_version 51408 (0.0011) [2024-06-15 12:11:10,917][1653645] Updated weights for policy 0, policy_version 51472 (0.0074) [2024-06-15 12:11:10,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.9, 300 sec: 44542.2). Total num frames: 105414656. Throughput: 0: 11195.7. Samples: 26425344. 
Policy #0 lag: (min: 13.0, avg: 122.1, max: 269.0) [2024-06-15 12:11:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:11:14,509][1653645] Updated weights for policy 0, policy_version 51572 (0.0013) [2024-06-15 12:11:15,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 105742336. Throughput: 0: 11127.5. Samples: 26478592. Policy #0 lag: (min: 13.0, avg: 122.1, max: 269.0) [2024-06-15 12:11:15,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:11:20,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 105775104. Throughput: 0: 11059.2. Samples: 26513920. Policy #0 lag: (min: 13.0, avg: 122.1, max: 269.0) [2024-06-15 12:11:20,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:11:21,131][1653645] Updated weights for policy 0, policy_version 51650 (0.0012) [2024-06-15 12:11:22,583][1653645] Updated weights for policy 0, policy_version 51712 (0.0013) [2024-06-15 12:11:24,349][1653645] Updated weights for policy 0, policy_version 51774 (0.0014) [2024-06-15 12:11:25,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 44236.9, 300 sec: 44653.4). Total num frames: 106070016. Throughput: 0: 10991.0. Samples: 26578944. Policy #0 lag: (min: 13.0, avg: 122.1, max: 269.0) [2024-06-15 12:11:25,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:11:26,211][1653645] Updated weights for policy 0, policy_version 51814 (0.0013) [2024-06-15 12:11:27,169][1651596] Signal inference workers to stop experience collection... (2700 times) [2024-06-15 12:11:27,229][1653645] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-15 12:11:27,536][1651596] Signal inference workers to resume experience collection... 
(2700 times) [2024-06-15 12:11:27,537][1653645] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-15 12:11:27,791][1653645] Updated weights for policy 0, policy_version 51877 (0.0013) [2024-06-15 12:11:30,960][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 106299392. Throughput: 0: 10991.0. Samples: 26644992. Policy #0 lag: (min: 13.0, avg: 122.1, max: 269.0) [2024-06-15 12:11:30,962][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:11:34,840][1653645] Updated weights for policy 0, policy_version 51963 (0.0013) [2024-06-15 12:11:35,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44783.1, 300 sec: 44653.4). Total num frames: 106496000. Throughput: 0: 11207.1. Samples: 26683904. Policy #0 lag: (min: 13.0, avg: 122.1, max: 269.0) [2024-06-15 12:11:35,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:11:36,135][1653645] Updated weights for policy 0, policy_version 52021 (0.0012) [2024-06-15 12:11:37,796][1653645] Updated weights for policy 0, policy_version 52064 (0.0134) [2024-06-15 12:11:39,557][1653645] Updated weights for policy 0, policy_version 52144 (0.0132) [2024-06-15 12:11:40,960][1648982] Fps is (10 sec: 52428.9, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 106823680. Throughput: 0: 11082.0. Samples: 26740736. Policy #0 lag: (min: 13.0, avg: 122.1, max: 269.0) [2024-06-15 12:11:40,961][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:11:45,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43144.6, 300 sec: 44098.0). Total num frames: 106856448. Throughput: 0: 11184.3. Samples: 26817536. 
Policy #0 lag: (min: 15.0, avg: 83.5, max: 271.0) [2024-06-15 12:11:45,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:11:47,120][1653645] Updated weights for policy 0, policy_version 52231 (0.0014) [2024-06-15 12:11:48,413][1653645] Updated weights for policy 0, policy_version 52287 (0.0174) [2024-06-15 12:11:50,033][1653645] Updated weights for policy 0, policy_version 52356 (0.0015) [2024-06-15 12:11:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 107315200. Throughput: 0: 10865.8. Samples: 26842624. Policy #0 lag: (min: 15.0, avg: 83.5, max: 271.0) [2024-06-15 12:11:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:11:55,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.9, 300 sec: 44209.0). Total num frames: 107347968. Throughput: 0: 10763.4. Samples: 26909696. Policy #0 lag: (min: 15.0, avg: 83.5, max: 271.0) [2024-06-15 12:11:55,960][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:11:58,122][1653645] Updated weights for policy 0, policy_version 52432 (0.0012) [2024-06-15 12:12:00,427][1653645] Updated weights for policy 0, policy_version 52516 (0.0014) [2024-06-15 12:12:00,958][1648982] Fps is (10 sec: 29491.5, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 107610112. Throughput: 0: 11025.1. Samples: 26974720. Policy #0 lag: (min: 15.0, avg: 83.5, max: 271.0) [2024-06-15 12:12:00,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:12:01,443][1653645] Updated weights for policy 0, policy_version 52576 (0.0012) [2024-06-15 12:12:03,433][1653645] Updated weights for policy 0, policy_version 52657 (0.0151) [2024-06-15 12:12:05,958][1648982] Fps is (10 sec: 52426.7, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 107872256. Throughput: 0: 10808.8. Samples: 27000320. 
Policy #0 lag: (min: 15.0, avg: 83.5, max: 271.0) [2024-06-15 12:12:05,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:12:10,836][1653645] Updated weights for policy 0, policy_version 52691 (0.0081) [2024-06-15 12:12:10,958][1648982] Fps is (10 sec: 29490.5, 60 sec: 41506.1, 300 sec: 44097.9). Total num frames: 107905024. Throughput: 0: 11104.7. Samples: 27078656. Policy #0 lag: (min: 15.0, avg: 83.5, max: 271.0) [2024-06-15 12:12:10,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:12:12,609][1651596] Signal inference workers to stop experience collection... (2750 times) [2024-06-15 12:12:12,668][1653645] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-15 12:12:12,671][1653645] Updated weights for policy 0, policy_version 52785 (0.0014) [2024-06-15 12:12:12,926][1651596] Signal inference workers to resume experience collection... (2750 times) [2024-06-15 12:12:12,927][1653645] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-15 12:12:15,311][1653645] Updated weights for policy 0, policy_version 52896 (0.0015) [2024-06-15 12:12:15,958][1648982] Fps is (10 sec: 49154.3, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 108363776. Throughput: 0: 10763.4. Samples: 27129344. Policy #0 lag: (min: 15.0, avg: 83.5, max: 271.0) [2024-06-15 12:12:15,960][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:12:20,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 108396544. Throughput: 0: 10752.0. Samples: 27167744. Policy #0 lag: (min: 15.0, avg: 83.5, max: 271.0) [2024-06-15 12:12:20,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:12:22,383][1653645] Updated weights for policy 0, policy_version 52944 (0.0017) [2024-06-15 12:12:23,978][1653645] Updated weights for policy 0, policy_version 53024 (0.0073) [2024-06-15 12:12:25,959][1648982] Fps is (10 sec: 36044.6, 60 sec: 44236.8, 300 sec: 44431.2). 
Total num frames: 108724224. Throughput: 0: 11047.8. Samples: 27237888. Policy #0 lag: (min: 15.0, avg: 65.7, max: 271.0) [2024-06-15 12:12:25,960][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:12:26,218][1653645] Updated weights for policy 0, policy_version 53109 (0.0012) [2024-06-15 12:12:27,828][1653645] Updated weights for policy 0, policy_version 53168 (0.0011) [2024-06-15 12:12:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 108920832. Throughput: 0: 10729.2. Samples: 27300352. Policy #0 lag: (min: 15.0, avg: 65.7, max: 271.0) [2024-06-15 12:12:30,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:12:35,074][1653645] Updated weights for policy 0, policy_version 53221 (0.0012) [2024-06-15 12:12:35,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 109051904. Throughput: 0: 11059.2. Samples: 27340288. Policy #0 lag: (min: 15.0, avg: 65.7, max: 271.0) [2024-06-15 12:12:35,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:12:37,835][1653645] Updated weights for policy 0, policy_version 53328 (0.0013) [2024-06-15 12:12:39,009][1653645] Updated weights for policy 0, policy_version 53373 (0.0012) [2024-06-15 12:12:40,779][1653645] Updated weights for policy 0, policy_version 53411 (0.0012) [2024-06-15 12:12:40,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 109412352. Throughput: 0: 10683.7. Samples: 27390464. Policy #0 lag: (min: 15.0, avg: 65.7, max: 271.0) [2024-06-15 12:12:40,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:12:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 109445120. Throughput: 0: 10956.7. Samples: 27467776. 
Policy #0 lag: (min: 15.0, avg: 65.7, max: 271.0) [2024-06-15 12:12:45,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:12:46,967][1653645] Updated weights for policy 0, policy_version 53472 (0.0015) [2024-06-15 12:12:48,586][1653645] Updated weights for policy 0, policy_version 53536 (0.0013) [2024-06-15 12:12:50,525][1653645] Updated weights for policy 0, policy_version 53616 (0.0018) [2024-06-15 12:12:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 44431.2). Total num frames: 109838336. Throughput: 0: 10934.2. Samples: 27492352. Policy #0 lag: (min: 15.0, avg: 65.7, max: 271.0) [2024-06-15 12:12:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:12:51,746][1653645] Updated weights for policy 0, policy_version 53648 (0.0011) [2024-06-15 12:12:55,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 109969408. Throughput: 0: 10569.9. Samples: 27554304. Policy #0 lag: (min: 15.0, avg: 65.7, max: 271.0) [2024-06-15 12:12:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:12:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000053696_109969408.pth... [2024-06-15 12:12:56,001][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000048576_99483648.pth [2024-06-15 12:12:58,917][1651596] Signal inference workers to stop experience collection... (2800 times) [2024-06-15 12:12:58,968][1653645] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-15 12:12:59,187][1651596] Signal inference workers to resume experience collection... (2800 times) [2024-06-15 12:12:59,188][1653645] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-15 12:12:59,189][1653645] Updated weights for policy 0, policy_version 53712 (0.0012) [2024-06-15 12:13:00,958][1648982] Fps is (10 sec: 29490.5, 60 sec: 42052.0, 300 sec: 44097.9). Total num frames: 110133248. 
Throughput: 0: 10899.8. Samples: 27619840. Policy #0 lag: (min: 7.0, avg: 62.7, max: 263.0) [2024-06-15 12:13:00,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:13:01,274][1653645] Updated weights for policy 0, policy_version 53795 (0.0019) [2024-06-15 12:13:02,832][1653645] Updated weights for policy 0, policy_version 53872 (0.0118) [2024-06-15 12:13:04,691][1653645] Updated weights for policy 0, policy_version 53921 (0.0121) [2024-06-15 12:13:05,958][1648982] Fps is (10 sec: 52431.2, 60 sec: 43691.1, 300 sec: 44320.1). Total num frames: 110493696. Throughput: 0: 10615.5. Samples: 27645440. Policy #0 lag: (min: 7.0, avg: 62.7, max: 263.0) [2024-06-15 12:13:05,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:13:10,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.4, 300 sec: 43764.7). Total num frames: 110493696. Throughput: 0: 10706.4. Samples: 27719680. Policy #0 lag: (min: 7.0, avg: 62.7, max: 263.0) [2024-06-15 12:13:10,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:13:11,541][1653645] Updated weights for policy 0, policy_version 53969 (0.0013) [2024-06-15 12:13:14,035][1653645] Updated weights for policy 0, policy_version 54069 (0.0014) [2024-06-15 12:13:15,640][1653645] Updated weights for policy 0, policy_version 54139 (0.0013) [2024-06-15 12:13:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 43986.9). Total num frames: 110886912. Throughput: 0: 10501.7. Samples: 27772928. Policy #0 lag: (min: 7.0, avg: 62.7, max: 263.0) [2024-06-15 12:13:15,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:13:18,086][1653645] Updated weights for policy 0, policy_version 54200 (0.0013) [2024-06-15 12:13:20,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.4, 300 sec: 43986.9). Total num frames: 111017984. Throughput: 0: 10330.9. Samples: 27805184. 
Policy #0 lag: (min: 7.0, avg: 62.7, max: 263.0) [2024-06-15 12:13:20,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:13:24,792][1653645] Updated weights for policy 0, policy_version 54272 (0.0014) [2024-06-15 12:13:25,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 111214592. Throughput: 0: 10763.4. Samples: 27874816. Policy #0 lag: (min: 7.0, avg: 62.7, max: 263.0) [2024-06-15 12:13:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:13:27,152][1653645] Updated weights for policy 0, policy_version 54368 (0.0143) [2024-06-15 12:13:27,769][1653645] Updated weights for policy 0, policy_version 54398 (0.0013) [2024-06-15 12:13:30,474][1653645] Updated weights for policy 0, policy_version 54448 (0.0055) [2024-06-15 12:13:30,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 111542272. Throughput: 0: 10331.1. Samples: 27932672. Policy #0 lag: (min: 7.0, avg: 62.7, max: 263.0) [2024-06-15 12:13:30,961][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:13:35,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42052.3, 300 sec: 43653.6). Total num frames: 111575040. Throughput: 0: 10626.8. Samples: 27970560. Policy #0 lag: (min: 4.0, avg: 62.4, max: 260.0) [2024-06-15 12:13:35,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:13:36,060][1653645] Updated weights for policy 0, policy_version 54484 (0.0013) [2024-06-15 12:13:37,622][1653645] Updated weights for policy 0, policy_version 54560 (0.0015) [2024-06-15 12:13:38,464][1651596] Signal inference workers to stop experience collection... (2850 times) [2024-06-15 12:13:38,492][1653645] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-15 12:13:38,788][1651596] Signal inference workers to resume experience collection... 
(2850 times) [2024-06-15 12:13:38,790][1653645] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-15 12:13:38,940][1653645] Updated weights for policy 0, policy_version 54612 (0.0012) [2024-06-15 12:13:40,960][1648982] Fps is (10 sec: 39311.9, 60 sec: 42050.6, 300 sec: 43986.5). Total num frames: 111935488. Throughput: 0: 10671.9. Samples: 28034560. Policy #0 lag: (min: 4.0, avg: 62.4, max: 260.0) [2024-06-15 12:13:40,961][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:13:41,590][1653645] Updated weights for policy 0, policy_version 54658 (0.0015) [2024-06-15 12:13:45,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.8, 300 sec: 43764.7). Total num frames: 112066560. Throughput: 0: 10831.7. Samples: 28107264. Policy #0 lag: (min: 4.0, avg: 62.4, max: 260.0) [2024-06-15 12:13:45,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:13:48,160][1653645] Updated weights for policy 0, policy_version 54752 (0.0013) [2024-06-15 12:13:49,343][1653645] Updated weights for policy 0, policy_version 54807 (0.0011) [2024-06-15 12:13:50,958][1648982] Fps is (10 sec: 42608.9, 60 sec: 42052.3, 300 sec: 43653.7). Total num frames: 112361472. Throughput: 0: 11047.8. Samples: 28142592. Policy #0 lag: (min: 4.0, avg: 62.4, max: 260.0) [2024-06-15 12:13:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:13:51,013][1653645] Updated weights for policy 0, policy_version 54868 (0.0018) [2024-06-15 12:13:51,939][1653645] Updated weights for policy 0, policy_version 54912 (0.0012) [2024-06-15 12:13:54,529][1653645] Updated weights for policy 0, policy_version 54976 (0.0014) [2024-06-15 12:13:55,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 112590848. Throughput: 0: 10683.8. Samples: 28200448. 
Policy #0 lag: (min: 4.0, avg: 62.4, max: 260.0) [2024-06-15 12:13:55,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:14:00,719][1653645] Updated weights for policy 0, policy_version 55061 (0.0188) [2024-06-15 12:14:00,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44237.0, 300 sec: 43764.7). Total num frames: 112787456. Throughput: 0: 11104.7. Samples: 28272640. Policy #0 lag: (min: 4.0, avg: 62.4, max: 260.0) [2024-06-15 12:14:00,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 12:14:03,251][1653645] Updated weights for policy 0, policy_version 55160 (0.0016) [2024-06-15 12:14:05,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 41506.0, 300 sec: 43875.8). Total num frames: 112984064. Throughput: 0: 10820.4. Samples: 28292096. Policy #0 lag: (min: 4.0, avg: 62.4, max: 260.0) [2024-06-15 12:14:05,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:14:07,307][1653645] Updated weights for policy 0, policy_version 55216 (0.0130) [2024-06-15 12:14:10,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 113115136. Throughput: 0: 10786.1. Samples: 28360192. Policy #0 lag: (min: 4.0, avg: 62.4, max: 260.0) [2024-06-15 12:14:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:14:12,891][1653645] Updated weights for policy 0, policy_version 55294 (0.0012) [2024-06-15 12:14:15,052][1653645] Updated weights for policy 0, policy_version 55376 (0.0012) [2024-06-15 12:14:15,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 113475584. Throughput: 0: 10706.5. Samples: 28414464. 
Policy #0 lag: (min: 0.0, avg: 66.5, max: 256.0) [2024-06-15 12:14:15,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:14:19,199][1653645] Updated weights for policy 0, policy_version 55428 (0.0141) [2024-06-15 12:14:20,555][1653645] Updated weights for policy 0, policy_version 55487 (0.0013) [2024-06-15 12:14:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43691.0, 300 sec: 43875.8). Total num frames: 113639424. Throughput: 0: 10695.1. Samples: 28451840. Policy #0 lag: (min: 0.0, avg: 66.5, max: 256.0) [2024-06-15 12:14:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:14:25,527][1651596] Signal inference workers to stop experience collection... (2900 times) [2024-06-15 12:14:25,611][1653645] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-15 12:14:25,795][1651596] Signal inference workers to resume experience collection... (2900 times) [2024-06-15 12:14:25,796][1653645] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-15 12:14:25,951][1653645] Updated weights for policy 0, policy_version 55554 (0.0030) [2024-06-15 12:14:25,958][1648982] Fps is (10 sec: 29490.3, 60 sec: 42598.2, 300 sec: 43098.2). Total num frames: 113770496. Throughput: 0: 10843.5. Samples: 28522496. Policy #0 lag: (min: 0.0, avg: 66.5, max: 256.0) [2024-06-15 12:14:25,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:14:28,201][1653645] Updated weights for policy 0, policy_version 55649 (0.0136) [2024-06-15 12:14:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 43542.6). Total num frames: 114032640. Throughput: 0: 10399.3. Samples: 28575232. Policy #0 lag: (min: 0.0, avg: 66.5, max: 256.0) [2024-06-15 12:14:30,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 12:14:33,356][1653645] Updated weights for policy 0, policy_version 55738 (0.0013) [2024-06-15 12:14:35,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 43144.6, 300 sec: 43209.3). 
Total num frames: 114163712. Throughput: 0: 10274.1. Samples: 28604928. Policy #0 lag: (min: 0.0, avg: 66.5, max: 256.0) [2024-06-15 12:14:35,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:14:37,396][1653645] Updated weights for policy 0, policy_version 55777 (0.0012) [2024-06-15 12:14:39,275][1653645] Updated weights for policy 0, policy_version 55863 (0.0014) [2024-06-15 12:14:40,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 43146.2, 300 sec: 43875.8). Total num frames: 114524160. Throughput: 0: 10626.8. Samples: 28678656. Policy #0 lag: (min: 0.0, avg: 66.5, max: 256.0) [2024-06-15 12:14:40,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:14:41,099][1653645] Updated weights for policy 0, policy_version 55935 (0.0026) [2024-06-15 12:14:45,123][1653645] Updated weights for policy 0, policy_version 55991 (0.0015) [2024-06-15 12:14:45,974][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 114688000. Throughput: 0: 10456.2. Samples: 28743168. Policy #0 lag: (min: 0.0, avg: 66.5, max: 256.0) [2024-06-15 12:14:45,975][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 12:14:49,500][1653645] Updated weights for policy 0, policy_version 56055 (0.0021) [2024-06-15 12:14:50,992][1648982] Fps is (10 sec: 39186.3, 60 sec: 42573.8, 300 sec: 43426.4). Total num frames: 114917376. Throughput: 0: 10937.0. Samples: 28784640. Policy #0 lag: (min: 15.0, avg: 93.4, max: 271.0) [2024-06-15 12:14:50,993][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:14:52,260][1653645] Updated weights for policy 0, policy_version 56160 (0.0087) [2024-06-15 12:14:55,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 41506.1, 300 sec: 43431.5). Total num frames: 115081216. Throughput: 0: 10524.4. Samples: 28833792. 
Policy #0 lag: (min: 15.0, avg: 93.4, max: 271.0) [2024-06-15 12:14:55,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:14:55,986][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000056192_115081216.pth... [2024-06-15 12:14:56,077][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000051136_104726528.pth [2024-06-15 12:14:57,145][1653645] Updated weights for policy 0, policy_version 56224 (0.0011) [2024-06-15 12:15:00,958][1648982] Fps is (10 sec: 29594.0, 60 sec: 40413.9, 300 sec: 42765.0). Total num frames: 115212288. Throughput: 0: 10911.3. Samples: 28905472. Policy #0 lag: (min: 15.0, avg: 93.4, max: 271.0) [2024-06-15 12:15:00,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:15:01,876][1653645] Updated weights for policy 0, policy_version 56288 (0.0012) [2024-06-15 12:15:03,784][1653645] Updated weights for policy 0, policy_version 56368 (0.0013) [2024-06-15 12:15:05,515][1653645] Updated weights for policy 0, policy_version 56438 (0.0096) [2024-06-15 12:15:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 115605504. Throughput: 0: 10661.0. Samples: 28931584. Policy #0 lag: (min: 15.0, avg: 93.4, max: 271.0) [2024-06-15 12:15:05,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:15:09,133][1651596] Signal inference workers to stop experience collection... (2950 times) [2024-06-15 12:15:09,179][1653645] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-15 12:15:09,446][1651596] Signal inference workers to resume experience collection... (2950 times) [2024-06-15 12:15:09,454][1653645] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-15 12:15:09,959][1653645] Updated weights for policy 0, policy_version 56480 (0.0126) [2024-06-15 12:15:10,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 115736576. 
Throughput: 0: 10478.9. Samples: 28994048. Policy #0 lag: (min: 15.0, avg: 93.4, max: 271.0) [2024-06-15 12:15:10,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:15:14,164][1653645] Updated weights for policy 0, policy_version 56544 (0.0012) [2024-06-15 12:15:15,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 41506.2, 300 sec: 43431.5). Total num frames: 115965952. Throughput: 0: 10604.1. Samples: 29052416. Policy #0 lag: (min: 15.0, avg: 93.4, max: 271.0) [2024-06-15 12:15:15,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:15:16,273][1653645] Updated weights for policy 0, policy_version 56640 (0.0013) [2024-06-15 12:15:17,600][1653645] Updated weights for policy 0, policy_version 56697 (0.0013) [2024-06-15 12:15:20,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 41506.1, 300 sec: 43098.3). Total num frames: 116129792. Throughput: 0: 10638.2. Samples: 29083648. Policy #0 lag: (min: 15.0, avg: 93.4, max: 271.0) [2024-06-15 12:15:20,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:15:22,790][1653645] Updated weights for policy 0, policy_version 56761 (0.0011) [2024-06-15 12:15:25,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 41506.4, 300 sec: 42653.9). Total num frames: 116260864. Throughput: 0: 10626.9. Samples: 29156864. Policy #0 lag: (min: 15.0, avg: 93.4, max: 271.0) [2024-06-15 12:15:25,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:15:26,473][1653645] Updated weights for policy 0, policy_version 56800 (0.0013) [2024-06-15 12:15:28,280][1653645] Updated weights for policy 0, policy_version 56880 (0.0016) [2024-06-15 12:15:29,216][1653645] Updated weights for policy 0, policy_version 56917 (0.0056) [2024-06-15 12:15:30,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 43542.6). Total num frames: 116654080. Throughput: 0: 10615.4. Samples: 29220864. 
Policy #0 lag: (min: 15.0, avg: 77.5, max: 239.0) [2024-06-15 12:15:30,959][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:15:33,410][1653645] Updated weights for policy 0, policy_version 56976 (0.0012) [2024-06-15 12:15:34,500][1653645] Updated weights for policy 0, policy_version 57014 (0.0055) [2024-06-15 12:15:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 116785152. Throughput: 0: 10612.3. Samples: 29261824. Policy #0 lag: (min: 15.0, avg: 77.5, max: 239.0) [2024-06-15 12:15:35,958][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 12:15:37,684][1653645] Updated weights for policy 0, policy_version 57059 (0.0012) [2024-06-15 12:15:39,028][1653645] Updated weights for policy 0, policy_version 57120 (0.0014) [2024-06-15 12:15:40,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 117112832. Throughput: 0: 10956.8. Samples: 29326848. Policy #0 lag: (min: 15.0, avg: 77.5, max: 239.0) [2024-06-15 12:15:40,959][1648982] Avg episode reward: [(0, '37.000')] [2024-06-15 12:15:41,278][1653645] Updated weights for policy 0, policy_version 57209 (0.0015) [2024-06-15 12:15:45,696][1653645] Updated weights for policy 0, policy_version 57271 (0.0034) [2024-06-15 12:15:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 117309440. Throughput: 0: 10808.9. Samples: 29391872. Policy #0 lag: (min: 15.0, avg: 77.5, max: 239.0) [2024-06-15 12:15:45,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 12:15:49,422][1653645] Updated weights for policy 0, policy_version 57302 (0.0012) [2024-06-15 12:15:50,811][1651596] Signal inference workers to stop experience collection... (3000 times) [2024-06-15 12:15:50,916][1653645] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-15 12:15:50,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 42622.9, 300 sec: 43209.4). Total num frames: 117473280. 
Throughput: 0: 11025.0. Samples: 29427712. Policy #0 lag: (min: 15.0, avg: 77.5, max: 239.0) [2024-06-15 12:15:50,959][1648982] Avg episode reward: [(0, '36.850')] [2024-06-15 12:15:50,981][1651596] Signal inference workers to resume experience collection... (3000 times) [2024-06-15 12:15:50,982][1653645] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-15 12:15:50,985][1653645] Updated weights for policy 0, policy_version 57376 (0.0014) [2024-06-15 12:15:52,839][1653645] Updated weights for policy 0, policy_version 57461 (0.0011) [2024-06-15 12:15:55,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 117702656. Throughput: 0: 11059.2. Samples: 29491712. Policy #0 lag: (min: 15.0, avg: 77.5, max: 239.0) [2024-06-15 12:15:55,958][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 12:15:57,261][1653645] Updated weights for policy 0, policy_version 57520 (0.0013) [2024-06-15 12:16:00,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 43690.6, 300 sec: 42654.0). Total num frames: 117833728. Throughput: 0: 11366.4. Samples: 29563904. Policy #0 lag: (min: 15.0, avg: 77.5, max: 239.0) [2024-06-15 12:16:00,963][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 12:16:01,300][1653645] Updated weights for policy 0, policy_version 57554 (0.0016) [2024-06-15 12:16:02,910][1653645] Updated weights for policy 0, policy_version 57634 (0.0012) [2024-06-15 12:16:04,553][1653645] Updated weights for policy 0, policy_version 57697 (0.0093) [2024-06-15 12:16:05,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.5, 300 sec: 43431.5). Total num frames: 118226944. Throughput: 0: 11332.2. Samples: 29593600. 
Policy #0 lag: (min: 61.0, avg: 129.8, max: 316.0) [2024-06-15 12:16:05,958][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 12:16:08,261][1653645] Updated weights for policy 0, policy_version 57746 (0.0014) [2024-06-15 12:16:09,164][1653645] Updated weights for policy 0, policy_version 57792 (0.0011) [2024-06-15 12:16:10,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 118358016. Throughput: 0: 11218.4. Samples: 29661696. Policy #0 lag: (min: 61.0, avg: 129.8, max: 316.0) [2024-06-15 12:16:10,959][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 12:16:13,312][1653645] Updated weights for policy 0, policy_version 57842 (0.0033) [2024-06-15 12:16:15,467][1653645] Updated weights for policy 0, policy_version 57936 (0.0013) [2024-06-15 12:16:15,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 44782.9, 300 sec: 43653.7). Total num frames: 118652928. Throughput: 0: 11218.5. Samples: 29725696. Policy #0 lag: (min: 61.0, avg: 129.8, max: 316.0) [2024-06-15 12:16:15,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:16:16,826][1653645] Updated weights for policy 0, policy_version 57984 (0.0012) [2024-06-15 12:16:20,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 44782.9, 300 sec: 43209.3). Total num frames: 118816768. Throughput: 0: 11036.4. Samples: 29758464. Policy #0 lag: (min: 61.0, avg: 129.8, max: 316.0) [2024-06-15 12:16:20,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:16:21,339][1653645] Updated weights for policy 0, policy_version 58045 (0.0012) [2024-06-15 12:16:25,812][1653645] Updated weights for policy 0, policy_version 58098 (0.0014) [2024-06-15 12:16:25,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 45329.1, 300 sec: 42987.2). Total num frames: 118980608. Throughput: 0: 11173.0. Samples: 29829632. 
Policy #0 lag: (min: 61.0, avg: 129.8, max: 316.0) [2024-06-15 12:16:25,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:16:27,672][1653645] Updated weights for policy 0, policy_version 58176 (0.0012) [2024-06-15 12:16:30,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.8, 300 sec: 43320.4). Total num frames: 119275520. Throughput: 0: 11002.3. Samples: 29886976. Policy #0 lag: (min: 61.0, avg: 129.8, max: 316.0) [2024-06-15 12:16:30,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:16:32,107][1653645] Updated weights for policy 0, policy_version 58242 (0.0015) [2024-06-15 12:16:32,879][1651596] Signal inference workers to stop experience collection... (3050 times) [2024-06-15 12:16:32,931][1653645] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-15 12:16:33,061][1651596] Signal inference workers to resume experience collection... (3050 times) [2024-06-15 12:16:33,080][1653645] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-15 12:16:33,311][1653645] Updated weights for policy 0, policy_version 58304 (0.0013) [2024-06-15 12:16:35,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 42653.9). Total num frames: 119406592. Throughput: 0: 10945.4. Samples: 29920256. Policy #0 lag: (min: 61.0, avg: 129.8, max: 316.0) [2024-06-15 12:16:35,959][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 12:16:37,601][1653645] Updated weights for policy 0, policy_version 58357 (0.0012) [2024-06-15 12:16:38,859][1653645] Updated weights for policy 0, policy_version 58401 (0.0013) [2024-06-15 12:16:40,202][1653645] Updated weights for policy 0, policy_version 58464 (0.0015) [2024-06-15 12:16:40,872][1653645] Updated weights for policy 0, policy_version 58496 (0.0013) [2024-06-15 12:16:40,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 44783.1, 300 sec: 43875.8). Total num frames: 119799808. Throughput: 0: 11070.7. Samples: 29989888. 
Policy #0 lag: (min: 17.0, avg: 120.7, max: 289.0) [2024-06-15 12:16:40,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:16:45,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 119930880. Throughput: 0: 10922.7. Samples: 30055424. Policy #0 lag: (min: 17.0, avg: 120.7, max: 289.0) [2024-06-15 12:16:45,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:16:48,032][1653645] Updated weights for policy 0, policy_version 58576 (0.0015) [2024-06-15 12:16:49,067][1653645] Updated weights for policy 0, policy_version 58615 (0.0013) [2024-06-15 12:16:50,548][1653645] Updated weights for policy 0, policy_version 58672 (0.0013) [2024-06-15 12:16:50,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 45329.2, 300 sec: 43542.6). Total num frames: 120193024. Throughput: 0: 11104.8. Samples: 30093312. Policy #0 lag: (min: 17.0, avg: 120.7, max: 289.0) [2024-06-15 12:16:50,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:16:52,356][1653645] Updated weights for policy 0, policy_version 58743 (0.0011) [2024-06-15 12:16:54,926][1653645] Updated weights for policy 0, policy_version 58770 (0.0014) [2024-06-15 12:16:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.4, 300 sec: 43542.5). Total num frames: 120455168. Throughput: 0: 11150.3. Samples: 30163456. Policy #0 lag: (min: 17.0, avg: 120.7, max: 289.0) [2024-06-15 12:16:55,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:16:55,970][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000058816_120455168.pth... 
[2024-06-15 12:16:56,085][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000053696_109969408.pth [2024-06-15 12:16:59,067][1653645] Updated weights for policy 0, policy_version 58818 (0.0080) [2024-06-15 12:17:00,384][1653645] Updated weights for policy 0, policy_version 58872 (0.0023) [2024-06-15 12:17:00,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 45875.2, 300 sec: 43098.3). Total num frames: 120586240. Throughput: 0: 11150.2. Samples: 30227456. Policy #0 lag: (min: 17.0, avg: 120.7, max: 289.0) [2024-06-15 12:17:00,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:17:01,999][1653645] Updated weights for policy 0, policy_version 58932 (0.0049) [2024-06-15 12:17:03,669][1653645] Updated weights for policy 0, policy_version 59002 (0.0014) [2024-06-15 12:17:05,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 120848384. Throughput: 0: 11093.3. Samples: 30257664. Policy #0 lag: (min: 17.0, avg: 120.7, max: 289.0) [2024-06-15 12:17:05,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:17:07,027][1653645] Updated weights for policy 0, policy_version 59047 (0.0152) [2024-06-15 12:17:10,796][1653645] Updated weights for policy 0, policy_version 59088 (0.0014) [2024-06-15 12:17:10,960][1648982] Fps is (10 sec: 42597.8, 60 sec: 44237.0, 300 sec: 42876.1). Total num frames: 121012224. Throughput: 0: 11150.2. Samples: 30331392. Policy #0 lag: (min: 17.0, avg: 120.7, max: 289.0) [2024-06-15 12:17:10,960][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:17:12,056][1653645] Updated weights for policy 0, policy_version 59138 (0.0012) [2024-06-15 12:17:13,188][1653645] Updated weights for policy 0, policy_version 59192 (0.0028) [2024-06-15 12:17:14,449][1653645] Updated weights for policy 0, policy_version 59234 (0.0091) [2024-06-15 12:17:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 121372672. 
Throughput: 0: 11389.1. Samples: 30399488. Policy #0 lag: (min: 63.0, avg: 201.8, max: 344.0) [2024-06-15 12:17:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:17:17,103][1651596] Signal inference workers to stop experience collection... (3100 times) [2024-06-15 12:17:17,160][1653645] Updated weights for policy 0, policy_version 59269 (0.0012) [2024-06-15 12:17:17,175][1653645] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-15 12:17:17,294][1651596] Signal inference workers to resume experience collection... (3100 times) [2024-06-15 12:17:17,294][1653645] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-15 12:17:18,321][1653645] Updated weights for policy 0, policy_version 59328 (0.0013) [2024-06-15 12:17:20,958][1648982] Fps is (10 sec: 49151.0, 60 sec: 44782.7, 300 sec: 43320.4). Total num frames: 121503744. Throughput: 0: 11468.8. Samples: 30436352. Policy #0 lag: (min: 63.0, avg: 201.8, max: 344.0) [2024-06-15 12:17:20,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:17:22,566][1653645] Updated weights for policy 0, policy_version 59387 (0.0014) [2024-06-15 12:17:24,658][1653645] Updated weights for policy 0, policy_version 59456 (0.0016) [2024-06-15 12:17:25,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 46967.2, 300 sec: 43653.6). Total num frames: 121798656. Throughput: 0: 11343.6. Samples: 30500352. Policy #0 lag: (min: 63.0, avg: 201.8, max: 344.0) [2024-06-15 12:17:25,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:17:29,342][1653645] Updated weights for policy 0, policy_version 59536 (0.0114) [2024-06-15 12:17:30,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 122028032. Throughput: 0: 11366.4. Samples: 30566912. 
Policy #0 lag: (min: 63.0, avg: 201.8, max: 344.0) [2024-06-15 12:17:30,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:17:33,541][1653645] Updated weights for policy 0, policy_version 59588 (0.0013) [2024-06-15 12:17:35,346][1653645] Updated weights for policy 0, policy_version 59664 (0.0013) [2024-06-15 12:17:35,982][1648982] Fps is (10 sec: 42495.3, 60 sec: 46948.4, 300 sec: 43427.9). Total num frames: 122224640. Throughput: 0: 11371.6. Samples: 30605312. Policy #0 lag: (min: 63.0, avg: 201.8, max: 344.0) [2024-06-15 12:17:35,983][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:17:37,722][1653645] Updated weights for policy 0, policy_version 59715 (0.0013) [2024-06-15 12:17:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 122421248. Throughput: 0: 11025.1. Samples: 30659584. Policy #0 lag: (min: 63.0, avg: 201.8, max: 344.0) [2024-06-15 12:17:40,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:17:42,170][1653645] Updated weights for policy 0, policy_version 59780 (0.0013) [2024-06-15 12:17:45,843][1653645] Updated weights for policy 0, policy_version 59842 (0.0110) [2024-06-15 12:17:45,958][1648982] Fps is (10 sec: 32847.7, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 122552320. Throughput: 0: 11161.5. Samples: 30729728. Policy #0 lag: (min: 63.0, avg: 201.8, max: 344.0) [2024-06-15 12:17:45,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:17:47,159][1653645] Updated weights for policy 0, policy_version 59892 (0.0016) [2024-06-15 12:17:48,764][1653645] Updated weights for policy 0, policy_version 59958 (0.0018) [2024-06-15 12:17:49,898][1653645] Updated weights for policy 0, policy_version 59984 (0.0011) [2024-06-15 12:17:50,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 43875.9). Total num frames: 122912768. Throughput: 0: 11127.5. Samples: 30758400. 
Policy #0 lag: (min: 63.0, avg: 201.8, max: 344.0) [2024-06-15 12:17:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:17:51,297][1653645] Updated weights for policy 0, policy_version 60030 (0.0020) [2024-06-15 12:17:54,879][1653645] Updated weights for policy 0, policy_version 60087 (0.0012) [2024-06-15 12:17:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 123076608. Throughput: 0: 11036.4. Samples: 30828032. Policy #0 lag: (min: 2.0, avg: 114.9, max: 258.0) [2024-06-15 12:17:55,959][1648982] Avg episode reward: [(0, '36.770')] [2024-06-15 12:17:58,795][1653645] Updated weights for policy 0, policy_version 60153 (0.0012) [2024-06-15 12:18:00,373][1653645] Updated weights for policy 0, policy_version 60222 (0.0117) [2024-06-15 12:18:00,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45875.1, 300 sec: 43542.5). Total num frames: 123338752. Throughput: 0: 10945.5. Samples: 30892032. Policy #0 lag: (min: 2.0, avg: 114.9, max: 258.0) [2024-06-15 12:18:00,958][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 12:18:02,559][1653645] Updated weights for policy 0, policy_version 60259 (0.0013) [2024-06-15 12:18:05,235][1651596] Signal inference workers to stop experience collection... (3150 times) [2024-06-15 12:18:05,275][1653645] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-15 12:18:05,413][1651596] Signal inference workers to resume experience collection... (3150 times) [2024-06-15 12:18:05,414][1653645] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-15 12:18:05,600][1653645] Updated weights for policy 0, policy_version 60309 (0.0013) [2024-06-15 12:18:05,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 44783.0, 300 sec: 44209.1). Total num frames: 123535360. Throughput: 0: 10911.4. Samples: 30927360. 
Policy #0 lag: (min: 2.0, avg: 114.9, max: 258.0) [2024-06-15 12:18:05,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:18:06,542][1653645] Updated weights for policy 0, policy_version 60349 (0.0012) [2024-06-15 12:18:10,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 45329.0, 300 sec: 43542.5). Total num frames: 123731968. Throughput: 0: 11218.5. Samples: 31005184. Policy #0 lag: (min: 2.0, avg: 114.9, max: 258.0) [2024-06-15 12:18:10,960][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:18:11,006][1653645] Updated weights for policy 0, policy_version 60417 (0.0012) [2024-06-15 12:18:12,396][1653645] Updated weights for policy 0, policy_version 60478 (0.0012) [2024-06-15 12:18:13,998][1653645] Updated weights for policy 0, policy_version 60533 (0.0014) [2024-06-15 12:18:15,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 123994112. Throughput: 0: 11002.3. Samples: 31062016. Policy #0 lag: (min: 2.0, avg: 114.9, max: 258.0) [2024-06-15 12:18:15,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:18:17,580][1653645] Updated weights for policy 0, policy_version 60576 (0.0012) [2024-06-15 12:18:20,695][1653645] Updated weights for policy 0, policy_version 60624 (0.0024) [2024-06-15 12:18:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 124157952. Throughput: 0: 11042.4. Samples: 31101952. Policy #0 lag: (min: 2.0, avg: 114.9, max: 258.0) [2024-06-15 12:18:20,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:18:21,877][1653645] Updated weights for policy 0, policy_version 60668 (0.0011) [2024-06-15 12:18:23,337][1653645] Updated weights for policy 0, policy_version 60731 (0.0012) [2024-06-15 12:18:25,345][1653645] Updated weights for policy 0, policy_version 60772 (0.0013) [2024-06-15 12:18:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45329.2, 300 sec: 43986.9). Total num frames: 124518400. Throughput: 0: 11320.9. 
Samples: 31169024. Policy #0 lag: (min: 2.0, avg: 114.9, max: 258.0) [2024-06-15 12:18:25,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:18:29,463][1653645] Updated weights for policy 0, policy_version 60848 (0.0014) [2024-06-15 12:18:30,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 124649472. Throughput: 0: 11355.0. Samples: 31240704. Policy #0 lag: (min: 2.0, avg: 114.9, max: 258.0) [2024-06-15 12:18:30,960][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:18:33,256][1653645] Updated weights for policy 0, policy_version 60928 (0.0014) [2024-06-15 12:18:34,837][1653645] Updated weights for policy 0, policy_version 60980 (0.0012) [2024-06-15 12:18:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44801.2, 300 sec: 43987.2). Total num frames: 124911616. Throughput: 0: 11355.0. Samples: 31269376. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 12:18:35,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:18:37,199][1653645] Updated weights for policy 0, policy_version 61008 (0.0035) [2024-06-15 12:18:40,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 125042688. Throughput: 0: 11161.7. Samples: 31330304. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 12:18:40,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:18:41,151][1653645] Updated weights for policy 0, policy_version 61062 (0.0013) [2024-06-15 12:18:42,288][1653645] Updated weights for policy 0, policy_version 61110 (0.0105) [2024-06-15 12:18:44,048][1653645] Updated weights for policy 0, policy_version 61152 (0.0012) [2024-06-15 12:18:45,941][1653645] Updated weights for policy 0, policy_version 61217 (0.0012) [2024-06-15 12:18:45,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 46967.6, 300 sec: 44097.9). Total num frames: 125370368. Throughput: 0: 11377.8. Samples: 31404032. 
Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 12:18:45,958][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 12:18:48,227][1651596] Signal inference workers to stop experience collection... (3200 times) [2024-06-15 12:18:48,275][1653645] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-15 12:18:48,583][1651596] Signal inference workers to resume experience collection... (3200 times) [2024-06-15 12:18:48,585][1653645] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-15 12:18:49,578][1653645] Updated weights for policy 0, policy_version 61304 (0.0015) [2024-06-15 12:18:50,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 125566976. Throughput: 0: 11332.2. Samples: 31437312. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 12:18:50,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:18:54,119][1653645] Updated weights for policy 0, policy_version 61368 (0.0020) [2024-06-15 12:18:55,958][1648982] Fps is (10 sec: 32767.1, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 125698048. Throughput: 0: 11025.0. Samples: 31501312. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 12:18:55,959][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 12:18:56,691][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000061408_125763584.pth... [2024-06-15 12:18:56,861][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000056192_115081216.pth [2024-06-15 12:18:57,466][1653645] Updated weights for policy 0, policy_version 61440 (0.0013) [2024-06-15 12:19:00,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 125992960. Throughput: 0: 11036.5. Samples: 31558656. 
Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 12:19:00,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:19:01,073][1653645] Updated weights for policy 0, policy_version 61521 (0.0022) [2024-06-15 12:19:01,948][1653645] Updated weights for policy 0, policy_version 61567 (0.0015) [2024-06-15 12:19:05,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 126156800. Throughput: 0: 10956.8. Samples: 31595008. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 12:19:05,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:19:06,320][1653645] Updated weights for policy 0, policy_version 61627 (0.0013) [2024-06-15 12:19:09,289][1653645] Updated weights for policy 0, policy_version 61696 (0.0015) [2024-06-15 12:19:10,916][1653645] Updated weights for policy 0, policy_version 61754 (0.0012) [2024-06-15 12:19:10,962][1648982] Fps is (10 sec: 45854.4, 60 sec: 45325.7, 300 sec: 43986.2). Total num frames: 126451712. Throughput: 0: 11001.2. Samples: 31664128. Policy #0 lag: (min: 103.0, avg: 178.6, max: 311.0) [2024-06-15 12:19:10,963][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:19:13,235][1653645] Updated weights for policy 0, policy_version 61795 (0.0012) [2024-06-15 12:19:15,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 126615552. Throughput: 0: 10911.3. Samples: 31731712. Policy #0 lag: (min: 103.0, avg: 178.6, max: 311.0) [2024-06-15 12:19:15,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 12:19:17,654][1653645] Updated weights for policy 0, policy_version 61877 (0.0026) [2024-06-15 12:19:20,953][1653645] Updated weights for policy 0, policy_version 61936 (0.0145) [2024-06-15 12:19:20,958][1648982] Fps is (10 sec: 39338.5, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 126844928. Throughput: 0: 11059.2. Samples: 31767040. 
Policy #0 lag: (min: 103.0, avg: 178.6, max: 311.0) [2024-06-15 12:19:20,958][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 12:19:22,337][1653645] Updated weights for policy 0, policy_version 61988 (0.0013) [2024-06-15 12:19:25,308][1653645] Updated weights for policy 0, policy_version 62032 (0.0021) [2024-06-15 12:19:25,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 44209.0). Total num frames: 127074304. Throughput: 0: 11025.1. Samples: 31826432. Policy #0 lag: (min: 103.0, avg: 178.6, max: 311.0) [2024-06-15 12:19:25,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:19:29,365][1653645] Updated weights for policy 0, policy_version 62113 (0.0014) [2024-06-15 12:19:30,035][1653645] Updated weights for policy 0, policy_version 62144 (0.0013) [2024-06-15 12:19:30,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 127270912. Throughput: 0: 11002.3. Samples: 31899136. Policy #0 lag: (min: 103.0, avg: 178.6, max: 311.0) [2024-06-15 12:19:30,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:19:33,111][1653645] Updated weights for policy 0, policy_version 62208 (0.0013) [2024-06-15 12:19:33,677][1651596] Signal inference workers to stop experience collection... (3250 times) [2024-06-15 12:19:33,768][1653645] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-15 12:19:34,068][1651596] Signal inference workers to resume experience collection... (3250 times) [2024-06-15 12:19:34,070][1653645] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-15 12:19:35,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 127533056. Throughput: 0: 10888.5. Samples: 31927296. 
Policy #0 lag: (min: 103.0, avg: 178.6, max: 311.0) [2024-06-15 12:19:35,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:19:37,228][1653645] Updated weights for policy 0, policy_version 62274 (0.0025) [2024-06-15 12:19:38,463][1653645] Updated weights for policy 0, policy_version 62325 (0.0011) [2024-06-15 12:19:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 127664128. Throughput: 0: 10900.0. Samples: 31991808. Policy #0 lag: (min: 103.0, avg: 178.6, max: 311.0) [2024-06-15 12:19:40,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 12:19:41,489][1653645] Updated weights for policy 0, policy_version 62356 (0.0012) [2024-06-15 12:19:42,363][1653645] Updated weights for policy 0, policy_version 62400 (0.0013) [2024-06-15 12:19:45,488][1653645] Updated weights for policy 0, policy_version 62480 (0.0012) [2024-06-15 12:19:45,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 44325.3). Total num frames: 127991808. Throughput: 0: 11104.7. Samples: 32058368. Policy #0 lag: (min: 103.0, avg: 178.6, max: 311.0) [2024-06-15 12:19:45,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:19:48,942][1653645] Updated weights for policy 0, policy_version 62530 (0.0015) [2024-06-15 12:19:50,084][1653645] Updated weights for policy 0, policy_version 62585 (0.0017) [2024-06-15 12:19:50,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 128188416. Throughput: 0: 10979.6. Samples: 32089088. Policy #0 lag: (min: 12.0, avg: 137.9, max: 268.0) [2024-06-15 12:19:50,958][1648982] Avg episode reward: [(0, '37.060')] [2024-06-15 12:19:53,528][1653645] Updated weights for policy 0, policy_version 62629 (0.0012) [2024-06-15 12:19:55,546][1653645] Updated weights for policy 0, policy_version 62672 (0.0014) [2024-06-15 12:19:55,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 44237.0, 300 sec: 44542.3). Total num frames: 128352256. Throughput: 0: 11071.7. 
Samples: 32162304. Policy #0 lag: (min: 12.0, avg: 137.9, max: 268.0) [2024-06-15 12:19:55,958][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 12:19:58,032][1653645] Updated weights for policy 0, policy_version 62768 (0.0141) [2024-06-15 12:20:00,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 44236.6, 300 sec: 44209.0). Total num frames: 128647168. Throughput: 0: 10911.3. Samples: 32222720. Policy #0 lag: (min: 12.0, avg: 137.9, max: 268.0) [2024-06-15 12:20:00,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:20:01,025][1653645] Updated weights for policy 0, policy_version 62832 (0.0034) [2024-06-15 12:20:05,202][1653645] Updated weights for policy 0, policy_version 62880 (0.0013) [2024-06-15 12:20:05,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 128843776. Throughput: 0: 10979.6. Samples: 32261120. Policy #0 lag: (min: 12.0, avg: 137.9, max: 268.0) [2024-06-15 12:20:05,958][1648982] Avg episode reward: [(0, '36.920')] [2024-06-15 12:20:07,262][1653645] Updated weights for policy 0, policy_version 62913 (0.0013) [2024-06-15 12:20:08,596][1653645] Updated weights for policy 0, policy_version 62976 (0.0099) [2024-06-15 12:20:10,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44240.0, 300 sec: 44542.2). Total num frames: 129105920. Throughput: 0: 11161.5. Samples: 32328704. Policy #0 lag: (min: 12.0, avg: 137.9, max: 268.0) [2024-06-15 12:20:10,959][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 12:20:11,464][1653645] Updated weights for policy 0, policy_version 63056 (0.0021) [2024-06-15 12:20:15,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 129236992. Throughput: 0: 11218.4. Samples: 32403968. 
Policy #0 lag: (min: 12.0, avg: 137.9, max: 268.0) [2024-06-15 12:20:15,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:20:15,994][1653645] Updated weights for policy 0, policy_version 63120 (0.0095) [2024-06-15 12:20:17,104][1653645] Updated weights for policy 0, policy_version 63159 (0.0014) [2024-06-15 12:20:19,557][1651596] Signal inference workers to stop experience collection... (3300 times) [2024-06-15 12:20:19,617][1653645] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-15 12:20:19,806][1651596] Signal inference workers to resume experience collection... (3300 times) [2024-06-15 12:20:19,807][1653645] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-15 12:20:19,809][1653645] Updated weights for policy 0, policy_version 63232 (0.0014) [2024-06-15 12:20:20,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 129564672. Throughput: 0: 11366.4. Samples: 32438784. Policy #0 lag: (min: 12.0, avg: 137.9, max: 268.0) [2024-06-15 12:20:20,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:20:21,347][1653645] Updated weights for policy 0, policy_version 63294 (0.0013) [2024-06-15 12:20:23,666][1653645] Updated weights for policy 0, policy_version 63359 (0.0022) [2024-06-15 12:20:25,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 44782.6, 300 sec: 44431.1). Total num frames: 129761280. Throughput: 0: 11263.9. Samples: 32498688. Policy #0 lag: (min: 12.0, avg: 137.9, max: 268.0) [2024-06-15 12:20:25,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:20:28,534][1653645] Updated weights for policy 0, policy_version 63413 (0.0013) [2024-06-15 12:20:30,900][1653645] Updated weights for policy 0, policy_version 63472 (0.0013) [2024-06-15 12:20:30,990][1648982] Fps is (10 sec: 42598.2, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 129990656. Throughput: 0: 11468.8. Samples: 32574464. 
Policy #0 lag: (min: 46.0, avg: 129.7, max: 302.0) [2024-06-15 12:20:30,990][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:20:32,816][1653645] Updated weights for policy 0, policy_version 63548 (0.0013) [2024-06-15 12:20:35,585][1653645] Updated weights for policy 0, policy_version 63610 (0.0013) [2024-06-15 12:20:35,958][1648982] Fps is (10 sec: 52430.9, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 130285568. Throughput: 0: 11275.4. Samples: 32596480. Policy #0 lag: (min: 46.0, avg: 129.7, max: 302.0) [2024-06-15 12:20:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:20:40,335][1653645] Updated weights for policy 0, policy_version 63664 (0.0013) [2024-06-15 12:20:40,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 130416640. Throughput: 0: 11355.0. Samples: 32673280. Policy #0 lag: (min: 46.0, avg: 129.7, max: 302.0) [2024-06-15 12:20:40,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:20:43,548][1653645] Updated weights for policy 0, policy_version 63760 (0.0015) [2024-06-15 12:20:45,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 44782.6, 300 sec: 44764.4). Total num frames: 130678784. Throughput: 0: 11377.7. Samples: 32734720. Policy #0 lag: (min: 46.0, avg: 129.7, max: 302.0) [2024-06-15 12:20:45,959][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:20:46,833][1653645] Updated weights for policy 0, policy_version 63824 (0.0012) [2024-06-15 12:20:50,282][1653645] Updated weights for policy 0, policy_version 63874 (0.0016) [2024-06-15 12:20:50,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44782.8, 300 sec: 44653.4). Total num frames: 130875392. Throughput: 0: 11355.0. Samples: 32772096. 
Policy #0 lag: (min: 46.0, avg: 129.7, max: 302.0) [2024-06-15 12:20:50,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:20:51,668][1653645] Updated weights for policy 0, policy_version 63936 (0.0022) [2024-06-15 12:20:54,241][1653645] Updated weights for policy 0, policy_version 64016 (0.0013) [2024-06-15 12:20:55,958][1648982] Fps is (10 sec: 52431.1, 60 sec: 47513.7, 300 sec: 45319.8). Total num frames: 131203072. Throughput: 0: 11332.3. Samples: 32838656. Policy #0 lag: (min: 46.0, avg: 129.7, max: 302.0) [2024-06-15 12:20:55,960][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:20:55,968][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000064064_131203072.pth... [2024-06-15 12:20:56,025][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000058816_120455168.pth [2024-06-15 12:20:58,535][1653645] Updated weights for policy 0, policy_version 64096 (0.0013) [2024-06-15 12:21:00,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 131334144. Throughput: 0: 11241.3. Samples: 32909824. Policy #0 lag: (min: 46.0, avg: 129.7, max: 302.0) [2024-06-15 12:21:00,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:21:02,784][1653645] Updated weights for policy 0, policy_version 64166 (0.0013) [2024-06-15 12:21:04,977][1651596] Signal inference workers to stop experience collection... (3350 times) [2024-06-15 12:21:05,017][1653645] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-15 12:21:05,246][1651596] Signal inference workers to resume experience collection... (3350 times) [2024-06-15 12:21:05,247][1653645] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-15 12:21:05,249][1653645] Updated weights for policy 0, policy_version 64224 (0.0014) [2024-06-15 12:21:05,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 45329.0, 300 sec: 44764.5). Total num frames: 131563520. 
Throughput: 0: 11207.1. Samples: 32943104. Policy #0 lag: (min: 46.0, avg: 129.7, max: 302.0) [2024-06-15 12:21:05,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:21:07,297][1653645] Updated weights for policy 0, policy_version 64319 (0.0126) [2024-06-15 12:21:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 131760128. Throughput: 0: 11241.3. Samples: 33004544. Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:10,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:21:11,565][1653645] Updated weights for policy 0, policy_version 64378 (0.0013) [2024-06-15 12:21:14,430][1653645] Updated weights for policy 0, policy_version 64423 (0.0045) [2024-06-15 12:21:15,959][1648982] Fps is (10 sec: 42593.7, 60 sec: 45874.5, 300 sec: 44653.2). Total num frames: 131989504. Throughput: 0: 11127.2. Samples: 33075200. Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:15,959][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:21:17,661][1653645] Updated weights for policy 0, policy_version 64496 (0.0013) [2024-06-15 12:21:20,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 132251648. Throughput: 0: 11252.6. Samples: 33102848. Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:20,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:21:22,017][1653645] Updated weights for policy 0, policy_version 64579 (0.0034) [2024-06-15 12:21:25,370][1653645] Updated weights for policy 0, policy_version 64647 (0.0013) [2024-06-15 12:21:25,979][1648982] Fps is (10 sec: 45784.7, 60 sec: 44767.7, 300 sec: 44650.2). Total num frames: 132448256. Throughput: 0: 11213.3. Samples: 33178112. 
Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:25,979][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:21:26,335][1653645] Updated weights for policy 0, policy_version 64697 (0.0012) [2024-06-15 12:21:29,250][1653645] Updated weights for policy 0, policy_version 64768 (0.0013) [2024-06-15 12:21:30,512][1653645] Updated weights for policy 0, policy_version 64827 (0.0014) [2024-06-15 12:21:30,966][1648982] Fps is (10 sec: 52390.6, 60 sec: 46415.7, 300 sec: 45318.7). Total num frames: 132775936. Throughput: 0: 11296.4. Samples: 33243136. Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:30,970][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 12:21:33,682][1653645] Updated weights for policy 0, policy_version 64867 (0.0013) [2024-06-15 12:21:35,958][1648982] Fps is (10 sec: 45970.7, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 132907008. Throughput: 0: 11366.4. Samples: 33283584. Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:35,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:21:36,721][1653645] Updated weights for policy 0, policy_version 64917 (0.0011) [2024-06-15 12:21:40,259][1653645] Updated weights for policy 0, policy_version 64992 (0.0018) [2024-06-15 12:21:40,958][1648982] Fps is (10 sec: 36069.8, 60 sec: 45328.8, 300 sec: 44764.4). Total num frames: 133136384. Throughput: 0: 11514.2. Samples: 33356800. Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:40,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:21:41,927][1653645] Updated weights for policy 0, policy_version 65057 (0.0013) [2024-06-15 12:21:45,036][1653645] Updated weights for policy 0, policy_version 65121 (0.0013) [2024-06-15 12:21:45,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 45875.5, 300 sec: 44875.5). Total num frames: 133431296. Throughput: 0: 11298.1. Samples: 33418240. 
Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:45,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:21:48,169][1653645] Updated weights for policy 0, policy_version 65168 (0.0114) [2024-06-15 12:21:48,492][1651596] Signal inference workers to stop experience collection... (3400 times) [2024-06-15 12:21:48,543][1653645] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-15 12:21:48,780][1651596] Signal inference workers to resume experience collection... (3400 times) [2024-06-15 12:21:48,781][1653645] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-15 12:21:50,959][1648982] Fps is (10 sec: 42598.6, 60 sec: 44782.8, 300 sec: 44431.1). Total num frames: 133562368. Throughput: 0: 11434.6. Samples: 33457664. Policy #0 lag: (min: 15.0, avg: 127.8, max: 271.0) [2024-06-15 12:21:50,960][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:21:51,184][1653645] Updated weights for policy 0, policy_version 65233 (0.0032) [2024-06-15 12:21:52,903][1653645] Updated weights for policy 0, policy_version 65318 (0.0112) [2024-06-15 12:21:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 133857280. Throughput: 0: 11480.2. Samples: 33521152. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:21:55,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 12:21:56,702][1653645] Updated weights for policy 0, policy_version 65392 (0.0020) [2024-06-15 12:22:00,722][1653645] Updated weights for policy 0, policy_version 65440 (0.0014) [2024-06-15 12:22:00,965][1648982] Fps is (10 sec: 45876.5, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 134021120. Throughput: 0: 11434.9. Samples: 33589760. 
Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:22:00,975][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 12:22:02,116][1653645] Updated weights for policy 0, policy_version 65474 (0.0012) [2024-06-15 12:22:04,172][1653645] Updated weights for policy 0, policy_version 65574 (0.0013) [2024-06-15 12:22:05,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 46421.4, 300 sec: 45208.7). Total num frames: 134348800. Throughput: 0: 11571.2. Samples: 33623552. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:22:05,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:22:07,595][1653645] Updated weights for policy 0, policy_version 65622 (0.0013) [2024-06-15 12:22:08,609][1653645] Updated weights for policy 0, policy_version 65661 (0.0013) [2024-06-15 12:22:10,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 134479872. Throughput: 0: 11451.4. Samples: 33693184. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:22:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:22:13,051][1653645] Updated weights for policy 0, policy_version 65722 (0.0014) [2024-06-15 12:22:14,679][1653645] Updated weights for policy 0, policy_version 65792 (0.0093) [2024-06-15 12:22:15,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 47514.4, 300 sec: 45208.8). Total num frames: 134840320. Throughput: 0: 11356.8. Samples: 33754112. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:22:15,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:22:16,094][1653645] Updated weights for policy 0, policy_version 65848 (0.0014) [2024-06-15 12:22:20,023][1653645] Updated weights for policy 0, policy_version 65890 (0.0089) [2024-06-15 12:22:20,959][1648982] Fps is (10 sec: 52426.6, 60 sec: 45874.9, 300 sec: 44764.4). Total num frames: 135004160. Throughput: 0: 11275.3. Samples: 33790976. 
Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:22:20,960][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:22:24,297][1653645] Updated weights for policy 0, policy_version 65921 (0.0012) [2024-06-15 12:22:25,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 44798.5, 300 sec: 44431.2). Total num frames: 135135232. Throughput: 0: 11309.6. Samples: 33865728. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:22:25,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:22:26,372][1653645] Updated weights for policy 0, policy_version 66016 (0.0013) [2024-06-15 12:22:28,485][1653645] Updated weights for policy 0, policy_version 66101 (0.0012) [2024-06-15 12:22:30,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 43695.9, 300 sec: 44657.0). Total num frames: 135397376. Throughput: 0: 11172.9. Samples: 33921024. Policy #0 lag: (min: 15.0, avg: 114.6, max: 271.0) [2024-06-15 12:22:30,958][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 12:22:31,415][1651596] Signal inference workers to stop experience collection... (3450 times) [2024-06-15 12:22:31,472][1653645] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-15 12:22:31,483][1653645] Updated weights for policy 0, policy_version 66132 (0.0013) [2024-06-15 12:22:31,810][1651596] Signal inference workers to resume experience collection... (3450 times) [2024-06-15 12:22:31,826][1653645] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-15 12:22:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 135528448. Throughput: 0: 10956.9. Samples: 33950720. 
Policy #0 lag: (min: 24.0, avg: 153.4, max: 280.0) [2024-06-15 12:22:35,960][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:22:36,709][1653645] Updated weights for policy 0, policy_version 66195 (0.0014) [2024-06-15 12:22:38,510][1653645] Updated weights for policy 0, policy_version 66272 (0.0013) [2024-06-15 12:22:40,702][1653645] Updated weights for policy 0, policy_version 66354 (0.0013) [2024-06-15 12:22:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45875.4, 300 sec: 45208.8). Total num frames: 135888896. Throughput: 0: 11025.1. Samples: 34017280. Policy #0 lag: (min: 24.0, avg: 153.4, max: 280.0) [2024-06-15 12:22:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:22:43,361][1653645] Updated weights for policy 0, policy_version 66384 (0.0012) [2024-06-15 12:22:45,958][1648982] Fps is (10 sec: 52426.1, 60 sec: 43690.2, 300 sec: 44542.2). Total num frames: 136052736. Throughput: 0: 10922.5. Samples: 34081280. Policy #0 lag: (min: 24.0, avg: 153.4, max: 280.0) [2024-06-15 12:22:45,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:22:48,921][1653645] Updated weights for policy 0, policy_version 66464 (0.0013) [2024-06-15 12:22:50,442][1653645] Updated weights for policy 0, policy_version 66514 (0.0012) [2024-06-15 12:22:50,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44783.2, 300 sec: 44653.4). Total num frames: 136249344. Throughput: 0: 11116.1. Samples: 34123776. Policy #0 lag: (min: 24.0, avg: 153.4, max: 280.0) [2024-06-15 12:22:50,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 12:22:52,065][1653645] Updated weights for policy 0, policy_version 66577 (0.0019) [2024-06-15 12:22:55,955][1653645] Updated weights for policy 0, policy_version 66656 (0.0012) [2024-06-15 12:22:55,958][1648982] Fps is (10 sec: 42599.8, 60 sec: 43690.5, 300 sec: 44542.2). Total num frames: 136478720. Throughput: 0: 10786.1. Samples: 34178560. 
Policy #0 lag: (min: 24.0, avg: 153.4, max: 280.0) [2024-06-15 12:22:55,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:22:56,296][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000066672_136544256.pth... [2024-06-15 12:22:56,373][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000061408_125763584.pth [2024-06-15 12:23:00,120][1653645] Updated weights for policy 0, policy_version 66691 (0.0013) [2024-06-15 12:23:00,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 136642560. Throughput: 0: 11173.0. Samples: 34256896. Policy #0 lag: (min: 24.0, avg: 153.4, max: 280.0) [2024-06-15 12:23:00,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:23:01,564][1653645] Updated weights for policy 0, policy_version 66752 (0.0016) [2024-06-15 12:23:03,801][1653645] Updated weights for policy 0, policy_version 66848 (0.0017) [2024-06-15 12:23:05,958][1648982] Fps is (10 sec: 49153.3, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 136970240. Throughput: 0: 10831.8. Samples: 34278400. Policy #0 lag: (min: 24.0, avg: 153.4, max: 280.0) [2024-06-15 12:23:05,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:23:08,203][1653645] Updated weights for policy 0, policy_version 66896 (0.0029) [2024-06-15 12:23:10,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 137101312. Throughput: 0: 10524.4. Samples: 34339328. Policy #0 lag: (min: 24.0, avg: 153.4, max: 280.0) [2024-06-15 12:23:10,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:23:13,268][1653645] Updated weights for policy 0, policy_version 66960 (0.0013) [2024-06-15 12:23:15,267][1651596] Signal inference workers to stop experience collection... 
(3500 times) [2024-06-15 12:23:15,307][1653645] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-15 12:23:15,324][1653645] Updated weights for policy 0, policy_version 67044 (0.0014) [2024-06-15 12:23:15,469][1651596] Signal inference workers to resume experience collection... (3500 times) [2024-06-15 12:23:15,470][1653645] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-15 12:23:15,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 44764.4). Total num frames: 137363456. Throughput: 0: 10820.3. Samples: 34407936. Policy #0 lag: (min: 7.0, avg: 73.7, max: 263.0) [2024-06-15 12:23:15,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:23:16,376][1653645] Updated weights for policy 0, policy_version 67091 (0.0023) [2024-06-15 12:23:19,038][1653645] Updated weights for policy 0, policy_version 67140 (0.0014) [2024-06-15 12:23:20,179][1653645] Updated weights for policy 0, policy_version 67196 (0.0015) [2024-06-15 12:23:20,964][1648982] Fps is (10 sec: 52402.6, 60 sec: 43687.3, 300 sec: 44430.4). Total num frames: 137625600. Throughput: 0: 10944.2. Samples: 34443264. Policy #0 lag: (min: 7.0, avg: 73.7, max: 263.0) [2024-06-15 12:23:20,965][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:23:25,759][1653645] Updated weights for policy 0, policy_version 67280 (0.0013) [2024-06-15 12:23:25,959][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 137789440. Throughput: 0: 11150.2. Samples: 34519040. Policy #0 lag: (min: 7.0, avg: 73.7, max: 263.0) [2024-06-15 12:23:25,960][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:23:27,744][1653645] Updated weights for policy 0, policy_version 67365 (0.0014) [2024-06-15 12:23:30,958][1648982] Fps is (10 sec: 39341.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 138018816. Throughput: 0: 11093.5. Samples: 34580480. 
Policy #0 lag: (min: 7.0, avg: 73.7, max: 263.0) [2024-06-15 12:23:30,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:23:31,793][1653645] Updated weights for policy 0, policy_version 67424 (0.0015) [2024-06-15 12:23:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 138182656. Throughput: 0: 10808.9. Samples: 34610176. Policy #0 lag: (min: 7.0, avg: 73.7, max: 263.0) [2024-06-15 12:23:35,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:23:36,304][1653645] Updated weights for policy 0, policy_version 67504 (0.0083) [2024-06-15 12:23:38,507][1653645] Updated weights for policy 0, policy_version 67584 (0.0136) [2024-06-15 12:23:39,929][1653645] Updated weights for policy 0, policy_version 67641 (0.0012) [2024-06-15 12:23:40,963][1648982] Fps is (10 sec: 52405.1, 60 sec: 44233.5, 300 sec: 44652.7). Total num frames: 138543104. Throughput: 0: 11137.8. Samples: 34679808. Policy #0 lag: (min: 7.0, avg: 73.7, max: 263.0) [2024-06-15 12:23:40,965][1648982] Avg episode reward: [(0, '36.770')] [2024-06-15 12:23:44,273][1653645] Updated weights for policy 0, policy_version 67707 (0.0095) [2024-06-15 12:23:45,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43691.1, 300 sec: 44431.2). Total num frames: 138674176. Throughput: 0: 10888.6. Samples: 34746880. Policy #0 lag: (min: 7.0, avg: 73.7, max: 263.0) [2024-06-15 12:23:45,958][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 12:23:48,467][1653645] Updated weights for policy 0, policy_version 67776 (0.0012) [2024-06-15 12:23:50,154][1653645] Updated weights for policy 0, policy_version 67842 (0.0011) [2024-06-15 12:23:50,958][1648982] Fps is (10 sec: 45896.0, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 139001856. Throughput: 0: 11298.1. Samples: 34786816. 
Policy #0 lag: (min: 7.0, avg: 73.7, max: 263.0) [2024-06-15 12:23:50,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:23:54,334][1653645] Updated weights for policy 0, policy_version 67906 (0.0013) [2024-06-15 12:23:55,409][1653645] Updated weights for policy 0, policy_version 67962 (0.0021) [2024-06-15 12:23:55,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 139198464. Throughput: 0: 11366.3. Samples: 34850816. Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:23:55,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:23:59,283][1651596] Signal inference workers to stop experience collection... (3550 times) [2024-06-15 12:23:59,349][1653645] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-15 12:23:59,490][1651596] Signal inference workers to resume experience collection... (3550 times) [2024-06-15 12:23:59,491][1653645] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-15 12:24:00,096][1653645] Updated weights for policy 0, policy_version 68027 (0.0065) [2024-06-15 12:24:00,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 45329.2, 300 sec: 44764.4). Total num frames: 139362304. Throughput: 0: 11400.6. Samples: 34920960. Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:24:00,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:24:01,730][1653645] Updated weights for policy 0, policy_version 68092 (0.0011) [2024-06-15 12:24:03,210][1653645] Updated weights for policy 0, policy_version 68154 (0.0017) [2024-06-15 12:24:05,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43690.6, 300 sec: 44542.9). Total num frames: 139591680. Throughput: 0: 11197.0. Samples: 34947072. 
Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:24:05,958][1648982] Avg episode reward: [(0, '37.060')] [2024-06-15 12:24:07,080][1653645] Updated weights for policy 0, policy_version 68208 (0.0019) [2024-06-15 12:24:10,631][1653645] Updated weights for policy 0, policy_version 68226 (0.0021) [2024-06-15 12:24:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 139755520. Throughput: 0: 11059.2. Samples: 35016704. Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:24:10,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:24:12,461][1653645] Updated weights for policy 0, policy_version 68304 (0.0011) [2024-06-15 12:24:13,446][1653645] Updated weights for policy 0, policy_version 68352 (0.0011) [2024-06-15 12:24:15,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 140083200. Throughput: 0: 11036.5. Samples: 35077120. Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:24:15,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:24:18,423][1653645] Updated weights for policy 0, policy_version 68418 (0.0014) [2024-06-15 12:24:19,790][1653645] Updated weights for policy 0, policy_version 68472 (0.0011) [2024-06-15 12:24:20,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43694.3, 300 sec: 44653.3). Total num frames: 140247040. Throughput: 0: 11104.7. Samples: 35109888. Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:24:20,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:24:23,629][1653645] Updated weights for policy 0, policy_version 68516 (0.0013) [2024-06-15 12:24:24,655][1653645] Updated weights for policy 0, policy_version 68560 (0.0011) [2024-06-15 12:24:25,741][1653645] Updated weights for policy 0, policy_version 68608 (0.0012) [2024-06-15 12:24:25,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 140509184. Throughput: 0: 11162.7. 
Samples: 35182080. Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:24:25,958][1648982] Avg episode reward: [(0, '37.000')] [2024-06-15 12:24:30,816][1653645] Updated weights for policy 0, policy_version 68689 (0.0013) [2024-06-15 12:24:30,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 140673024. Throughput: 0: 11093.3. Samples: 35246080. Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:24:30,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:24:34,739][1653645] Updated weights for policy 0, policy_version 68768 (0.0013) [2024-06-15 12:24:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 140902400. Throughput: 0: 10911.3. Samples: 35277824. Policy #0 lag: (min: 31.0, avg: 153.6, max: 287.0) [2024-06-15 12:24:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:24:36,408][1653645] Updated weights for policy 0, policy_version 68816 (0.0012) [2024-06-15 12:24:38,943][1653645] Updated weights for policy 0, policy_version 68869 (0.0013) [2024-06-15 12:24:40,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43694.0, 300 sec: 44653.3). Total num frames: 141164544. Throughput: 0: 10854.5. Samples: 35339264. Policy #0 lag: (min: 40.0, avg: 154.7, max: 296.0) [2024-06-15 12:24:40,958][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 12:24:42,172][1653645] Updated weights for policy 0, policy_version 68930 (0.0014) [2024-06-15 12:24:45,827][1651596] Signal inference workers to stop experience collection... (3600 times) [2024-06-15 12:24:45,885][1653645] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-15 12:24:45,887][1653645] Updated weights for policy 0, policy_version 68996 (0.0014) [2024-06-15 12:24:45,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 141295616. Throughput: 0: 10911.3. Samples: 35411968. 
Policy #0 lag: (min: 40.0, avg: 154.7, max: 296.0) [2024-06-15 12:24:45,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:24:46,029][1651596] Signal inference workers to resume experience collection... (3600 times) [2024-06-15 12:24:46,030][1653645] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-15 12:24:47,046][1653645] Updated weights for policy 0, policy_version 69055 (0.0020) [2024-06-15 12:24:49,308][1653645] Updated weights for policy 0, policy_version 69118 (0.0015) [2024-06-15 12:24:50,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 44764.4). Total num frames: 141557760. Throughput: 0: 11013.7. Samples: 35442688. Policy #0 lag: (min: 40.0, avg: 154.7, max: 296.0) [2024-06-15 12:24:50,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:24:52,430][1653645] Updated weights for policy 0, policy_version 69177 (0.0013) [2024-06-15 12:24:55,080][1653645] Updated weights for policy 0, policy_version 69221 (0.0054) [2024-06-15 12:24:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.8, 300 sec: 44653.3). Total num frames: 141819904. Throughput: 0: 10899.9. Samples: 35507200. Policy #0 lag: (min: 40.0, avg: 154.7, max: 296.0) [2024-06-15 12:24:55,959][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:24:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000069248_141819904.pth... [2024-06-15 12:24:56,033][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000064064_131203072.pth [2024-06-15 12:24:57,937][1653645] Updated weights for policy 0, policy_version 69268 (0.0011) [2024-06-15 12:24:58,983][1653645] Updated weights for policy 0, policy_version 69312 (0.0022) [2024-06-15 12:25:00,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 142049280. Throughput: 0: 11059.2. Samples: 35574784. 
Policy #0 lag: (min: 40.0, avg: 154.7, max: 296.0) [2024-06-15 12:25:00,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:25:01,148][1653645] Updated weights for policy 0, policy_version 69369 (0.0013) [2024-06-15 12:25:03,997][1653645] Updated weights for policy 0, policy_version 69440 (0.0040) [2024-06-15 12:25:05,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 142213120. Throughput: 0: 11047.8. Samples: 35607040. Policy #0 lag: (min: 40.0, avg: 154.7, max: 296.0) [2024-06-15 12:25:05,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:25:07,044][1653645] Updated weights for policy 0, policy_version 69496 (0.0012) [2024-06-15 12:25:09,859][1653645] Updated weights for policy 0, policy_version 69552 (0.0012) [2024-06-15 12:25:10,958][1648982] Fps is (10 sec: 42595.2, 60 sec: 45328.5, 300 sec: 44875.4). Total num frames: 142475264. Throughput: 0: 11036.3. Samples: 35678720. Policy #0 lag: (min: 40.0, avg: 154.7, max: 296.0) [2024-06-15 12:25:10,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:25:13,555][1653645] Updated weights for policy 0, policy_version 69618 (0.0011) [2024-06-15 12:25:15,518][1653645] Updated weights for policy 0, policy_version 69666 (0.0011) [2024-06-15 12:25:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44236.6, 300 sec: 44653.3). Total num frames: 142737408. Throughput: 0: 11013.7. Samples: 35741696. Policy #0 lag: (min: 40.0, avg: 154.7, max: 296.0) [2024-06-15 12:25:15,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:25:17,812][1653645] Updated weights for policy 0, policy_version 69717 (0.0013) [2024-06-15 12:25:20,030][1653645] Updated weights for policy 0, policy_version 69776 (0.0122) [2024-06-15 12:25:20,958][1648982] Fps is (10 sec: 45878.3, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 142934016. Throughput: 0: 11127.5. Samples: 35778560. 
Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:25:20,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:25:23,617][1653645] Updated weights for policy 0, policy_version 69840 (0.0014) [2024-06-15 12:25:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 143130624. Throughput: 0: 11150.2. Samples: 35841024. Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:25:25,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:25:27,713][1653645] Updated weights for policy 0, policy_version 69943 (0.0014) [2024-06-15 12:25:30,240][1653645] Updated weights for policy 0, policy_version 69987 (0.0019) [2024-06-15 12:25:30,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 143392768. Throughput: 0: 11138.8. Samples: 35913216. Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:25:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:25:31,454][1653645] Updated weights for policy 0, policy_version 70032 (0.0014) [2024-06-15 12:25:31,939][1651596] Signal inference workers to stop experience collection... (3650 times) [2024-06-15 12:25:31,972][1653645] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-15 12:25:32,184][1651596] Signal inference workers to resume experience collection... (3650 times) [2024-06-15 12:25:32,185][1653645] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-15 12:25:35,976][1648982] Fps is (10 sec: 42522.6, 60 sec: 44223.6, 300 sec: 44539.6). Total num frames: 143556608. Throughput: 0: 11088.9. Samples: 35941888. 
Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:25:35,976][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:25:36,232][1653645] Updated weights for policy 0, policy_version 70112 (0.0015) [2024-06-15 12:25:39,027][1653645] Updated weights for policy 0, policy_version 70192 (0.0012) [2024-06-15 12:25:40,958][1648982] Fps is (10 sec: 42596.8, 60 sec: 44236.5, 300 sec: 44542.3). Total num frames: 143818752. Throughput: 0: 11309.5. Samples: 36016128. Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:25:40,959][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 12:25:41,110][1653645] Updated weights for policy 0, policy_version 70227 (0.0013) [2024-06-15 12:25:42,743][1653645] Updated weights for policy 0, policy_version 70276 (0.0013) [2024-06-15 12:25:44,067][1653645] Updated weights for policy 0, policy_version 70335 (0.0022) [2024-06-15 12:25:45,958][1648982] Fps is (10 sec: 49240.7, 60 sec: 45875.2, 300 sec: 44653.4). Total num frames: 144048128. Throughput: 0: 11229.9. Samples: 36080128. Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:25:45,958][1648982] Avg episode reward: [(0, '36.850')] [2024-06-15 12:25:48,189][1653645] Updated weights for policy 0, policy_version 70392 (0.0013) [2024-06-15 12:25:50,728][1653645] Updated weights for policy 0, policy_version 70448 (0.0087) [2024-06-15 12:25:50,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 45328.9, 300 sec: 44320.1). Total num frames: 144277504. Throughput: 0: 11332.2. Samples: 36116992. Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:25:50,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:25:52,571][1653645] Updated weights for policy 0, policy_version 70496 (0.0013) [2024-06-15 12:25:54,352][1653645] Updated weights for policy 0, policy_version 70546 (0.0013) [2024-06-15 12:25:55,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 144572416. Throughput: 0: 11309.7. 
Samples: 36187648. Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:25:55,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:25:58,687][1653645] Updated weights for policy 0, policy_version 70613 (0.0013) [2024-06-15 12:26:00,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.7, 300 sec: 44542.2). Total num frames: 144703488. Throughput: 0: 11434.7. Samples: 36256256. Policy #0 lag: (min: 15.0, avg: 134.9, max: 271.0) [2024-06-15 12:26:00,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:26:01,379][1653645] Updated weights for policy 0, policy_version 70657 (0.0012) [2024-06-15 12:26:02,606][1653645] Updated weights for policy 0, policy_version 70714 (0.0011) [2024-06-15 12:26:04,671][1653645] Updated weights for policy 0, policy_version 70776 (0.0012) [2024-06-15 12:26:05,711][1653645] Updated weights for policy 0, policy_version 70816 (0.0019) [2024-06-15 12:26:05,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 145031168. Throughput: 0: 11468.8. Samples: 36294656. Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:05,959][1648982] Avg episode reward: [(0, '36.700')] [2024-06-15 12:26:09,761][1653645] Updated weights for policy 0, policy_version 70880 (0.0015) [2024-06-15 12:26:10,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.7, 300 sec: 44875.7). Total num frames: 145227776. Throughput: 0: 11525.7. Samples: 36359680. Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:10,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:26:13,278][1653645] Updated weights for policy 0, policy_version 70944 (0.0037) [2024-06-15 12:26:15,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 145391616. Throughput: 0: 11457.4. Samples: 36428800. 
Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:26:16,263][1653645] Updated weights for policy 0, policy_version 71010 (0.0022) [2024-06-15 12:26:17,322][1653645] Updated weights for policy 0, policy_version 71056 (0.0145) [2024-06-15 12:26:17,936][1651596] Signal inference workers to stop experience collection... (3700 times) [2024-06-15 12:26:18,006][1653645] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-15 12:26:18,221][1651596] Signal inference workers to resume experience collection... (3700 times) [2024-06-15 12:26:18,222][1653645] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-15 12:26:20,897][1653645] Updated weights for policy 0, policy_version 71106 (0.0011) [2024-06-15 12:26:20,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 44783.0, 300 sec: 44656.5). Total num frames: 145620992. Throughput: 0: 11439.3. Samples: 36456448. Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:20,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 12:26:22,186][1653645] Updated weights for policy 0, policy_version 71168 (0.0013) [2024-06-15 12:26:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.9, 300 sec: 44099.0). Total num frames: 145784832. Throughput: 0: 11264.1. Samples: 36523008. Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:25,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:26:27,474][1653645] Updated weights for policy 0, policy_version 71233 (0.0013) [2024-06-15 12:26:29,091][1653645] Updated weights for policy 0, policy_version 71294 (0.0016) [2024-06-15 12:26:30,794][1653645] Updated weights for policy 0, policy_version 71355 (0.0014) [2024-06-15 12:26:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 146145280. Throughput: 0: 11195.7. Samples: 36583936. 
Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:30,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:26:34,232][1653645] Updated weights for policy 0, policy_version 71408 (0.0015) [2024-06-15 12:26:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45342.6, 300 sec: 44542.3). Total num frames: 146276352. Throughput: 0: 11173.0. Samples: 36619776. Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:26:37,716][1653645] Updated weights for policy 0, policy_version 71446 (0.0013) [2024-06-15 12:26:38,610][1653645] Updated weights for policy 0, policy_version 71486 (0.0012) [2024-06-15 12:26:40,365][1653645] Updated weights for policy 0, policy_version 71549 (0.0017) [2024-06-15 12:26:40,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 45329.1, 300 sec: 44431.1). Total num frames: 146538496. Throughput: 0: 11229.8. Samples: 36692992. Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:40,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:26:42,164][1653645] Updated weights for policy 0, policy_version 71601 (0.0014) [2024-06-15 12:26:44,719][1653645] Updated weights for policy 0, policy_version 71650 (0.0015) [2024-06-15 12:26:45,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 146800640. Throughput: 0: 11047.9. Samples: 36753408. Policy #0 lag: (min: 15.0, avg: 131.7, max: 271.0) [2024-06-15 12:26:45,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:26:49,536][1653645] Updated weights for policy 0, policy_version 71698 (0.0016) [2024-06-15 12:26:50,772][1653645] Updated weights for policy 0, policy_version 71744 (0.0012) [2024-06-15 12:26:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 146931712. Throughput: 0: 11116.1. Samples: 36794880. 
Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:26:50,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:26:52,710][1653645] Updated weights for policy 0, policy_version 71807 (0.0013) [2024-06-15 12:26:54,328][1653645] Updated weights for policy 0, policy_version 71863 (0.0012) [2024-06-15 12:26:55,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 43690.4, 300 sec: 44653.3). Total num frames: 147193856. Throughput: 0: 10911.2. Samples: 36850688. Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:26:55,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:26:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000071872_147193856.pth... [2024-06-15 12:26:56,120][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000066672_136544256.pth [2024-06-15 12:26:57,170][1653645] Updated weights for policy 0, policy_version 71930 (0.0013) [2024-06-15 12:27:00,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 147357696. Throughput: 0: 11116.1. Samples: 36929024. Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:27:00,960][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:27:01,416][1653645] Updated weights for policy 0, policy_version 71973 (0.0012) [2024-06-15 12:27:03,849][1653645] Updated weights for policy 0, policy_version 72019 (0.0013) [2024-06-15 12:27:05,855][1653645] Updated weights for policy 0, policy_version 72099 (0.0011) [2024-06-15 12:27:05,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 147652608. Throughput: 0: 11218.5. Samples: 36961280. Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:27:05,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:27:07,216][1651596] Signal inference workers to stop experience collection... 
(3750 times) [2024-06-15 12:27:07,257][1653645] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-15 12:27:07,451][1651596] Signal inference workers to resume experience collection... (3750 times) [2024-06-15 12:27:07,453][1653645] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-15 12:27:07,688][1653645] Updated weights for policy 0, policy_version 72145 (0.0011) [2024-06-15 12:27:10,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 147849216. Throughput: 0: 11013.7. Samples: 37018624. Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:27:10,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:27:13,148][1653645] Updated weights for policy 0, policy_version 72208 (0.0013) [2024-06-15 12:27:14,307][1653645] Updated weights for policy 0, policy_version 72251 (0.0012) [2024-06-15 12:27:15,958][1648982] Fps is (10 sec: 32766.9, 60 sec: 43144.2, 300 sec: 43986.9). Total num frames: 147980288. Throughput: 0: 11263.9. Samples: 37090816. Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:27:15,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:27:16,417][1653645] Updated weights for policy 0, policy_version 72290 (0.0015) [2024-06-15 12:27:17,411][1653645] Updated weights for policy 0, policy_version 72338 (0.0037) [2024-06-15 12:27:18,737][1653645] Updated weights for policy 0, policy_version 72388 (0.0013) [2024-06-15 12:27:19,919][1653645] Updated weights for policy 0, policy_version 72435 (0.0014) [2024-06-15 12:27:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 148373504. Throughput: 0: 11161.6. Samples: 37122048. 
Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:27:20,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 12:27:25,180][1653645] Updated weights for policy 0, policy_version 72480 (0.0072) [2024-06-15 12:27:25,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 148471808. Throughput: 0: 11218.5. Samples: 37197824. Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:27:25,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:27:27,998][1653645] Updated weights for policy 0, policy_version 72536 (0.0012) [2024-06-15 12:27:29,786][1653645] Updated weights for policy 0, policy_version 72624 (0.0099) [2024-06-15 12:27:30,811][1653645] Updated weights for policy 0, policy_version 72660 (0.0011) [2024-06-15 12:27:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 148799488. Throughput: 0: 11138.8. Samples: 37254656. Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:27:30,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:27:35,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 148897792. Throughput: 0: 10968.2. Samples: 37288448. Policy #0 lag: (min: 15.0, avg: 98.9, max: 271.0) [2024-06-15 12:27:35,960][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:27:37,050][1653645] Updated weights for policy 0, policy_version 72737 (0.0013) [2024-06-15 12:27:39,428][1653645] Updated weights for policy 0, policy_version 72774 (0.0055) [2024-06-15 12:27:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.8, 300 sec: 44431.3). Total num frames: 149159936. Throughput: 0: 11355.1. Samples: 37361664. 
Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:27:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:27:41,317][1653645] Updated weights for policy 0, policy_version 72864 (0.0014) [2024-06-15 12:27:42,716][1653645] Updated weights for policy 0, policy_version 72915 (0.0011) [2024-06-15 12:27:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 149422080. Throughput: 0: 10899.9. Samples: 37419520. Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:27:45,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:27:48,944][1653645] Updated weights for policy 0, policy_version 72992 (0.0058) [2024-06-15 12:27:50,962][1648982] Fps is (10 sec: 39304.0, 60 sec: 43687.5, 300 sec: 44319.5). Total num frames: 149553152. Throughput: 0: 11024.0. Samples: 37457408. Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:27:50,963][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:27:51,467][1653645] Updated weights for policy 0, policy_version 73034 (0.0013) [2024-06-15 12:27:52,694][1651596] Signal inference workers to stop experience collection... (3800 times) [2024-06-15 12:27:52,728][1653645] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-15 12:27:52,979][1651596] Signal inference workers to resume experience collection... (3800 times) [2024-06-15 12:27:52,980][1653645] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-15 12:27:54,044][1653645] Updated weights for policy 0, policy_version 73145 (0.0014) [2024-06-15 12:27:55,958][1648982] Fps is (10 sec: 49151.0, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 149913600. Throughput: 0: 11025.0. Samples: 37514752. 
Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:27:55,959][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:28:00,913][1653645] Updated weights for policy 0, policy_version 73220 (0.0014) [2024-06-15 12:28:00,958][1648982] Fps is (10 sec: 39339.3, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 149946368. Throughput: 0: 10979.6. Samples: 37584896. Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:28:00,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:28:02,814][1653645] Updated weights for policy 0, policy_version 73281 (0.0015) [2024-06-15 12:28:04,160][1653645] Updated weights for policy 0, policy_version 73338 (0.0014) [2024-06-15 12:28:05,822][1653645] Updated weights for policy 0, policy_version 73408 (0.0016) [2024-06-15 12:28:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 150339584. Throughput: 0: 11059.1. Samples: 37619712. Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:28:05,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:28:07,558][1653645] Updated weights for policy 0, policy_version 73463 (0.0014) [2024-06-15 12:28:10,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 150470656. Throughput: 0: 10877.2. Samples: 37687296. Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:28:10,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:28:12,666][1653645] Updated weights for policy 0, policy_version 73488 (0.0012) [2024-06-15 12:28:13,665][1653645] Updated weights for policy 0, policy_version 73534 (0.0022) [2024-06-15 12:28:15,706][1653645] Updated weights for policy 0, policy_version 73600 (0.0016) [2024-06-15 12:28:15,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 45875.4, 300 sec: 44431.9). Total num frames: 150732800. Throughput: 0: 11195.7. Samples: 37758464. 
Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:28:15,960][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:28:17,163][1653645] Updated weights for policy 0, policy_version 73657 (0.0051) [2024-06-15 12:28:19,500][1653645] Updated weights for policy 0, policy_version 73726 (0.0012) [2024-06-15 12:28:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 150994944. Throughput: 0: 10990.9. Samples: 37783040. Policy #0 lag: (min: 15.0, avg: 97.8, max: 271.0) [2024-06-15 12:28:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:28:25,408][1653645] Updated weights for policy 0, policy_version 73776 (0.0012) [2024-06-15 12:28:25,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 151126016. Throughput: 0: 10979.5. Samples: 37855744. Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:28:25,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:28:26,346][1653645] Updated weights for policy 0, policy_version 73811 (0.0017) [2024-06-15 12:28:27,833][1653645] Updated weights for policy 0, policy_version 73858 (0.0011) [2024-06-15 12:28:29,439][1653645] Updated weights for policy 0, policy_version 73921 (0.0016) [2024-06-15 12:28:30,632][1653645] Updated weights for policy 0, policy_version 73980 (0.0013) [2024-06-15 12:28:30,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 151519232. Throughput: 0: 11013.7. Samples: 37915136. Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:28:30,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:28:35,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 43987.5). Total num frames: 151519232. Throughput: 0: 11014.8. Samples: 37953024. 
Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:28:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:28:37,434][1653645] Updated weights for policy 0, policy_version 74032 (0.0030) [2024-06-15 12:28:38,483][1651596] Signal inference workers to stop experience collection... (3850 times) [2024-06-15 12:28:38,543][1653645] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-15 12:28:38,714][1651596] Signal inference workers to resume experience collection... (3850 times) [2024-06-15 12:28:38,725][1653645] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-15 12:28:38,883][1653645] Updated weights for policy 0, policy_version 74084 (0.0012) [2024-06-15 12:28:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 151879680. Throughput: 0: 11252.7. Samples: 38021120. Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:28:40,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:28:41,026][1653645] Updated weights for policy 0, policy_version 74161 (0.0021) [2024-06-15 12:28:42,623][1653645] Updated weights for policy 0, policy_version 74233 (0.0015) [2024-06-15 12:28:45,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 152043520. Throughput: 0: 11093.3. Samples: 38084096. Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:28:45,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 12:28:45,960][1651596] Saving new best policy, reward=37.440! [2024-06-15 12:28:49,304][1653645] Updated weights for policy 0, policy_version 74288 (0.0011) [2024-06-15 12:28:50,629][1653645] Updated weights for policy 0, policy_version 74336 (0.0012) [2024-06-15 12:28:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45332.4, 300 sec: 44320.1). Total num frames: 152272896. Throughput: 0: 11161.7. Samples: 38121984. 
Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:28:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:28:52,681][1653645] Updated weights for policy 0, policy_version 74400 (0.0016) [2024-06-15 12:28:53,987][1653645] Updated weights for policy 0, policy_version 74451 (0.0021) [2024-06-15 12:28:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 152567808. Throughput: 0: 10945.4. Samples: 38179840. Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:28:55,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:28:55,981][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000074496_152567808.pth... [2024-06-15 12:28:56,039][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000069248_141819904.pth [2024-06-15 12:28:59,806][1653645] Updated weights for policy 0, policy_version 74498 (0.0012) [2024-06-15 12:29:00,966][1648982] Fps is (10 sec: 39290.9, 60 sec: 45323.2, 300 sec: 44318.9). Total num frames: 152666112. Throughput: 0: 10989.0. Samples: 38253056. Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:29:00,966][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:29:01,024][1653645] Updated weights for policy 0, policy_version 74552 (0.0011) [2024-06-15 12:29:02,742][1653645] Updated weights for policy 0, policy_version 74614 (0.0012) [2024-06-15 12:29:05,073][1653645] Updated weights for policy 0, policy_version 74673 (0.0032) [2024-06-15 12:29:05,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 44237.0, 300 sec: 44875.5). Total num frames: 152993792. Throughput: 0: 11104.7. Samples: 38282752. 
Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:29:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:29:07,101][1653645] Updated weights for policy 0, policy_version 74751 (0.0012) [2024-06-15 12:29:10,959][1648982] Fps is (10 sec: 42629.1, 60 sec: 43690.2, 300 sec: 44097.8). Total num frames: 153092096. Throughput: 0: 10911.2. Samples: 38346752. Policy #0 lag: (min: 10.0, avg: 86.1, max: 266.0) [2024-06-15 12:29:10,960][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:29:13,103][1653645] Updated weights for policy 0, policy_version 74815 (0.0011) [2024-06-15 12:29:14,837][1653645] Updated weights for policy 0, policy_version 74875 (0.0013) [2024-06-15 12:29:15,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 153354240. Throughput: 0: 11059.2. Samples: 38412800. Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:15,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:29:18,540][1653645] Updated weights for policy 0, policy_version 74962 (0.0013) [2024-06-15 12:29:20,959][1648982] Fps is (10 sec: 52432.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 153616384. Throughput: 0: 10717.9. Samples: 38435328. Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:20,960][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:29:25,086][1653645] Updated weights for policy 0, policy_version 75024 (0.0011) [2024-06-15 12:29:25,693][1651596] Signal inference workers to stop experience collection... (3900 times) [2024-06-15 12:29:25,747][1653645] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-15 12:29:25,929][1651596] Signal inference workers to resume experience collection... (3900 times) [2024-06-15 12:29:25,949][1653645] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-15 12:29:25,962][1648982] Fps is (10 sec: 36028.3, 60 sec: 43141.4, 300 sec: 44208.4). 
Total num frames: 153714688. Throughput: 0: 10819.2. Samples: 38508032. Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:25,963][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:29:27,387][1653645] Updated weights for policy 0, policy_version 75106 (0.0156) [2024-06-15 12:29:29,155][1653645] Updated weights for policy 0, policy_version 75140 (0.0011) [2024-06-15 12:29:30,885][1653645] Updated weights for policy 0, policy_version 75216 (0.0100) [2024-06-15 12:29:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 44542.3). Total num frames: 154042368. Throughput: 0: 10797.6. Samples: 38569984. Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:29:31,999][1653645] Updated weights for policy 0, policy_version 75264 (0.0016) [2024-06-15 12:29:35,958][1648982] Fps is (10 sec: 42618.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 154140672. Throughput: 0: 10626.9. Samples: 38600192. Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:35,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:29:38,604][1653645] Updated weights for policy 0, policy_version 75329 (0.0013) [2024-06-15 12:29:39,889][1653645] Updated weights for policy 0, policy_version 75384 (0.0011) [2024-06-15 12:29:40,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42052.3, 300 sec: 44431.2). Total num frames: 154402816. Throughput: 0: 10774.8. Samples: 38664704. Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:40,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:29:42,178][1653645] Updated weights for policy 0, policy_version 75442 (0.0013) [2024-06-15 12:29:43,719][1653645] Updated weights for policy 0, policy_version 75511 (0.0019) [2024-06-15 12:29:45,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 154664960. Throughput: 0: 10651.4. Samples: 38732288. 
Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:29:49,848][1653645] Updated weights for policy 0, policy_version 75568 (0.0014) [2024-06-15 12:29:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 44098.0). Total num frames: 154828800. Throughput: 0: 10865.8. Samples: 38771712. Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:50,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:29:51,740][1653645] Updated weights for policy 0, policy_version 75641 (0.0013) [2024-06-15 12:29:53,534][1653645] Updated weights for policy 0, policy_version 75684 (0.0012) [2024-06-15 12:29:55,531][1653645] Updated weights for policy 0, policy_version 75774 (0.0013) [2024-06-15 12:29:55,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 155189248. Throughput: 0: 10774.9. Samples: 38831616. Policy #0 lag: (min: 15.0, avg: 92.3, max: 271.0) [2024-06-15 12:29:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:30:00,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43150.2, 300 sec: 44209.1). Total num frames: 155254784. Throughput: 0: 10899.9. Samples: 38903296. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:00,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 12:30:01,435][1653645] Updated weights for policy 0, policy_version 75828 (0.0011) [2024-06-15 12:30:03,301][1653645] Updated weights for policy 0, policy_version 75904 (0.0013) [2024-06-15 12:30:05,567][1653645] Updated weights for policy 0, policy_version 75967 (0.0013) [2024-06-15 12:30:05,957][1648982] Fps is (10 sec: 39322.1, 60 sec: 43144.7, 300 sec: 44431.3). Total num frames: 155582464. Throughput: 0: 10968.2. Samples: 38928896. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:30:06,069][1651596] Signal inference workers to stop experience collection... (3950 times) [2024-06-15 12:30:06,117][1653645] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-15 12:30:06,431][1651596] Signal inference workers to resume experience collection... (3950 times) [2024-06-15 12:30:06,432][1653645] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-15 12:30:07,274][1653645] Updated weights for policy 0, policy_version 76019 (0.0014) [2024-06-15 12:30:10,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 43691.0, 300 sec: 43986.9). Total num frames: 155713536. Throughput: 0: 10980.6. Samples: 39002112. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:10,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:30:13,209][1653645] Updated weights for policy 0, policy_version 76068 (0.0011) [2024-06-15 12:30:15,108][1653645] Updated weights for policy 0, policy_version 76159 (0.0104) [2024-06-15 12:30:15,970][1648982] Fps is (10 sec: 39272.2, 60 sec: 43681.6, 300 sec: 44207.2). Total num frames: 155975680. Throughput: 0: 10953.8. Samples: 39063040. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:15,971][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:30:17,880][1653645] Updated weights for policy 0, policy_version 76215 (0.0076) [2024-06-15 12:30:19,169][1653645] Updated weights for policy 0, policy_version 76257 (0.0018) [2024-06-15 12:30:20,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 156237824. Throughput: 0: 11025.0. Samples: 39096320. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:20,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:30:25,325][1653645] Updated weights for policy 0, policy_version 76352 (0.0099) [2024-06-15 12:30:25,958][1648982] Fps is (10 sec: 42651.6, 60 sec: 44786.4, 300 sec: 44098.0). Total num frames: 156401664. Throughput: 0: 11150.2. Samples: 39166464. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:25,960][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:30:26,825][1653645] Updated weights for policy 0, policy_version 76411 (0.0012) [2024-06-15 12:30:29,750][1653645] Updated weights for policy 0, policy_version 76473 (0.0010) [2024-06-15 12:30:30,958][1648982] Fps is (10 sec: 39319.6, 60 sec: 43144.2, 300 sec: 44322.7). Total num frames: 156631040. Throughput: 0: 10979.5. Samples: 39226368. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:30,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:30:32,043][1653645] Updated weights for policy 0, policy_version 76543 (0.0014) [2024-06-15 12:30:35,990][1648982] Fps is (10 sec: 42460.4, 60 sec: 44758.7, 300 sec: 44093.2). Total num frames: 156827648. Throughput: 0: 10801.1. Samples: 39258112. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:35,991][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:30:36,668][1653645] Updated weights for policy 0, policy_version 76608 (0.0012) [2024-06-15 12:30:40,958][1648982] Fps is (10 sec: 42600.5, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 157057024. Throughput: 0: 10922.7. Samples: 39323136. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:40,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:30:41,305][1653645] Updated weights for policy 0, policy_version 76704 (0.0102) [2024-06-15 12:30:43,238][1653645] Updated weights for policy 0, policy_version 76740 (0.0018) [2024-06-15 12:30:44,173][1653645] Updated weights for policy 0, policy_version 76784 (0.0013) [2024-06-15 12:30:45,958][1648982] Fps is (10 sec: 46024.4, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 157286400. Throughput: 0: 10854.4. Samples: 39391744. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 12:30:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:30:48,129][1653645] Updated weights for policy 0, policy_version 76851 (0.0013) [2024-06-15 12:30:50,685][1653645] Updated weights for policy 0, policy_version 76898 (0.0012) [2024-06-15 12:30:50,958][1648982] Fps is (10 sec: 45873.9, 60 sec: 44782.7, 300 sec: 43875.7). Total num frames: 157515776. Throughput: 0: 11013.6. Samples: 39424512. Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:30:50,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:30:51,853][1653645] Updated weights for policy 0, policy_version 76944 (0.0013) [2024-06-15 12:30:54,493][1651596] Signal inference workers to stop experience collection... (4000 times) [2024-06-15 12:30:54,539][1653645] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-15 12:30:54,746][1651596] Signal inference workers to resume experience collection... (4000 times) [2024-06-15 12:30:54,747][1653645] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-15 12:30:54,749][1653645] Updated weights for policy 0, policy_version 77008 (0.0013) [2024-06-15 12:30:55,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 157777920. Throughput: 0: 10991.0. Samples: 39496704. 
Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:30:55,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:30:56,153][1653645] Updated weights for policy 0, policy_version 77054 (0.0014) [2024-06-15 12:30:56,177][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000077056_157810688.pth... [2024-06-15 12:30:56,237][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000071872_147193856.pth [2024-06-15 12:30:56,242][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000077056_157810688.pth [2024-06-15 12:30:59,960][1653645] Updated weights for policy 0, policy_version 77113 (0.0014) [2024-06-15 12:31:00,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 44782.8, 300 sec: 43764.7). Total num frames: 157941760. Throughput: 0: 11130.5. Samples: 39563776. Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:31:00,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:31:02,025][1653645] Updated weights for policy 0, policy_version 77152 (0.0011) [2024-06-15 12:31:03,427][1653645] Updated weights for policy 0, policy_version 77216 (0.0072) [2024-06-15 12:31:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 158203904. Throughput: 0: 11127.5. Samples: 39597056. Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:31:05,958][1648982] Avg episode reward: [(0, '37.010')] [2024-06-15 12:31:06,219][1653645] Updated weights for policy 0, policy_version 77264 (0.0014) [2024-06-15 12:31:10,527][1653645] Updated weights for policy 0, policy_version 77328 (0.0019) [2024-06-15 12:31:10,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 158400512. Throughput: 0: 11081.9. Samples: 39665152. 
Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:31:10,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:31:11,546][1653645] Updated weights for policy 0, policy_version 77375 (0.0012) [2024-06-15 12:31:14,481][1653645] Updated weights for policy 0, policy_version 77443 (0.0046) [2024-06-15 12:31:15,595][1653645] Updated weights for policy 0, policy_version 77496 (0.0018) [2024-06-15 12:31:15,959][1648982] Fps is (10 sec: 52420.9, 60 sec: 45883.6, 300 sec: 44430.9). Total num frames: 158728192. Throughput: 0: 11297.9. Samples: 39734784. Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:31:15,960][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:31:18,628][1653645] Updated weights for policy 0, policy_version 77561 (0.0014) [2024-06-15 12:31:20,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 158859264. Throughput: 0: 11454.3. Samples: 39773184. Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:31:20,963][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:31:22,131][1653645] Updated weights for policy 0, policy_version 77601 (0.0012) [2024-06-15 12:31:25,512][1653645] Updated weights for policy 0, policy_version 77668 (0.0081) [2024-06-15 12:31:25,958][1648982] Fps is (10 sec: 36050.2, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 159088640. Throughput: 0: 11468.8. Samples: 39839232. Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:31:25,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:31:27,422][1653645] Updated weights for policy 0, policy_version 77755 (0.0033) [2024-06-15 12:31:30,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45329.5, 300 sec: 44320.1). Total num frames: 159350784. Throughput: 0: 11355.0. Samples: 39902720. 
Policy #0 lag: (min: 15.0, avg: 111.1, max: 271.0) [2024-06-15 12:31:30,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:31:31,082][1653645] Updated weights for policy 0, policy_version 77819 (0.0014) [2024-06-15 12:31:34,882][1653645] Updated weights for policy 0, policy_version 77872 (0.0015) [2024-06-15 12:31:35,974][1648982] Fps is (10 sec: 42598.6, 60 sec: 44807.2, 300 sec: 43986.9). Total num frames: 159514624. Throughput: 0: 11366.5. Samples: 39936000. Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:31:35,975][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:31:36,756][1653645] Updated weights for policy 0, policy_version 77906 (0.0015) [2024-06-15 12:31:38,314][1653645] Updated weights for policy 0, policy_version 77969 (0.0013) [2024-06-15 12:31:38,750][1651596] Signal inference workers to stop experience collection... (4050 times) [2024-06-15 12:31:38,795][1653645] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-15 12:31:39,093][1651596] Signal inference workers to resume experience collection... (4050 times) [2024-06-15 12:31:39,118][1653645] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-15 12:31:40,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 159776768. Throughput: 0: 11127.5. Samples: 39997440. Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:31:40,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:31:41,933][1653645] Updated weights for policy 0, policy_version 78032 (0.0013) [2024-06-15 12:31:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 159907840. Throughput: 0: 11138.8. Samples: 40065024. 
Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:31:45,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:31:46,136][1653645] Updated weights for policy 0, policy_version 78082 (0.0013) [2024-06-15 12:31:47,435][1653645] Updated weights for policy 0, policy_version 78144 (0.0042) [2024-06-15 12:31:50,007][1653645] Updated weights for policy 0, policy_version 78208 (0.0016) [2024-06-15 12:31:50,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45329.3, 300 sec: 44209.1). Total num frames: 160235520. Throughput: 0: 11150.2. Samples: 40098816. Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:31:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:31:53,748][1653645] Updated weights for policy 0, policy_version 78275 (0.0012) [2024-06-15 12:31:55,210][1653645] Updated weights for policy 0, policy_version 78336 (0.0013) [2024-06-15 12:31:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 160432128. Throughput: 0: 10968.2. Samples: 40158720. Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:31:55,963][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:32:00,039][1653645] Updated weights for policy 0, policy_version 78400 (0.0041) [2024-06-15 12:32:00,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 160595968. Throughput: 0: 10877.5. Samples: 40224256. Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:32:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:32:02,123][1653645] Updated weights for policy 0, policy_version 78454 (0.0014) [2024-06-15 12:32:03,616][1653645] Updated weights for policy 0, policy_version 78519 (0.0011) [2024-06-15 12:32:05,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 160858112. Throughput: 0: 10717.9. Samples: 40255488. 
Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:32:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:32:06,578][1653645] Updated weights for policy 0, policy_version 78564 (0.0065) [2024-06-15 12:32:10,571][1653645] Updated weights for policy 0, policy_version 78610 (0.0014) [2024-06-15 12:32:10,986][1648982] Fps is (10 sec: 42477.2, 60 sec: 43670.1, 300 sec: 44204.8). Total num frames: 161021952. Throughput: 0: 10836.2. Samples: 40327168. Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:32:10,987][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:32:11,463][1653645] Updated weights for policy 0, policy_version 78649 (0.0012) [2024-06-15 12:32:13,136][1653645] Updated weights for policy 0, policy_version 78714 (0.0011) [2024-06-15 12:32:14,941][1653645] Updated weights for policy 0, policy_version 78773 (0.0012) [2024-06-15 12:32:16,014][1648982] Fps is (10 sec: 48874.7, 60 sec: 43650.5, 300 sec: 43978.4). Total num frames: 161349632. Throughput: 0: 10920.3. Samples: 40394752. Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:32:16,015][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:32:18,065][1653645] Updated weights for policy 0, policy_version 78835 (0.0016) [2024-06-15 12:32:20,958][1648982] Fps is (10 sec: 46005.6, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 161480704. Throughput: 0: 10899.9. Samples: 40426496. Policy #0 lag: (min: 36.0, avg: 136.3, max: 292.0) [2024-06-15 12:32:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:32:22,879][1653645] Updated weights for policy 0, policy_version 78883 (0.0012) [2024-06-15 12:32:25,278][1653645] Updated weights for policy 0, policy_version 78974 (0.0012) [2024-06-15 12:32:25,958][1648982] Fps is (10 sec: 39545.8, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 161742848. Throughput: 0: 11093.3. Samples: 40496640. 
Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:32:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:32:26,130][1651596] Signal inference workers to stop experience collection... (4100 times) [2024-06-15 12:32:26,220][1653645] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-15 12:32:26,399][1651596] Signal inference workers to resume experience collection... (4100 times) [2024-06-15 12:32:26,400][1653645] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-15 12:32:26,616][1653645] Updated weights for policy 0, policy_version 79010 (0.0014) [2024-06-15 12:32:30,641][1653645] Updated weights for policy 0, policy_version 79088 (0.0019) [2024-06-15 12:32:30,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 161972224. Throughput: 0: 10956.8. Samples: 40558080. Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:32:30,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 12:32:34,419][1653645] Updated weights for policy 0, policy_version 79136 (0.0058) [2024-06-15 12:32:35,610][1653645] Updated weights for policy 0, policy_version 79184 (0.0013) [2024-06-15 12:32:35,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 162168832. Throughput: 0: 11036.4. Samples: 40595456. Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:32:35,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:32:36,897][1653645] Updated weights for policy 0, policy_version 79230 (0.0012) [2024-06-15 12:32:40,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 162398208. Throughput: 0: 11093.4. Samples: 40657920. 
Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:32:40,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:32:41,834][1653645] Updated weights for policy 0, policy_version 79298 (0.0023) [2024-06-15 12:32:43,162][1653645] Updated weights for policy 0, policy_version 79351 (0.0013) [2024-06-15 12:32:45,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.7, 300 sec: 43987.5). Total num frames: 162529280. Throughput: 0: 11173.0. Samples: 40727040. Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:32:45,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:32:47,083][1653645] Updated weights for policy 0, policy_version 79416 (0.0013) [2024-06-15 12:32:48,145][1653645] Updated weights for policy 0, policy_version 79456 (0.0012) [2024-06-15 12:32:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 162889728. Throughput: 0: 11104.7. Samples: 40755200. Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:32:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 12:32:50,975][1653645] Updated weights for policy 0, policy_version 79544 (0.0014) [2024-06-15 12:32:51,141][1651596] Saving new best policy, reward=37.450! [2024-06-15 12:32:54,969][1653645] Updated weights for policy 0, policy_version 79587 (0.0014) [2024-06-15 12:32:55,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 163053568. Throughput: 0: 10986.5. Samples: 40821248. Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:32:55,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:32:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000079616_163053568.pth... 
[2024-06-15 12:32:56,020][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000074496_152567808.pth [2024-06-15 12:32:58,906][1653645] Updated weights for policy 0, policy_version 79636 (0.0011) [2024-06-15 12:33:00,961][1648982] Fps is (10 sec: 36031.4, 60 sec: 44234.0, 300 sec: 43764.2). Total num frames: 163250176. Throughput: 0: 10753.3. Samples: 40878080. Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:33:00,962][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 12:33:00,974][1653645] Updated weights for policy 0, policy_version 79728 (0.0013) [2024-06-15 12:33:01,218][1651596] Saving new best policy, reward=37.510! [2024-06-15 12:33:03,226][1653645] Updated weights for policy 0, policy_version 79803 (0.0018) [2024-06-15 12:33:05,966][1648982] Fps is (10 sec: 39291.6, 60 sec: 43138.9, 300 sec: 43985.7). Total num frames: 163446784. Throughput: 0: 10750.2. Samples: 40910336. Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:33:05,966][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:33:10,863][1653645] Updated weights for policy 0, policy_version 79904 (0.0012) [2024-06-15 12:33:10,958][1648982] Fps is (10 sec: 39335.9, 60 sec: 43711.4, 300 sec: 43764.7). Total num frames: 163643392. Throughput: 0: 10729.2. Samples: 40979456. Policy #0 lag: (min: 46.0, avg: 126.4, max: 302.0) [2024-06-15 12:33:10,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:33:12,280][1653645] Updated weights for policy 0, policy_version 79952 (0.0013) [2024-06-15 12:33:13,356][1653645] Updated weights for policy 0, policy_version 79997 (0.0028) [2024-06-15 12:33:13,768][1651596] Signal inference workers to stop experience collection... (4150 times) [2024-06-15 12:33:13,811][1653645] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-15 12:33:13,979][1651596] Signal inference workers to resume experience collection... 
(4150 times) [2024-06-15 12:33:13,980][1653645] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-15 12:33:14,863][1653645] Updated weights for policy 0, policy_version 80058 (0.0019) [2024-06-15 12:33:15,958][1648982] Fps is (10 sec: 52469.7, 60 sec: 43732.0, 300 sec: 43986.9). Total num frames: 163971072. Throughput: 0: 10843.0. Samples: 41046016. Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:15,958][1648982] Avg episode reward: [(0, '37.060')] [2024-06-15 12:33:19,646][1653645] Updated weights for policy 0, policy_version 80112 (0.0151) [2024-06-15 12:33:20,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 164102144. Throughput: 0: 10843.0. Samples: 41083392. Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:20,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:33:22,398][1653645] Updated weights for policy 0, policy_version 80165 (0.0012) [2024-06-15 12:33:23,698][1653645] Updated weights for policy 0, policy_version 80208 (0.0012) [2024-06-15 12:33:25,275][1653645] Updated weights for policy 0, policy_version 80266 (0.0011) [2024-06-15 12:33:25,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 164429824. Throughput: 0: 10831.7. Samples: 41145344. Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:33:26,349][1653645] Updated weights for policy 0, policy_version 80316 (0.0012) [2024-06-15 12:33:31,010][1648982] Fps is (10 sec: 42375.8, 60 sec: 42561.1, 300 sec: 44090.1). Total num frames: 164528128. Throughput: 0: 10864.5. Samples: 41216512. 
Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:31,011][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 12:33:31,594][1653645] Updated weights for policy 0, policy_version 80368 (0.0016) [2024-06-15 12:33:34,752][1653645] Updated weights for policy 0, policy_version 80416 (0.0012) [2024-06-15 12:33:35,966][1648982] Fps is (10 sec: 36015.6, 60 sec: 43684.9, 300 sec: 43763.5). Total num frames: 164790272. Throughput: 0: 10977.6. Samples: 41249280. Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:35,966][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:33:36,789][1653645] Updated weights for policy 0, policy_version 80500 (0.0012) [2024-06-15 12:33:37,981][1653645] Updated weights for policy 0, policy_version 80544 (0.0012) [2024-06-15 12:33:40,958][1648982] Fps is (10 sec: 49410.5, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 165019648. Throughput: 0: 10831.6. Samples: 41308672. Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:40,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:33:43,124][1653645] Updated weights for policy 0, policy_version 80608 (0.0013) [2024-06-15 12:33:45,958][1648982] Fps is (10 sec: 36073.4, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 165150720. Throughput: 0: 11071.5. Samples: 41376256. Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 12:33:46,376][1653645] Updated weights for policy 0, policy_version 80641 (0.0020) [2024-06-15 12:33:47,616][1653645] Updated weights for policy 0, policy_version 80691 (0.0017) [2024-06-15 12:33:49,358][1653645] Updated weights for policy 0, policy_version 80768 (0.0018) [2024-06-15 12:33:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 165543936. Throughput: 0: 11038.3. Samples: 41406976. 
Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:33:54,707][1653645] Updated weights for policy 0, policy_version 80835 (0.0050) [2024-06-15 12:33:55,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43144.6, 300 sec: 43988.0). Total num frames: 165642240. Throughput: 0: 11059.2. Samples: 41477120. Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:33:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:33:58,810][1653645] Updated weights for policy 0, policy_version 80928 (0.0013) [2024-06-15 12:34:00,142][1653645] Updated weights for policy 0, policy_version 80976 (0.0013) [2024-06-15 12:34:00,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 44239.5, 300 sec: 43764.7). Total num frames: 165904384. Throughput: 0: 10877.2. Samples: 41535488. Policy #0 lag: (min: 8.0, avg: 132.2, max: 296.0) [2024-06-15 12:34:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:34:01,200][1653645] Updated weights for policy 0, policy_version 81023 (0.0044) [2024-06-15 12:34:01,495][1651596] Signal inference workers to stop experience collection... (4200 times) [2024-06-15 12:34:01,539][1653645] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-06-15 12:34:01,814][1651596] Signal inference workers to resume experience collection... (4200 times) [2024-06-15 12:34:01,822][1653645] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-06-15 12:34:02,711][1653645] Updated weights for policy 0, policy_version 81085 (0.0013) [2024-06-15 12:34:05,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 43696.4, 300 sec: 43987.0). Total num frames: 166068224. Throughput: 0: 10740.6. Samples: 41566720. 
Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:05,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:34:08,068][1653645] Updated weights for policy 0, policy_version 81136 (0.0012) [2024-06-15 12:34:10,863][1653645] Updated weights for policy 0, policy_version 81184 (0.0119) [2024-06-15 12:34:10,966][1648982] Fps is (10 sec: 36013.7, 60 sec: 43684.4, 300 sec: 43763.4). Total num frames: 166264832. Throughput: 0: 10966.0. Samples: 41638912. Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:10,967][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:34:12,012][1653645] Updated weights for policy 0, policy_version 81232 (0.0013) [2024-06-15 12:34:13,741][1653645] Updated weights for policy 0, policy_version 81282 (0.0014) [2024-06-15 12:34:15,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 166592512. Throughput: 0: 10639.3. Samples: 41694720. Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:15,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:34:18,547][1653645] Updated weights for policy 0, policy_version 81348 (0.0018) [2024-06-15 12:34:19,744][1653645] Updated weights for policy 0, policy_version 81408 (0.0088) [2024-06-15 12:34:20,958][1648982] Fps is (10 sec: 45914.7, 60 sec: 43690.6, 300 sec: 44098.6). Total num frames: 166723584. Throughput: 0: 10947.4. Samples: 41741824. Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:34:24,367][1653645] Updated weights for policy 0, policy_version 81473 (0.0047) [2024-06-15 12:34:25,929][1653645] Updated weights for policy 0, policy_version 81536 (0.0020) [2024-06-15 12:34:25,958][1648982] Fps is (10 sec: 39319.9, 60 sec: 42598.0, 300 sec: 43875.7). Total num frames: 166985728. Throughput: 0: 10888.5. Samples: 41798656. 
Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:25,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:34:27,153][1653645] Updated weights for policy 0, policy_version 81588 (0.0015) [2024-06-15 12:34:30,960][1648982] Fps is (10 sec: 39311.0, 60 sec: 43180.4, 300 sec: 43986.5). Total num frames: 167116800. Throughput: 0: 11001.7. Samples: 41871360. Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:30,961][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:34:31,456][1653645] Updated weights for policy 0, policy_version 81632 (0.0013) [2024-06-15 12:34:34,749][1653645] Updated weights for policy 0, policy_version 81684 (0.0011) [2024-06-15 12:34:35,958][1648982] Fps is (10 sec: 36046.4, 60 sec: 42604.1, 300 sec: 43875.8). Total num frames: 167346176. Throughput: 0: 11059.2. Samples: 41904640. Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:34:36,551][1653645] Updated weights for policy 0, policy_version 81746 (0.0013) [2024-06-15 12:34:38,967][1653645] Updated weights for policy 0, policy_version 81852 (0.0014) [2024-06-15 12:34:40,959][1648982] Fps is (10 sec: 52438.6, 60 sec: 43690.2, 300 sec: 43986.8). Total num frames: 167641088. Throughput: 0: 10592.5. Samples: 41953792. Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:40,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:34:44,281][1653645] Updated weights for policy 0, policy_version 81920 (0.0012) [2024-06-15 12:34:45,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 167772160. Throughput: 0: 10945.4. Samples: 42028032. Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:34:48,855][1651596] Signal inference workers to stop experience collection... 
(4250 times) [2024-06-15 12:34:48,892][1653645] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-06-15 12:34:49,136][1651596] Signal inference workers to resume experience collection... (4250 times) [2024-06-15 12:34:49,137][1653645] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-06-15 12:34:49,139][1653645] Updated weights for policy 0, policy_version 81984 (0.0013) [2024-06-15 12:34:50,679][1653645] Updated weights for policy 0, policy_version 82051 (0.0012) [2024-06-15 12:34:50,958][1648982] Fps is (10 sec: 42601.4, 60 sec: 42052.3, 300 sec: 43653.6). Total num frames: 168067072. Throughput: 0: 10979.5. Samples: 42060800. Policy #0 lag: (min: 9.0, avg: 156.3, max: 263.0) [2024-06-15 12:34:50,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:34:55,323][1653645] Updated weights for policy 0, policy_version 82116 (0.0090) [2024-06-15 12:34:55,958][1648982] Fps is (10 sec: 45873.3, 60 sec: 43144.4, 300 sec: 43986.8). Total num frames: 168230912. Throughput: 0: 10810.9. Samples: 42125312. Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:34:55,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:34:56,230][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000082160_168263680.pth... [2024-06-15 12:34:56,285][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000077056_157810688.pth [2024-06-15 12:34:56,488][1653645] Updated weights for policy 0, policy_version 82169 (0.0013) [2024-06-15 12:35:00,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 42052.3, 300 sec: 43542.6). Total num frames: 168427520. Throughput: 0: 11025.1. Samples: 42190848. 
Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:00,960][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:35:01,066][1653645] Updated weights for policy 0, policy_version 82241 (0.0014) [2024-06-15 12:35:02,859][1653645] Updated weights for policy 0, policy_version 82337 (0.0014) [2024-06-15 12:35:05,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 43690.2, 300 sec: 43986.8). Total num frames: 168689664. Throughput: 0: 10547.1. Samples: 42216448. Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:05,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:35:08,248][1653645] Updated weights for policy 0, policy_version 82402 (0.0012) [2024-06-15 12:35:10,974][1648982] Fps is (10 sec: 39255.5, 60 sec: 42592.6, 300 sec: 43541.9). Total num frames: 168820736. Throughput: 0: 10941.4. Samples: 42291200. Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:10,975][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:35:11,532][1653645] Updated weights for policy 0, policy_version 82448 (0.0012) [2024-06-15 12:35:13,707][1653645] Updated weights for policy 0, policy_version 82545 (0.0118) [2024-06-15 12:35:15,359][1653645] Updated weights for policy 0, policy_version 82624 (0.0014) [2024-06-15 12:35:15,958][1648982] Fps is (10 sec: 52432.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 169213952. Throughput: 0: 10513.7. Samples: 42344448. Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:35:20,763][1653645] Updated weights for policy 0, policy_version 82678 (0.0012) [2024-06-15 12:35:20,958][1648982] Fps is (10 sec: 49234.7, 60 sec: 43144.6, 300 sec: 43764.7). Total num frames: 169312256. Throughput: 0: 10820.3. Samples: 42391552. 
Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:20,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:35:23,916][1653645] Updated weights for policy 0, policy_version 82721 (0.0011) [2024-06-15 12:35:25,905][1653645] Updated weights for policy 0, policy_version 82816 (0.0120) [2024-06-15 12:35:25,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 169607168. Throughput: 0: 11184.5. Samples: 42457088. Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:35:26,949][1653645] Updated weights for policy 0, policy_version 82871 (0.0013) [2024-06-15 12:35:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43692.6, 300 sec: 43769.5). Total num frames: 169738240. Throughput: 0: 11025.0. Samples: 42524160. Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:35:31,080][1651596] Signal inference workers to stop experience collection... (4300 times) [2024-06-15 12:35:31,130][1653645] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-06-15 12:35:31,370][1651596] Signal inference workers to resume experience collection... (4300 times) [2024-06-15 12:35:31,386][1653645] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-06-15 12:35:32,182][1653645] Updated weights for policy 0, policy_version 82928 (0.0017) [2024-06-15 12:35:34,572][1653645] Updated weights for policy 0, policy_version 82960 (0.0041) [2024-06-15 12:35:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 170000384. Throughput: 0: 11082.0. Samples: 42559488. 
Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:35,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:35:36,772][1653645] Updated weights for policy 0, policy_version 83069 (0.0011) [2024-06-15 12:35:38,080][1653645] Updated weights for policy 0, policy_version 83130 (0.0013) [2024-06-15 12:35:40,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43691.3, 300 sec: 43986.9). Total num frames: 170262528. Throughput: 0: 10900.0. Samples: 42615808. Policy #0 lag: (min: 191.0, avg: 247.0, max: 399.0) [2024-06-15 12:35:40,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:35:44,144][1653645] Updated weights for policy 0, policy_version 83192 (0.0018) [2024-06-15 12:35:45,966][1648982] Fps is (10 sec: 39287.8, 60 sec: 43684.3, 300 sec: 43652.4). Total num frames: 170393600. Throughput: 0: 11091.2. Samples: 42690048. Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:35:45,967][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:35:47,108][1653645] Updated weights for policy 0, policy_version 83239 (0.0023) [2024-06-15 12:35:48,680][1653645] Updated weights for policy 0, policy_version 83325 (0.0014) [2024-06-15 12:35:49,863][1653645] Updated weights for policy 0, policy_version 83386 (0.0015) [2024-06-15 12:35:50,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45329.1, 300 sec: 44097.9). Total num frames: 170786816. Throughput: 0: 11207.2. Samples: 42720768. Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:35:50,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:35:55,699][1653645] Updated weights for policy 0, policy_version 83456 (0.0014) [2024-06-15 12:35:55,958][1648982] Fps is (10 sec: 52474.2, 60 sec: 44783.2, 300 sec: 43986.9). Total num frames: 170917888. Throughput: 0: 11325.1. Samples: 42800640. 
Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:35:55,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:35:58,399][1653645] Updated weights for policy 0, policy_version 83504 (0.0012) [2024-06-15 12:35:59,762][1653645] Updated weights for policy 0, policy_version 83568 (0.0013) [2024-06-15 12:36:00,970][1648982] Fps is (10 sec: 45818.7, 60 sec: 46957.7, 300 sec: 44207.2). Total num frames: 171245568. Throughput: 0: 11397.4. Samples: 42857472. Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:36:00,971][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:36:01,537][1653645] Updated weights for policy 0, policy_version 83648 (0.0014) [2024-06-15 12:36:05,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43691.2, 300 sec: 43764.8). Total num frames: 171311104. Throughput: 0: 11093.4. Samples: 42890752. Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:36:05,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:36:07,645][1653645] Updated weights for policy 0, policy_version 83712 (0.0012) [2024-06-15 12:36:10,444][1653645] Updated weights for policy 0, policy_version 83764 (0.0013) [2024-06-15 12:36:10,958][1648982] Fps is (10 sec: 36089.5, 60 sec: 46434.3, 300 sec: 43653.9). Total num frames: 171606016. Throughput: 0: 11252.6. Samples: 42963456. Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:36:10,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:36:10,985][1651596] Signal inference workers to stop experience collection... (4350 times) [2024-06-15 12:36:11,040][1653645] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-06-15 12:36:11,168][1651596] Signal inference workers to resume experience collection... 
(4350 times) [2024-06-15 12:36:11,169][1653645] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-06-15 12:36:12,044][1653645] Updated weights for policy 0, policy_version 83843 (0.0014) [2024-06-15 12:36:13,514][1653645] Updated weights for policy 0, policy_version 83901 (0.0012) [2024-06-15 12:36:15,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 171835392. Throughput: 0: 11184.4. Samples: 43027456. Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:36:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:36:19,237][1653645] Updated weights for policy 0, policy_version 83961 (0.0012) [2024-06-15 12:36:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 171999232. Throughput: 0: 11286.8. Samples: 43067392. Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:36:20,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 12:36:21,674][1653645] Updated weights for policy 0, policy_version 84016 (0.0014) [2024-06-15 12:36:23,155][1653645] Updated weights for policy 0, policy_version 84080 (0.0135) [2024-06-15 12:36:25,064][1653645] Updated weights for policy 0, policy_version 84150 (0.0021) [2024-06-15 12:36:25,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.3, 300 sec: 44098.0). Total num frames: 172359680. Throughput: 0: 11343.7. Samples: 43126272. Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:36:25,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:36:30,193][1653645] Updated weights for policy 0, policy_version 84192 (0.0012) [2024-06-15 12:36:30,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45875.3, 300 sec: 43986.9). Total num frames: 172490752. Throughput: 0: 11425.5. Samples: 43204096. 
Policy #0 lag: (min: 0.0, avg: 80.2, max: 256.0) [2024-06-15 12:36:30,960][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:36:32,290][1653645] Updated weights for policy 0, policy_version 84256 (0.0012) [2024-06-15 12:36:33,409][1653645] Updated weights for policy 0, policy_version 84307 (0.0012) [2024-06-15 12:36:35,193][1653645] Updated weights for policy 0, policy_version 84372 (0.0013) [2024-06-15 12:36:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 47513.6, 300 sec: 44320.1). Total num frames: 172851200. Throughput: 0: 11468.8. Samples: 43236864. Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:36:35,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:36:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 172883968. Throughput: 0: 11150.2. Samples: 43302400. Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:36:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:36:41,814][1653645] Updated weights for policy 0, policy_version 84434 (0.0013) [2024-06-15 12:36:43,453][1653645] Updated weights for policy 0, policy_version 84500 (0.0018) [2024-06-15 12:36:45,238][1653645] Updated weights for policy 0, policy_version 84579 (0.0038) [2024-06-15 12:36:45,872][1653645] Updated weights for policy 0, policy_version 84608 (0.0012) [2024-06-15 12:36:45,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 48066.8, 300 sec: 44209.0). Total num frames: 173277184. Throughput: 0: 11415.1. Samples: 43371008. Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:36:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:36:47,405][1653645] Updated weights for policy 0, policy_version 84672 (0.0013) [2024-06-15 12:36:50,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 173408256. Throughput: 0: 11320.9. Samples: 43400192. 
Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:36:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:36:54,183][1651596] Signal inference workers to stop experience collection... (4400 times) [2024-06-15 12:36:54,216][1653645] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-06-15 12:36:54,382][1651596] Signal inference workers to resume experience collection... (4400 times) [2024-06-15 12:36:54,383][1653645] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-06-15 12:36:54,544][1653645] Updated weights for policy 0, policy_version 84739 (0.0016) [2024-06-15 12:36:55,958][1648982] Fps is (10 sec: 39318.2, 60 sec: 45874.6, 300 sec: 44320.0). Total num frames: 173670400. Throughput: 0: 11457.3. Samples: 43479040. Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:36:55,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:36:56,011][1653645] Updated weights for policy 0, policy_version 84805 (0.0015) [2024-06-15 12:36:56,560][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000084832_173735936.pth... [2024-06-15 12:36:56,658][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000079616_163053568.pth [2024-06-15 12:36:57,726][1653645] Updated weights for policy 0, policy_version 84880 (0.0013) [2024-06-15 12:37:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44792.3, 300 sec: 44320.1). Total num frames: 173932544. Throughput: 0: 11332.3. Samples: 43537408. Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:37:00,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:37:05,599][1653645] Updated weights for policy 0, policy_version 84960 (0.0016) [2024-06-15 12:37:05,958][1648982] Fps is (10 sec: 36046.3, 60 sec: 45328.7, 300 sec: 44102.2). Total num frames: 174030848. Throughput: 0: 11411.8. Samples: 43580928. 
Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:37:05,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:37:06,999][1653645] Updated weights for policy 0, policy_version 85026 (0.0014) [2024-06-15 12:37:08,387][1653645] Updated weights for policy 0, policy_version 85079 (0.0141) [2024-06-15 12:37:09,392][1653645] Updated weights for policy 0, policy_version 85121 (0.0012) [2024-06-15 12:37:10,844][1653645] Updated weights for policy 0, policy_version 85176 (0.0013) [2024-06-15 12:37:10,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 46967.6, 300 sec: 44328.6). Total num frames: 174424064. Throughput: 0: 11434.7. Samples: 43640832. Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:37:10,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:37:15,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 174456832. Throughput: 0: 11309.5. Samples: 43713024. Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:37:15,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:37:18,662][1653645] Updated weights for policy 0, policy_version 85264 (0.0013) [2024-06-15 12:37:19,668][1653645] Updated weights for policy 0, policy_version 85309 (0.0014) [2024-06-15 12:37:20,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 46421.4, 300 sec: 44209.0). Total num frames: 174784512. Throughput: 0: 11207.1. Samples: 43741184. Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:37:20,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:37:22,118][1653645] Updated weights for policy 0, policy_version 85392 (0.0090) [2024-06-15 12:37:23,288][1653645] Updated weights for policy 0, policy_version 85439 (0.0013) [2024-06-15 12:37:25,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 174981120. Throughput: 0: 11002.3. Samples: 43797504. 
Policy #0 lag: (min: 111.0, avg: 196.1, max: 367.0) [2024-06-15 12:37:25,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:37:30,958][1648982] Fps is (10 sec: 32766.8, 60 sec: 43690.4, 300 sec: 43875.8). Total num frames: 175112192. Throughput: 0: 11059.1. Samples: 43868672. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:37:30,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:37:31,196][1653645] Updated weights for policy 0, policy_version 85514 (0.0132) [2024-06-15 12:37:32,358][1653645] Updated weights for policy 0, policy_version 85560 (0.0102) [2024-06-15 12:37:33,012][1651596] Signal inference workers to stop experience collection... (4450 times) [2024-06-15 12:37:33,066][1653645] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-06-15 12:37:33,218][1651596] Signal inference workers to resume experience collection... (4450 times) [2024-06-15 12:37:33,219][1653645] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-06-15 12:37:33,404][1653645] Updated weights for policy 0, policy_version 85604 (0.0026) [2024-06-15 12:37:35,261][1653645] Updated weights for policy 0, policy_version 85680 (0.0013) [2024-06-15 12:37:35,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 175505408. Throughput: 0: 11059.2. Samples: 43897856. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:37:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:37:40,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 175505408. Throughput: 0: 10888.7. Samples: 43969024. 
Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:37:40,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:37:41,995][1653645] Updated weights for policy 0, policy_version 85728 (0.0112) [2024-06-15 12:37:43,924][1653645] Updated weights for policy 0, policy_version 85815 (0.0014) [2024-06-15 12:37:45,939][1653645] Updated weights for policy 0, policy_version 85888 (0.0014) [2024-06-15 12:37:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 175898624. Throughput: 0: 10797.5. Samples: 44023296. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:37:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:37:48,004][1653645] Updated weights for policy 0, policy_version 85952 (0.0081) [2024-06-15 12:37:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 176029696. Throughput: 0: 10490.4. Samples: 44052992. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:37:50,958][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 12:37:55,958][1648982] Fps is (10 sec: 26213.6, 60 sec: 41506.4, 300 sec: 43765.2). Total num frames: 176160768. Throughput: 0: 10797.4. Samples: 44126720. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:37:55,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:37:56,043][1653645] Updated weights for policy 0, policy_version 86032 (0.0091) [2024-06-15 12:37:57,946][1653645] Updated weights for policy 0, policy_version 86097 (0.0122) [2024-06-15 12:37:58,923][1653645] Updated weights for policy 0, policy_version 86138 (0.0013) [2024-06-15 12:38:00,645][1653645] Updated weights for policy 0, policy_version 86196 (0.0067) [2024-06-15 12:38:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44432.4). Total num frames: 176553984. Throughput: 0: 10331.0. Samples: 44177920. 
Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:38:00,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:38:05,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 42052.5, 300 sec: 43764.7). Total num frames: 176553984. Throughput: 0: 10444.8. Samples: 44211200. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:38:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:38:07,029][1653645] Updated weights for policy 0, policy_version 86229 (0.0011) [2024-06-15 12:38:08,924][1653645] Updated weights for policy 0, policy_version 86304 (0.0014) [2024-06-15 12:38:10,908][1653645] Updated weights for policy 0, policy_version 86391 (0.0015) [2024-06-15 12:38:10,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 41505.9, 300 sec: 43875.8). Total num frames: 176914432. Throughput: 0: 10649.5. Samples: 44276736. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:38:10,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:38:12,882][1653645] Updated weights for policy 0, policy_version 86448 (0.0014) [2024-06-15 12:38:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 177078272. Throughput: 0: 10376.6. Samples: 44335616. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:38:15,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:38:19,553][1651596] Signal inference workers to stop experience collection... (4500 times) [2024-06-15 12:38:19,590][1653645] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-06-15 12:38:19,592][1653645] Updated weights for policy 0, policy_version 86498 (0.0014) [2024-06-15 12:38:19,843][1651596] Signal inference workers to resume experience collection... (4500 times) [2024-06-15 12:38:19,844][1653645] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-06-15 12:38:20,960][1648982] Fps is (10 sec: 32763.0, 60 sec: 40958.7, 300 sec: 43431.2). 
Total num frames: 177242112. Throughput: 0: 10637.8. Samples: 44376576. Policy #0 lag: (min: 15.0, avg: 64.6, max: 271.0) [2024-06-15 12:38:20,960][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:38:21,154][1653645] Updated weights for policy 0, policy_version 86560 (0.0011) [2024-06-15 12:38:22,714][1653645] Updated weights for policy 0, policy_version 86624 (0.0019) [2024-06-15 12:38:24,811][1653645] Updated weights for policy 0, policy_version 86680 (0.0013) [2024-06-15 12:38:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44328.0). Total num frames: 177602560. Throughput: 0: 10365.1. Samples: 44435456. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:38:25,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:38:30,958][1648982] Fps is (10 sec: 39328.1, 60 sec: 42052.3, 300 sec: 43543.7). Total num frames: 177635328. Throughput: 0: 10797.5. Samples: 44509184. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:38:30,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:38:31,643][1653645] Updated weights for policy 0, policy_version 86768 (0.0014) [2024-06-15 12:38:33,351][1653645] Updated weights for policy 0, policy_version 86832 (0.0012) [2024-06-15 12:38:35,258][1653645] Updated weights for policy 0, policy_version 86907 (0.0013) [2024-06-15 12:38:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 177995776. Throughput: 0: 10717.9. Samples: 44535296. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:38:35,958][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 12:38:37,942][1653645] Updated weights for policy 0, policy_version 86964 (0.0013) [2024-06-15 12:38:40,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 178126848. Throughput: 0: 10479.0. Samples: 44598272. 
Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:38:40,958][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 12:38:43,477][1653645] Updated weights for policy 0, policy_version 87030 (0.0012) [2024-06-15 12:38:44,644][1653645] Updated weights for policy 0, policy_version 87057 (0.0012) [2024-06-15 12:38:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 43542.6). Total num frames: 178388992. Throughput: 0: 10911.3. Samples: 44668928. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:38:45,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:38:46,320][1653645] Updated weights for policy 0, policy_version 87120 (0.0012) [2024-06-15 12:38:49,063][1653645] Updated weights for policy 0, policy_version 87200 (0.0015) [2024-06-15 12:38:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 178651136. Throughput: 0: 10877.2. Samples: 44700672. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:38:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:38:54,643][1653645] Updated weights for policy 0, policy_version 87264 (0.0017) [2024-06-15 12:38:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.9, 300 sec: 43653.6). Total num frames: 178782208. Throughput: 0: 11070.7. Samples: 44774912. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:38:55,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:38:56,244][1653645] Updated weights for policy 0, policy_version 87312 (0.0012) [2024-06-15 12:38:56,682][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000087328_178847744.pth... 
[2024-06-15 12:38:56,855][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000082160_168263680.pth [2024-06-15 12:38:58,749][1653645] Updated weights for policy 0, policy_version 87396 (0.0106) [2024-06-15 12:39:00,302][1653645] Updated weights for policy 0, policy_version 87426 (0.0012) [2024-06-15 12:39:00,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 179109888. Throughput: 0: 11127.5. Samples: 44836352. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:39:00,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:39:01,075][1651596] Signal inference workers to stop experience collection... (4550 times) [2024-06-15 12:39:01,118][1653645] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-06-15 12:39:01,309][1651596] Signal inference workers to resume experience collection... (4550 times) [2024-06-15 12:39:01,310][1653645] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-06-15 12:39:05,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43690.5, 300 sec: 43766.0). Total num frames: 179175424. Throughput: 0: 10866.1. Samples: 44865536. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:39:05,959][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 12:39:07,073][1653645] Updated weights for policy 0, policy_version 87522 (0.0013) [2024-06-15 12:39:09,109][1653645] Updated weights for policy 0, policy_version 87584 (0.0018) [2024-06-15 12:39:10,924][1653645] Updated weights for policy 0, policy_version 87648 (0.0051) [2024-06-15 12:39:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.7, 300 sec: 43764.7). Total num frames: 179503104. Throughput: 0: 10990.9. Samples: 44930048. 
Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:39:10,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:39:11,645][1653645] Updated weights for policy 0, policy_version 87678 (0.0009) [2024-06-15 12:39:13,980][1653645] Updated weights for policy 0, policy_version 87742 (0.0016) [2024-06-15 12:39:15,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 179699712. Throughput: 0: 10740.6. Samples: 44992512. Policy #0 lag: (min: 92.0, avg: 175.0, max: 331.0) [2024-06-15 12:39:15,959][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 12:39:19,388][1653645] Updated weights for policy 0, policy_version 87802 (0.0011) [2024-06-15 12:39:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44238.1, 300 sec: 43764.8). Total num frames: 179896320. Throughput: 0: 11059.2. Samples: 45032960. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:39:20,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:39:21,268][1653645] Updated weights for policy 0, policy_version 87856 (0.0012) [2024-06-15 12:39:22,887][1653645] Updated weights for policy 0, policy_version 87908 (0.0011) [2024-06-15 12:39:25,517][1653645] Updated weights for policy 0, policy_version 87968 (0.0017) [2024-06-15 12:39:25,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 43144.6, 300 sec: 44320.5). Total num frames: 180191232. Throughput: 0: 10888.6. Samples: 45088256. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:39:25,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:39:29,984][1653645] Updated weights for policy 0, policy_version 88017 (0.0013) [2024-06-15 12:39:30,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 44783.2, 300 sec: 43986.9). Total num frames: 180322304. Throughput: 0: 10945.4. Samples: 45161472. 
Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:39:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:39:32,559][1653645] Updated weights for policy 0, policy_version 88080 (0.0012) [2024-06-15 12:39:33,851][1653645] Updated weights for policy 0, policy_version 88128 (0.0052) [2024-06-15 12:39:35,290][1653645] Updated weights for policy 0, policy_version 88186 (0.0020) [2024-06-15 12:39:35,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 43987.0). Total num frames: 180617216. Throughput: 0: 10877.2. Samples: 45190144. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:39:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:39:37,842][1653645] Updated weights for policy 0, policy_version 88240 (0.0011) [2024-06-15 12:39:40,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 180748288. Throughput: 0: 10786.1. Samples: 45260288. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:39:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:39:42,850][1653645] Updated weights for policy 0, policy_version 88320 (0.0034) [2024-06-15 12:39:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 181010432. Throughput: 0: 10797.5. Samples: 45322240. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:39:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:39:46,116][1653645] Updated weights for policy 0, policy_version 88400 (0.0095) [2024-06-15 12:39:49,508][1653645] Updated weights for policy 0, policy_version 88449 (0.0019) [2024-06-15 12:39:49,842][1651596] Signal inference workers to stop experience collection... (4600 times) [2024-06-15 12:39:49,907][1653645] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-06-15 12:39:50,068][1651596] Signal inference workers to resume experience collection... 
(4600 times) [2024-06-15 12:39:50,069][1653645] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-06-15 12:39:50,595][1653645] Updated weights for policy 0, policy_version 88512 (0.0012) [2024-06-15 12:39:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 181272576. Throughput: 0: 10865.8. Samples: 45354496. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:39:50,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:39:54,437][1653645] Updated weights for policy 0, policy_version 88570 (0.0012) [2024-06-15 12:39:55,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 181436416. Throughput: 0: 10922.7. Samples: 45421568. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:39:55,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:39:57,066][1653645] Updated weights for policy 0, policy_version 88640 (0.0020) [2024-06-15 12:39:59,105][1653645] Updated weights for policy 0, policy_version 88695 (0.0015) [2024-06-15 12:40:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 43987.0). Total num frames: 181665792. Throughput: 0: 11082.0. Samples: 45491200. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:40:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 12:40:02,161][1653645] Updated weights for policy 0, policy_version 88752 (0.0012) [2024-06-15 12:40:04,638][1653645] Updated weights for policy 0, policy_version 88785 (0.0047) [2024-06-15 12:40:05,622][1653645] Updated weights for policy 0, policy_version 88830 (0.0017) [2024-06-15 12:40:05,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.3, 300 sec: 44433.7). Total num frames: 181927936. Throughput: 0: 11002.3. Samples: 45528064. 
Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:40:05,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 12:40:08,116][1653645] Updated weights for policy 0, policy_version 88890 (0.0013) [2024-06-15 12:40:10,504][1653645] Updated weights for policy 0, policy_version 88947 (0.0020) [2024-06-15 12:40:10,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 182190080. Throughput: 0: 11275.4. Samples: 45595648. Policy #0 lag: (min: 15.0, avg: 93.0, max: 271.0) [2024-06-15 12:40:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:40:12,765][1653645] Updated weights for policy 0, policy_version 88979 (0.0011) [2024-06-15 12:40:15,291][1653645] Updated weights for policy 0, policy_version 89040 (0.0020) [2024-06-15 12:40:15,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 182386688. Throughput: 0: 11229.8. Samples: 45666816. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:15,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 12:40:16,649][1653645] Updated weights for policy 0, policy_version 89088 (0.0089) [2024-06-15 12:40:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 182583296. Throughput: 0: 11309.5. Samples: 45699072. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 12:40:20,983][1653645] Updated weights for policy 0, policy_version 89157 (0.0014) [2024-06-15 12:40:22,180][1653645] Updated weights for policy 0, policy_version 89209 (0.0013) [2024-06-15 12:40:24,714][1653645] Updated weights for policy 0, policy_version 89265 (0.0089) [2024-06-15 12:40:25,959][1648982] Fps is (10 sec: 45876.0, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 182845440. Throughput: 0: 11320.9. Samples: 45769728. 
Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:25,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:40:27,689][1653645] Updated weights for policy 0, policy_version 89312 (0.0014) [2024-06-15 12:40:30,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.1, 300 sec: 44320.1). Total num frames: 183074816. Throughput: 0: 11332.3. Samples: 45832192. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:40:31,054][1653645] Updated weights for policy 0, policy_version 89402 (0.0117) [2024-06-15 12:40:33,332][1653645] Updated weights for policy 0, policy_version 89471 (0.0024) [2024-06-15 12:40:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 183238656. Throughput: 0: 11343.7. Samples: 45864960. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:40:37,329][1653645] Updated weights for policy 0, policy_version 89536 (0.0014) [2024-06-15 12:40:38,824][1651596] Signal inference workers to stop experience collection... (4650 times) [2024-06-15 12:40:38,855][1653645] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-06-15 12:40:39,053][1651596] Signal inference workers to resume experience collection... (4650 times) [2024-06-15 12:40:39,053][1653645] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-06-15 12:40:40,186][1653645] Updated weights for policy 0, policy_version 89597 (0.0016) [2024-06-15 12:40:40,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 44432.5). Total num frames: 183500800. Throughput: 0: 11423.3. Samples: 45935616. 
Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:40:42,550][1653645] Updated weights for policy 0, policy_version 89648 (0.0178) [2024-06-15 12:40:44,679][1653645] Updated weights for policy 0, policy_version 89712 (0.0012) [2024-06-15 12:40:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 183762944. Throughput: 0: 11286.8. Samples: 45999104. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:40:48,020][1653645] Updated weights for policy 0, policy_version 89762 (0.0018) [2024-06-15 12:40:50,841][1653645] Updated weights for policy 0, policy_version 89826 (0.0017) [2024-06-15 12:40:50,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 183959552. Throughput: 0: 11434.7. Samples: 46042624. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 12:40:53,162][1653645] Updated weights for policy 0, policy_version 89911 (0.0013) [2024-06-15 12:40:55,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 45875.1, 300 sec: 43877.6). Total num frames: 184188928. Throughput: 0: 11252.5. Samples: 46102016. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:40:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:40:56,299][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000089952_184221696.pth... 
[2024-06-15 12:40:56,298][1653645] Updated weights for policy 0, policy_version 89952 (0.0050) [2024-06-15 12:40:56,422][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000084832_173735936.pth [2024-06-15 12:40:59,231][1653645] Updated weights for policy 0, policy_version 89987 (0.0011) [2024-06-15 12:41:00,224][1653645] Updated weights for policy 0, policy_version 90042 (0.0024) [2024-06-15 12:41:00,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 184418304. Throughput: 0: 11503.0. Samples: 46184448. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:41:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:41:01,678][1653645] Updated weights for policy 0, policy_version 90080 (0.0159) [2024-06-15 12:41:03,405][1653645] Updated weights for policy 0, policy_version 90132 (0.0014) [2024-06-15 12:41:04,465][1653645] Updated weights for policy 0, policy_version 90176 (0.0012) [2024-06-15 12:41:05,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 45875.4, 300 sec: 44320.1). Total num frames: 184680448. Throughput: 0: 11434.7. Samples: 46213632. Policy #0 lag: (min: 8.0, avg: 116.0, max: 264.0) [2024-06-15 12:41:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:41:08,288][1653645] Updated weights for policy 0, policy_version 90233 (0.0014) [2024-06-15 12:41:10,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44236.6, 300 sec: 44097.9). Total num frames: 184844288. Throughput: 0: 11366.3. Samples: 46281216. 
Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:10,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 12:41:11,334][1653645] Updated weights for policy 0, policy_version 90275 (0.0011) [2024-06-15 12:41:12,338][1653645] Updated weights for policy 0, policy_version 90308 (0.0012) [2024-06-15 12:41:13,454][1653645] Updated weights for policy 0, policy_version 90364 (0.0012) [2024-06-15 12:41:15,173][1653645] Updated weights for policy 0, policy_version 90426 (0.0013) [2024-06-15 12:41:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 46967.7, 300 sec: 44764.4). Total num frames: 185204736. Throughput: 0: 11525.7. Samples: 46350848. Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:41:19,162][1653645] Updated weights for policy 0, policy_version 90485 (0.0012) [2024-06-15 12:41:20,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 185335808. Throughput: 0: 11582.6. Samples: 46386176. Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:41:21,972][1653645] Updated weights for policy 0, policy_version 90515 (0.0012) [2024-06-15 12:41:23,494][1651596] Signal inference workers to stop experience collection... (4700 times) [2024-06-15 12:41:23,532][1653645] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-06-15 12:41:23,788][1651596] Signal inference workers to resume experience collection... (4700 times) [2024-06-15 12:41:23,789][1653645] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-06-15 12:41:23,892][1653645] Updated weights for policy 0, policy_version 90592 (0.0127) [2024-06-15 12:41:25,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 46421.3, 300 sec: 44542.2). Total num frames: 185630720. Throughput: 0: 11616.7. Samples: 46458368. 
Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:41:26,554][1653645] Updated weights for policy 0, policy_version 90663 (0.0174) [2024-06-15 12:41:30,286][1653645] Updated weights for policy 0, policy_version 90721 (0.0013) [2024-06-15 12:41:30,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 185827328. Throughput: 0: 11685.0. Samples: 46524928. Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:41:34,242][1653645] Updated weights for policy 0, policy_version 90786 (0.0012) [2024-06-15 12:41:35,970][1648982] Fps is (10 sec: 42546.0, 60 sec: 46957.8, 300 sec: 44651.5). Total num frames: 186056704. Throughput: 0: 11613.5. Samples: 46565376. Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:35,971][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:41:36,321][1653645] Updated weights for policy 0, policy_version 90874 (0.0098) [2024-06-15 12:41:38,284][1653645] Updated weights for policy 0, policy_version 90917 (0.0011) [2024-06-15 12:41:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.3, 300 sec: 43986.9). Total num frames: 186253312. Throughput: 0: 11594.0. Samples: 46623744. Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:41:42,320][1653645] Updated weights for policy 0, policy_version 90996 (0.0017) [2024-06-15 12:41:45,958][1648982] Fps is (10 sec: 32807.9, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 186384384. Throughput: 0: 11366.4. Samples: 46695936. 
Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:45,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:41:46,460][1653645] Updated weights for policy 0, policy_version 91040 (0.0139) [2024-06-15 12:41:48,503][1653645] Updated weights for policy 0, policy_version 91108 (0.0015) [2024-06-15 12:41:50,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 46421.3, 300 sec: 44320.2). Total num frames: 186744832. Throughput: 0: 11229.9. Samples: 46718976. Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 12:41:51,008][1653645] Updated weights for policy 0, policy_version 91200 (0.0016) [2024-06-15 12:41:54,790][1653645] Updated weights for policy 0, policy_version 91256 (0.0017) [2024-06-15 12:41:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45329.0, 300 sec: 43986.8). Total num frames: 186908672. Throughput: 0: 11218.5. Samples: 46786048. Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:41:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:41:59,801][1653645] Updated weights for policy 0, policy_version 91328 (0.0013) [2024-06-15 12:42:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44783.0, 300 sec: 44320.2). Total num frames: 187105280. Throughput: 0: 11116.1. Samples: 46851072. Policy #0 lag: (min: 15.0, avg: 141.2, max: 271.0) [2024-06-15 12:42:00,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:42:01,305][1653645] Updated weights for policy 0, policy_version 91392 (0.0014) [2024-06-15 12:42:03,110][1653645] Updated weights for policy 0, policy_version 91449 (0.0013) [2024-06-15 12:42:05,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 187301888. Throughput: 0: 10934.1. Samples: 46878208. 
Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:42:06,389][1653645] Updated weights for policy 0, policy_version 91488 (0.0014) [2024-06-15 12:42:10,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.9, 300 sec: 44098.0). Total num frames: 187465728. Throughput: 0: 11036.5. Samples: 46955008. Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:10,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:42:11,512][1653645] Updated weights for policy 0, policy_version 91568 (0.0107) [2024-06-15 12:42:11,647][1651596] Signal inference workers to stop experience collection... (4750 times) [2024-06-15 12:42:11,675][1653645] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-06-15 12:42:11,854][1651596] Signal inference workers to resume experience collection... (4750 times) [2024-06-15 12:42:11,855][1653645] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-06-15 12:42:13,529][1653645] Updated weights for policy 0, policy_version 91647 (0.0121) [2024-06-15 12:42:15,532][1653645] Updated weights for policy 0, policy_version 91703 (0.0014) [2024-06-15 12:42:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 187826176. Throughput: 0: 10649.6. Samples: 47004160. Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:42:18,886][1653645] Updated weights for policy 0, policy_version 91766 (0.0014) [2024-06-15 12:42:20,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 187957248. Throughput: 0: 10584.3. Samples: 47041536. 
Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:20,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:42:23,381][1653645] Updated weights for policy 0, policy_version 91824 (0.0013) [2024-06-15 12:42:25,013][1653645] Updated weights for policy 0, policy_version 91888 (0.0012) [2024-06-15 12:42:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 188219392. Throughput: 0: 10808.9. Samples: 47110144. Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:42:26,565][1653645] Updated weights for policy 0, policy_version 91936 (0.0010) [2024-06-15 12:42:30,026][1653645] Updated weights for policy 0, policy_version 91986 (0.0017) [2024-06-15 12:42:30,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 188448768. Throughput: 0: 10649.7. Samples: 47175168. Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:30,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:42:30,997][1653645] Updated weights for policy 0, policy_version 92031 (0.0012) [2024-06-15 12:42:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43153.5, 300 sec: 44542.3). Total num frames: 188645376. Throughput: 0: 11082.0. Samples: 47217664. Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:42:36,284][1653645] Updated weights for policy 0, policy_version 92130 (0.0013) [2024-06-15 12:42:38,970][1653645] Updated weights for policy 0, policy_version 92194 (0.0013) [2024-06-15 12:42:40,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 188874752. Throughput: 0: 10786.1. Samples: 47271424. 
Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:40,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:42:42,448][1653645] Updated weights for policy 0, policy_version 92242 (0.0013) [2024-06-15 12:42:45,886][1653645] Updated weights for policy 0, policy_version 92289 (0.0031) [2024-06-15 12:42:45,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 189005824. Throughput: 0: 10968.2. Samples: 47344640. Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:45,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:42:48,175][1653645] Updated weights for policy 0, policy_version 92384 (0.0015) [2024-06-15 12:42:49,000][1653645] Updated weights for policy 0, policy_version 92416 (0.0011) [2024-06-15 12:42:50,832][1653645] Updated weights for policy 0, policy_version 92480 (0.0013) [2024-06-15 12:42:50,958][1648982] Fps is (10 sec: 52430.8, 60 sec: 44236.8, 300 sec: 44875.6). Total num frames: 189399040. Throughput: 0: 10979.6. Samples: 47372288. Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:50,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:42:55,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.7, 300 sec: 43986.8). Total num frames: 189530112. Throughput: 0: 10774.7. Samples: 47439872. Policy #0 lag: (min: 31.0, avg: 176.5, max: 287.0) [2024-06-15 12:42:55,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:42:55,974][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000092544_189530112.pth... [2024-06-15 12:42:56,113][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000087328_178847744.pth [2024-06-15 12:42:58,214][1653645] Updated weights for policy 0, policy_version 92560 (0.0013) [2024-06-15 12:42:58,822][1651596] Signal inference workers to stop experience collection... 
(4800 times) [2024-06-15 12:42:58,865][1653645] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-06-15 12:42:59,065][1651596] Signal inference workers to resume experience collection... (4800 times) [2024-06-15 12:42:59,067][1653645] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-06-15 12:42:59,684][1653645] Updated weights for policy 0, policy_version 92612 (0.0014) [2024-06-15 12:43:00,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 189759488. Throughput: 0: 11081.9. Samples: 47502848. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:43:01,865][1653645] Updated weights for policy 0, policy_version 92674 (0.0012) [2024-06-15 12:43:03,159][1653645] Updated weights for policy 0, policy_version 92736 (0.0023) [2024-06-15 12:43:05,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 44782.9, 300 sec: 44320.2). Total num frames: 189988864. Throughput: 0: 11036.4. Samples: 47538176. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:05,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:43:06,300][1653645] Updated weights for policy 0, policy_version 92799 (0.0014) [2024-06-15 12:43:10,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 190152704. Throughput: 0: 11104.7. Samples: 47609856. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:10,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:43:11,014][1653645] Updated weights for policy 0, policy_version 92861 (0.0012) [2024-06-15 12:43:12,753][1653645] Updated weights for policy 0, policy_version 92916 (0.0014) [2024-06-15 12:43:13,826][1653645] Updated weights for policy 0, policy_version 92960 (0.0011) [2024-06-15 12:43:15,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.5, 300 sec: 44764.7). Total num frames: 190447616. 
Throughput: 0: 11116.1. Samples: 47675392. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:15,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:43:17,375][1653645] Updated weights for policy 0, policy_version 92995 (0.0012) [2024-06-15 12:43:20,340][1653645] Updated weights for policy 0, policy_version 93060 (0.0042) [2024-06-15 12:43:20,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 190644224. Throughput: 0: 10934.1. Samples: 47709696. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:43:22,907][1653645] Updated weights for policy 0, policy_version 93121 (0.0015) [2024-06-15 12:43:25,241][1653645] Updated weights for policy 0, policy_version 93188 (0.0013) [2024-06-15 12:43:25,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 190906368. Throughput: 0: 11241.4. Samples: 47777280. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:43:26,510][1653645] Updated weights for policy 0, policy_version 93245 (0.0010) [2024-06-15 12:43:29,695][1653645] Updated weights for policy 0, policy_version 93302 (0.0112) [2024-06-15 12:43:30,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 191102976. Throughput: 0: 11150.2. Samples: 47846400. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:30,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:43:32,481][1653645] Updated weights for policy 0, policy_version 93347 (0.0011) [2024-06-15 12:43:35,202][1653645] Updated weights for policy 0, policy_version 93408 (0.0014) [2024-06-15 12:43:35,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 191365120. Throughput: 0: 11252.6. Samples: 47878656. 
Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:43:37,177][1653645] Updated weights for policy 0, policy_version 93472 (0.0011) [2024-06-15 12:43:37,961][1653645] Updated weights for policy 0, policy_version 93504 (0.0011) [2024-06-15 12:43:40,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 191496192. Throughput: 0: 11275.4. Samples: 47947264. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 12:43:41,986][1653645] Updated weights for policy 0, policy_version 93568 (0.0012) [2024-06-15 12:43:44,583][1653645] Updated weights for policy 0, policy_version 93622 (0.0015) [2024-06-15 12:43:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 191758336. Throughput: 0: 11332.3. Samples: 48012800. Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:43:46,745][1651596] Signal inference workers to stop experience collection... (4850 times) [2024-06-15 12:43:46,848][1653645] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-06-15 12:43:47,010][1651596] Signal inference workers to resume experience collection... (4850 times) [2024-06-15 12:43:47,014][1653645] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-06-15 12:43:47,833][1653645] Updated weights for policy 0, policy_version 93684 (0.0017) [2024-06-15 12:43:48,973][1653645] Updated weights for policy 0, policy_version 93732 (0.0012) [2024-06-15 12:43:50,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.4, 300 sec: 44875.5). Total num frames: 192020480. Throughput: 0: 11320.8. Samples: 48047616. 
Policy #0 lag: (min: 31.0, avg: 108.9, max: 287.0) [2024-06-15 12:43:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:43:52,765][1653645] Updated weights for policy 0, policy_version 93763 (0.0031) [2024-06-15 12:43:54,213][1653645] Updated weights for policy 0, policy_version 93824 (0.0015) [2024-06-15 12:43:55,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 45329.1, 300 sec: 44542.2). Total num frames: 192249856. Throughput: 0: 11195.7. Samples: 48113664. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:43:55,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 12:43:58,613][1653645] Updated weights for policy 0, policy_version 93890 (0.0071) [2024-06-15 12:44:00,491][1653645] Updated weights for policy 0, policy_version 93968 (0.0104) [2024-06-15 12:44:00,958][1648982] Fps is (10 sec: 45876.7, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 192479232. Throughput: 0: 11184.4. Samples: 48178688. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:44:05,003][1653645] Updated weights for policy 0, policy_version 94018 (0.0019) [2024-06-15 12:44:05,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 192610304. Throughput: 0: 11127.4. Samples: 48210432. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:05,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 12:44:06,469][1653645] Updated weights for policy 0, policy_version 94079 (0.0012) [2024-06-15 12:44:08,232][1653645] Updated weights for policy 0, policy_version 94141 (0.0013) [2024-06-15 12:44:10,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 192806912. Throughput: 0: 11013.7. Samples: 48272896. 
Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:10,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 12:44:10,961][1651596] Saving new best policy, reward=37.520! [2024-06-15 12:44:12,832][1653645] Updated weights for policy 0, policy_version 94224 (0.0013) [2024-06-15 12:44:15,960][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.8, 300 sec: 44653.4). Total num frames: 193069056. Throughput: 0: 10979.5. Samples: 48340480. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:15,962][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:44:16,923][1653645] Updated weights for policy 0, policy_version 94274 (0.0053) [2024-06-15 12:44:18,300][1653645] Updated weights for policy 0, policy_version 94331 (0.0013) [2024-06-15 12:44:19,497][1653645] Updated weights for policy 0, policy_version 94371 (0.0026) [2024-06-15 12:44:20,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 193331200. Throughput: 0: 11002.3. Samples: 48373760. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:20,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 12:44:22,850][1653645] Updated weights for policy 0, policy_version 94432 (0.0012) [2024-06-15 12:44:24,674][1653645] Updated weights for policy 0, policy_version 94524 (0.0012) [2024-06-15 12:44:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44782.8, 300 sec: 44986.6). Total num frames: 193593344. Throughput: 0: 11173.0. Samples: 48450048. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:44:29,275][1653645] Updated weights for policy 0, policy_version 94564 (0.0053) [2024-06-15 12:44:29,663][1651596] Signal inference workers to stop experience collection... 
(4900 times) [2024-06-15 12:44:29,813][1653645] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-06-15 12:44:29,957][1651596] Signal inference workers to resume experience collection... (4900 times) [2024-06-15 12:44:29,958][1653645] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-06-15 12:44:30,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.8, 300 sec: 44653.3). Total num frames: 193789952. Throughput: 0: 11218.5. Samples: 48517632. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:30,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:44:31,307][1653645] Updated weights for policy 0, policy_version 94647 (0.0013) [2024-06-15 12:44:34,951][1653645] Updated weights for policy 0, policy_version 94720 (0.0012) [2024-06-15 12:44:35,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 194084864. Throughput: 0: 11332.3. Samples: 48557568. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:44:39,835][1653645] Updated weights for policy 0, policy_version 94785 (0.0013) [2024-06-15 12:44:40,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 194215936. Throughput: 0: 11320.9. Samples: 48623104. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:40,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 12:44:41,627][1653645] Updated weights for policy 0, policy_version 94864 (0.0122) [2024-06-15 12:44:42,635][1653645] Updated weights for policy 0, policy_version 94912 (0.0013) [2024-06-15 12:44:45,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 194478080. Throughput: 0: 11411.9. Samples: 48692224. 
Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:44:46,440][1653645] Updated weights for policy 0, policy_version 94983 (0.0012) [2024-06-15 12:44:50,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.9, 300 sec: 44764.4). Total num frames: 194641920. Throughput: 0: 11320.9. Samples: 48719872. Policy #0 lag: (min: 31.0, avg: 152.1, max: 287.0) [2024-06-15 12:44:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 12:44:51,754][1653645] Updated weights for policy 0, policy_version 95042 (0.0020) [2024-06-15 12:44:53,428][1653645] Updated weights for policy 0, policy_version 95120 (0.0011) [2024-06-15 12:44:54,590][1653645] Updated weights for policy 0, policy_version 95168 (0.0012) [2024-06-15 12:44:55,958][1648982] Fps is (10 sec: 42596.5, 60 sec: 44236.7, 300 sec: 44875.4). Total num frames: 194904064. Throughput: 0: 11445.9. Samples: 48787968. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:44:55,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:44:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000095168_194904064.pth... [2024-06-15 12:44:56,228][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000089952_184221696.pth [2024-06-15 12:44:57,212][1653645] Updated weights for policy 0, policy_version 95228 (0.0013) [2024-06-15 12:44:59,426][1653645] Updated weights for policy 0, policy_version 95291 (0.0013) [2024-06-15 12:45:00,966][1648982] Fps is (10 sec: 52383.8, 60 sec: 44776.5, 300 sec: 44874.2). Total num frames: 195166208. Throughput: 0: 11534.9. Samples: 48859648. 
Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:00,967][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:45:03,668][1653645] Updated weights for policy 0, policy_version 95344 (0.0012) [2024-06-15 12:45:05,434][1653645] Updated weights for policy 0, policy_version 95419 (0.0020) [2024-06-15 12:45:05,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 46967.6, 300 sec: 44875.5). Total num frames: 195428352. Throughput: 0: 11571.2. Samples: 48894464. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:05,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 12:45:08,202][1653645] Updated weights for policy 0, policy_version 95472 (0.0013) [2024-06-15 12:45:10,960][1648982] Fps is (10 sec: 42635.0, 60 sec: 46421.2, 300 sec: 44764.4). Total num frames: 195592192. Throughput: 0: 11377.8. Samples: 48962048. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:10,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:45:11,584][1651596] Signal inference workers to stop experience collection... (4950 times) [2024-06-15 12:45:11,648][1653645] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-06-15 12:45:11,652][1653645] Updated weights for policy 0, policy_version 95522 (0.0015) [2024-06-15 12:45:12,010][1651596] Signal inference workers to resume experience collection... (4950 times) [2024-06-15 12:45:12,018][1653645] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-06-15 12:45:14,951][1653645] Updated weights for policy 0, policy_version 95584 (0.0015) [2024-06-15 12:45:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 195821568. Throughput: 0: 11264.0. Samples: 49024512. 
Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:45:17,312][1653645] Updated weights for policy 0, policy_version 95671 (0.0012) [2024-06-15 12:45:20,592][1653645] Updated weights for policy 0, policy_version 95743 (0.0012) [2024-06-15 12:45:20,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 196083712. Throughput: 0: 11036.5. Samples: 49054208. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:45:24,345][1653645] Updated weights for policy 0, policy_version 95798 (0.0098) [2024-06-15 12:45:25,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 43690.4, 300 sec: 44542.2). Total num frames: 196214784. Throughput: 0: 11047.8. Samples: 49120256. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:25,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 12:45:26,871][1653645] Updated weights for policy 0, policy_version 95840 (0.0013) [2024-06-15 12:45:29,163][1653645] Updated weights for policy 0, policy_version 95908 (0.0025) [2024-06-15 12:45:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 196476928. Throughput: 0: 11013.7. Samples: 49187840. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:45:31,871][1653645] Updated weights for policy 0, policy_version 95954 (0.0014) [2024-06-15 12:45:34,365][1653645] Updated weights for policy 0, policy_version 96001 (0.0010) [2024-06-15 12:45:35,814][1653645] Updated weights for policy 0, policy_version 96059 (0.0107) [2024-06-15 12:45:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44236.6, 300 sec: 44875.5). Total num frames: 196739072. Throughput: 0: 11172.9. Samples: 49222656. 
Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:35,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:45:39,305][1653645] Updated weights for policy 0, policy_version 96118 (0.0011) [2024-06-15 12:45:40,934][1653645] Updated weights for policy 0, policy_version 96162 (0.0011) [2024-06-15 12:45:40,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45329.1, 300 sec: 44653.3). Total num frames: 196935680. Throughput: 0: 11161.7. Samples: 49290240. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:45:43,138][1653645] Updated weights for policy 0, policy_version 96213 (0.0018) [2024-06-15 12:45:45,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 197132288. Throughput: 0: 11232.0. Samples: 49364992. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:45,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:45:46,628][1653645] Updated weights for policy 0, policy_version 96273 (0.0018) [2024-06-15 12:45:49,582][1653645] Updated weights for policy 0, policy_version 96336 (0.0013) [2024-06-15 12:45:50,575][1653645] Updated weights for policy 0, policy_version 96379 (0.0014) [2024-06-15 12:45:50,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45875.3, 300 sec: 44764.5). Total num frames: 197394432. Throughput: 0: 11116.1. Samples: 49394688. Policy #0 lag: (min: 49.0, avg: 121.9, max: 289.0) [2024-06-15 12:45:50,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:45:52,165][1653645] Updated weights for policy 0, policy_version 96448 (0.0011) [2024-06-15 12:45:55,970][1648982] Fps is (10 sec: 52364.8, 60 sec: 45866.1, 300 sec: 44873.6). Total num frames: 197656576. Throughput: 0: 11113.1. Samples: 49462272. 
Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:45:55,971][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:45:57,730][1653645] Updated weights for policy 0, policy_version 96513 (0.0017) [2024-06-15 12:45:59,167][1653645] Updated weights for policy 0, policy_version 96573 (0.0013) [2024-06-15 12:46:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43697.0, 300 sec: 44431.2). Total num frames: 197787648. Throughput: 0: 11218.5. Samples: 49529344. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:46:01,234][1651596] Signal inference workers to stop experience collection... (5000 times) [2024-06-15 12:46:01,349][1653645] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-06-15 12:46:01,417][1651596] Signal inference workers to resume experience collection... (5000 times) [2024-06-15 12:46:01,418][1653645] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-06-15 12:46:02,479][1653645] Updated weights for policy 0, policy_version 96632 (0.0026) [2024-06-15 12:46:03,921][1653645] Updated weights for policy 0, policy_version 96699 (0.0013) [2024-06-15 12:46:05,958][1648982] Fps is (10 sec: 39370.2, 60 sec: 43690.7, 300 sec: 44764.5). Total num frames: 198049792. Throughput: 0: 11241.2. Samples: 49560064. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:46:07,519][1653645] Updated weights for policy 0, policy_version 96758 (0.0013) [2024-06-15 12:46:10,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 44236.6, 300 sec: 44209.0). Total num frames: 198246400. Throughput: 0: 11298.1. Samples: 49628672. 
Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 12:46:11,342][1653645] Updated weights for policy 0, policy_version 96827 (0.0015) [2024-06-15 12:46:14,537][1653645] Updated weights for policy 0, policy_version 96893 (0.0032) [2024-06-15 12:46:15,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 198541312. Throughput: 0: 11150.2. Samples: 49689600. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 12:46:18,955][1653645] Updated weights for policy 0, policy_version 96978 (0.0024) [2024-06-15 12:46:20,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 198705152. Throughput: 0: 11150.3. Samples: 49724416. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:20,959][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:46:21,525][1653645] Updated weights for policy 0, policy_version 97025 (0.0012) [2024-06-15 12:46:25,136][1653645] Updated weights for policy 0, policy_version 97110 (0.0014) [2024-06-15 12:46:25,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 45875.4, 300 sec: 44542.3). Total num frames: 198967296. Throughput: 0: 11207.1. Samples: 49794560. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:46:27,177][1653645] Updated weights for policy 0, policy_version 97169 (0.0019) [2024-06-15 12:46:28,156][1653645] Updated weights for policy 0, policy_version 97213 (0.0014) [2024-06-15 12:46:30,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 44783.0, 300 sec: 44433.1). Total num frames: 199163904. Throughput: 0: 10991.0. Samples: 49859584. 
Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:46:31,428][1653645] Updated weights for policy 0, policy_version 97272 (0.0014) [2024-06-15 12:46:34,570][1653645] Updated weights for policy 0, policy_version 97344 (0.0012) [2024-06-15 12:46:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 199360512. Throughput: 0: 11047.8. Samples: 49891840. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 12:46:37,451][1653645] Updated weights for policy 0, policy_version 97403 (0.0012) [2024-06-15 12:46:40,251][1653645] Updated weights for policy 0, policy_version 97468 (0.0171) [2024-06-15 12:46:40,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 199622656. Throughput: 0: 11062.2. Samples: 49959936. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:46:43,117][1653645] Updated weights for policy 0, policy_version 97536 (0.0012) [2024-06-15 12:46:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 199819264. Throughput: 0: 10990.9. Samples: 50023936. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:46:48,768][1651596] Signal inference workers to stop experience collection... (5050 times) [2024-06-15 12:46:48,884][1653645] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-06-15 12:46:49,028][1651596] Signal inference workers to resume experience collection... 
(5050 times) [2024-06-15 12:46:49,029][1653645] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-06-15 12:46:49,031][1653645] Updated weights for policy 0, policy_version 97632 (0.0155) [2024-06-15 12:46:50,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 200015872. Throughput: 0: 11161.6. Samples: 50062336. Policy #0 lag: (min: 17.0, avg: 139.8, max: 261.0) [2024-06-15 12:46:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:46:51,249][1653645] Updated weights for policy 0, policy_version 97670 (0.0012) [2024-06-15 12:46:52,619][1653645] Updated weights for policy 0, policy_version 97727 (0.0013) [2024-06-15 12:46:55,074][1653645] Updated weights for policy 0, policy_version 97781 (0.0012) [2024-06-15 12:46:55,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43699.5, 300 sec: 44653.3). Total num frames: 200278016. Throughput: 0: 10968.2. Samples: 50122240. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:46:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:46:55,999][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000097792_200278016.pth... [2024-06-15 12:46:56,051][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000092544_189530112.pth [2024-06-15 12:46:58,437][1653645] Updated weights for policy 0, policy_version 97846 (0.0014) [2024-06-15 12:47:00,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 200474624. Throughput: 0: 11184.3. Samples: 50192896. 
Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:47:01,345][1653645] Updated weights for policy 0, policy_version 97919 (0.0117) [2024-06-15 12:47:04,552][1653645] Updated weights for policy 0, policy_version 97976 (0.0088) [2024-06-15 12:47:05,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 200704000. Throughput: 0: 11116.1. Samples: 50224640. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:47:06,796][1653645] Updated weights for policy 0, policy_version 98043 (0.0033) [2024-06-15 12:47:10,915][1653645] Updated weights for policy 0, policy_version 98112 (0.0126) [2024-06-15 12:47:10,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44783.2, 300 sec: 44431.2). Total num frames: 200933376. Throughput: 0: 10934.0. Samples: 50286592. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:47:13,531][1653645] Updated weights for policy 0, policy_version 98176 (0.0013) [2024-06-15 12:47:15,961][1648982] Fps is (10 sec: 39310.4, 60 sec: 42596.3, 300 sec: 44541.8). Total num frames: 201097216. Throughput: 0: 10967.5. Samples: 50353152. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:15,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:47:17,066][1653645] Updated weights for policy 0, policy_version 98240 (0.0047) [2024-06-15 12:47:20,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 201326592. Throughput: 0: 10865.7. Samples: 50380800. 
Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:20,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:47:22,306][1653645] Updated weights for policy 0, policy_version 98320 (0.0076) [2024-06-15 12:47:23,577][1653645] Updated weights for policy 0, policy_version 98366 (0.0025) [2024-06-15 12:47:25,958][1648982] Fps is (10 sec: 49165.7, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 201588736. Throughput: 0: 10877.2. Samples: 50449408. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:25,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:47:28,632][1653645] Updated weights for policy 0, policy_version 98448 (0.0039) [2024-06-15 12:47:30,532][1653645] Updated weights for policy 0, policy_version 98513 (0.0020) [2024-06-15 12:47:30,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.5, 300 sec: 44542.2). Total num frames: 201785344. Throughput: 0: 10672.3. Samples: 50504192. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:30,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:47:35,187][1653645] Updated weights for policy 0, policy_version 98592 (0.0013) [2024-06-15 12:47:35,962][1648982] Fps is (10 sec: 39303.5, 60 sec: 43687.2, 300 sec: 44430.5). Total num frames: 201981952. Throughput: 0: 10546.1. Samples: 50536960. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:35,963][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:47:37,027][1653645] Updated weights for policy 0, policy_version 98643 (0.0013) [2024-06-15 12:47:37,353][1651596] Signal inference workers to stop experience collection... (5100 times) [2024-06-15 12:47:37,449][1653645] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-06-15 12:47:37,640][1651596] Signal inference workers to resume experience collection... 
(5100 times) [2024-06-15 12:47:37,644][1653645] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-06-15 12:47:39,627][1653645] Updated weights for policy 0, policy_version 98693 (0.0103) [2024-06-15 12:47:40,784][1653645] Updated weights for policy 0, policy_version 98751 (0.0115) [2024-06-15 12:47:40,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43690.5, 300 sec: 44875.4). Total num frames: 202244096. Throughput: 0: 10854.4. Samples: 50610688. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:40,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:47:45,958][1648982] Fps is (10 sec: 39340.1, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 202375168. Throughput: 0: 10672.4. Samples: 50673152. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:45,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:47:46,617][1653645] Updated weights for policy 0, policy_version 98832 (0.0013) [2024-06-15 12:47:49,056][1653645] Updated weights for policy 0, policy_version 98896 (0.0014) [2024-06-15 12:47:50,958][1648982] Fps is (10 sec: 39323.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 202637312. Throughput: 0: 10752.0. Samples: 50708480. Policy #0 lag: (min: 15.0, avg: 125.7, max: 271.0) [2024-06-15 12:47:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 12:47:52,076][1653645] Updated weights for policy 0, policy_version 98951 (0.0029) [2024-06-15 12:47:53,853][1653645] Updated weights for policy 0, policy_version 99024 (0.0078) [2024-06-15 12:47:55,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 202899456. Throughput: 0: 10672.4. Samples: 50766848. 
Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:47:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:47:58,822][1653645] Updated weights for policy 0, policy_version 99088 (0.0138) [2024-06-15 12:47:59,924][1653645] Updated weights for policy 0, policy_version 99136 (0.0011) [2024-06-15 12:48:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 203030528. Throughput: 0: 10832.3. Samples: 50840576. Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:48:02,122][1653645] Updated weights for policy 0, policy_version 99191 (0.0014) [2024-06-15 12:48:04,928][1653645] Updated weights for policy 0, policy_version 99248 (0.0011) [2024-06-15 12:48:05,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 203358208. Throughput: 0: 10979.6. Samples: 50874880. Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:05,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:48:06,562][1653645] Updated weights for policy 0, policy_version 99326 (0.0015) [2024-06-15 12:48:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 44209.1). Total num frames: 203489280. Throughput: 0: 10899.9. Samples: 50939904. Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:10,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:48:11,575][1653645] Updated weights for policy 0, policy_version 99383 (0.0013) [2024-06-15 12:48:12,858][1653645] Updated weights for policy 0, policy_version 99413 (0.0012) [2024-06-15 12:48:15,960][1648982] Fps is (10 sec: 39321.4, 60 sec: 44238.9, 300 sec: 44431.2). Total num frames: 203751424. Throughput: 0: 11252.6. Samples: 51010560. 
Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:15,961][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 12:48:15,968][1653645] Updated weights for policy 0, policy_version 99494 (0.0014) [2024-06-15 12:48:16,957][1653645] Updated weights for policy 0, policy_version 99552 (0.0013) [2024-06-15 12:48:20,972][1648982] Fps is (10 sec: 45811.7, 60 sec: 43680.8, 300 sec: 44206.9). Total num frames: 203948032. Throughput: 0: 11250.3. Samples: 51043328. Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:20,972][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:48:22,828][1653645] Updated weights for policy 0, policy_version 99621 (0.0018) [2024-06-15 12:48:23,555][1651596] Signal inference workers to stop experience collection... (5150 times) [2024-06-15 12:48:23,588][1653645] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-06-15 12:48:23,797][1651596] Signal inference workers to resume experience collection... (5150 times) [2024-06-15 12:48:23,798][1653645] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-06-15 12:48:23,943][1653645] Updated weights for policy 0, policy_version 99670 (0.0033) [2024-06-15 12:48:25,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 204210176. Throughput: 0: 11207.2. Samples: 51115008. Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:48:26,184][1653645] Updated weights for policy 0, policy_version 99728 (0.0012) [2024-06-15 12:48:27,372][1653645] Updated weights for policy 0, policy_version 99792 (0.0116) [2024-06-15 12:48:30,958][1648982] Fps is (10 sec: 52500.4, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 204472320. Throughput: 0: 11343.6. Samples: 51183616. 
Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:30,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:48:33,737][1653645] Updated weights for policy 0, policy_version 99844 (0.0033) [2024-06-15 12:48:35,203][1653645] Updated weights for policy 0, policy_version 99904 (0.0150) [2024-06-15 12:48:35,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44240.3, 300 sec: 44542.3). Total num frames: 204636160. Throughput: 0: 11457.4. Samples: 51224064. Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:48:36,744][1653645] Updated weights for policy 0, policy_version 99968 (0.0014) [2024-06-15 12:48:39,407][1653645] Updated weights for policy 0, policy_version 100048 (0.0011) [2024-06-15 12:48:40,958][1648982] Fps is (10 sec: 52426.4, 60 sec: 45874.9, 300 sec: 44875.4). Total num frames: 204996608. Throughput: 0: 11389.0. Samples: 51279360. Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:40,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 12:48:45,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 204996608. Throughput: 0: 11298.1. Samples: 51348992. Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 12:48:46,223][1653645] Updated weights for policy 0, policy_version 100103 (0.0024) [2024-06-15 12:48:48,182][1653645] Updated weights for policy 0, policy_version 100176 (0.0030) [2024-06-15 12:48:48,974][1653645] Updated weights for policy 0, policy_version 100222 (0.0012) [2024-06-15 12:48:50,958][1648982] Fps is (10 sec: 36047.3, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 205357056. Throughput: 0: 11195.7. Samples: 51378688. 
Policy #0 lag: (min: 13.0, avg: 122.2, max: 269.0) [2024-06-15 12:48:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:48:51,272][1653645] Updated weights for policy 0, policy_version 100288 (0.0084) [2024-06-15 12:48:55,958][1648982] Fps is (10 sec: 52424.7, 60 sec: 43690.1, 300 sec: 44208.9). Total num frames: 205520896. Throughput: 0: 11115.9. Samples: 51440128. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:48:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:48:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000100352_205520896.pth... [2024-06-15 12:48:56,031][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000095168_194904064.pth [2024-06-15 12:48:58,253][1653645] Updated weights for policy 0, policy_version 100354 (0.0015) [2024-06-15 12:48:59,299][1653645] Updated weights for policy 0, policy_version 100407 (0.0025) [2024-06-15 12:49:00,058][1653645] Updated weights for policy 0, policy_version 100435 (0.0012) [2024-06-15 12:49:00,785][1653645] Updated weights for policy 0, policy_version 100478 (0.0085) [2024-06-15 12:49:00,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 44653.4). Total num frames: 205783040. Throughput: 0: 11184.4. Samples: 51513856. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:00,960][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 12:49:02,911][1653645] Updated weights for policy 0, policy_version 100529 (0.0017) [2024-06-15 12:49:04,118][1651596] Signal inference workers to stop experience collection... (5200 times) [2024-06-15 12:49:04,164][1653645] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-06-15 12:49:04,217][1653645] Updated weights for policy 0, policy_version 100583 (0.0014) [2024-06-15 12:49:04,350][1651596] Signal inference workers to resume experience collection... 
(5200 times) [2024-06-15 12:49:04,367][1653645] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-06-15 12:49:05,958][1648982] Fps is (10 sec: 52433.0, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 206045184. Throughput: 0: 11176.4. Samples: 51546112. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:05,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 12:49:09,882][1653645] Updated weights for policy 0, policy_version 100624 (0.0011) [2024-06-15 12:49:10,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 206176256. Throughput: 0: 11161.6. Samples: 51617280. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:10,958][1648982] Avg episode reward: [(0, '37.000')] [2024-06-15 12:49:11,830][1653645] Updated weights for policy 0, policy_version 100673 (0.0013) [2024-06-15 12:49:13,569][1653645] Updated weights for policy 0, policy_version 100752 (0.0014) [2024-06-15 12:49:15,495][1653645] Updated weights for policy 0, policy_version 100819 (0.0014) [2024-06-15 12:49:15,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 44653.4). Total num frames: 206503936. Throughput: 0: 10865.8. Samples: 51672576. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:15,958][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 12:49:20,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43700.7, 300 sec: 43986.9). Total num frames: 206569472. Throughput: 0: 10717.8. Samples: 51706368. 
Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:20,959][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 12:49:22,394][1653645] Updated weights for policy 0, policy_version 100897 (0.0013) [2024-06-15 12:49:23,692][1653645] Updated weights for policy 0, policy_version 100947 (0.0012) [2024-06-15 12:49:24,655][1653645] Updated weights for policy 0, policy_version 100987 (0.0010) [2024-06-15 12:49:25,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 206897152. Throughput: 0: 11082.1. Samples: 51778048. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:25,959][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 12:49:26,965][1653645] Updated weights for policy 0, policy_version 101072 (0.0039) [2024-06-15 12:49:28,250][1653645] Updated weights for policy 0, policy_version 101117 (0.0060) [2024-06-15 12:49:30,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.9, 300 sec: 44098.0). Total num frames: 207093760. Throughput: 0: 10922.7. Samples: 51840512. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:30,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 12:49:35,386][1653645] Updated weights for policy 0, policy_version 101185 (0.0014) [2024-06-15 12:49:35,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 207257600. Throughput: 0: 11093.3. Samples: 51877888. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:35,958][1648982] Avg episode reward: [(0, '35.940')] [2024-06-15 12:49:37,540][1653645] Updated weights for policy 0, policy_version 101251 (0.0076) [2024-06-15 12:49:39,269][1653645] Updated weights for policy 0, policy_version 101316 (0.0012) [2024-06-15 12:49:40,686][1653645] Updated weights for policy 0, policy_version 101375 (0.0024) [2024-06-15 12:49:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43691.2, 300 sec: 44542.3). Total num frames: 207618048. 
Throughput: 0: 11025.3. Samples: 51936256. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:40,958][1648982] Avg episode reward: [(0, '36.260')] [2024-06-15 12:49:45,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 207618048. Throughput: 0: 10968.1. Samples: 52007424. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:45,959][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 12:49:47,090][1653645] Updated weights for policy 0, policy_version 101440 (0.0013) [2024-06-15 12:49:48,380][1653645] Updated weights for policy 0, policy_version 101495 (0.0016) [2024-06-15 12:49:50,595][1651596] Signal inference workers to stop experience collection... (5250 times) [2024-06-15 12:49:50,624][1653645] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-06-15 12:49:50,899][1651596] Signal inference workers to resume experience collection... (5250 times) [2024-06-15 12:49:50,900][1653645] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-06-15 12:49:50,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.6, 300 sec: 44320.2). Total num frames: 207978496. Throughput: 0: 10888.5. Samples: 52036096. Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:50,958][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 12:49:51,065][1653645] Updated weights for policy 0, policy_version 101559 (0.0011) [2024-06-15 12:49:52,396][1653645] Updated weights for policy 0, policy_version 101606 (0.0011) [2024-06-15 12:49:52,936][1653645] Updated weights for policy 0, policy_version 101632 (0.0011) [2024-06-15 12:49:55,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43691.2, 300 sec: 43988.2). Total num frames: 208142336. Throughput: 0: 10683.7. Samples: 52098048. 
Policy #0 lag: (min: 127.0, avg: 237.8, max: 351.0) [2024-06-15 12:49:55,958][1648982] Avg episode reward: [(0, '36.870')] [2024-06-15 12:49:58,707][1653645] Updated weights for policy 0, policy_version 101690 (0.0014) [2024-06-15 12:50:00,645][1653645] Updated weights for policy 0, policy_version 101751 (0.0014) [2024-06-15 12:50:00,970][1648982] Fps is (10 sec: 42544.8, 60 sec: 43681.5, 300 sec: 43985.0). Total num frames: 208404480. Throughput: 0: 10999.2. Samples: 52167680. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:00,971][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 12:50:02,045][1653645] Updated weights for policy 0, policy_version 101793 (0.0014) [2024-06-15 12:50:03,946][1653645] Updated weights for policy 0, policy_version 101831 (0.0024) [2024-06-15 12:50:05,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 208666624. Throughput: 0: 10945.4. Samples: 52198912. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:05,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 12:50:09,780][1653645] Updated weights for policy 0, policy_version 101890 (0.0012) [2024-06-15 12:50:10,958][1648982] Fps is (10 sec: 36090.4, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 208764928. Throughput: 0: 10911.3. Samples: 52269056. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:50:11,185][1653645] Updated weights for policy 0, policy_version 101952 (0.0012) [2024-06-15 12:50:12,924][1653645] Updated weights for policy 0, policy_version 102007 (0.0012) [2024-06-15 12:50:13,997][1653645] Updated weights for policy 0, policy_version 102037 (0.0012) [2024-06-15 12:50:15,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 209059840. Throughput: 0: 10683.7. Samples: 52321280. 
Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:15,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:50:16,785][1653645] Updated weights for policy 0, policy_version 102088 (0.0013) [2024-06-15 12:50:17,664][1653645] Updated weights for policy 0, policy_version 102144 (0.0088) [2024-06-15 12:50:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 209190912. Throughput: 0: 10661.0. Samples: 52357632. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:20,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:50:23,243][1653645] Updated weights for policy 0, policy_version 102192 (0.0012) [2024-06-15 12:50:24,784][1653645] Updated weights for policy 0, policy_version 102256 (0.0014) [2024-06-15 12:50:25,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 209485824. Throughput: 0: 11002.3. Samples: 52431360. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:25,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:50:26,219][1653645] Updated weights for policy 0, policy_version 102308 (0.0013) [2024-06-15 12:50:28,268][1653645] Updated weights for policy 0, policy_version 102339 (0.0012) [2024-06-15 12:50:29,327][1653645] Updated weights for policy 0, policy_version 102396 (0.0013) [2024-06-15 12:50:30,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 209715200. Throughput: 0: 10877.2. Samples: 52496896. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:50:35,107][1653645] Updated weights for policy 0, policy_version 102464 (0.0017) [2024-06-15 12:50:35,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 209911808. Throughput: 0: 11275.4. Samples: 52543488. 
Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:35,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:50:36,040][1651596] Signal inference workers to stop experience collection... (5300 times) [2024-06-15 12:50:36,109][1653645] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-06-15 12:50:36,372][1651596] Signal inference workers to resume experience collection... (5300 times) [2024-06-15 12:50:36,372][1653645] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-06-15 12:50:36,804][1653645] Updated weights for policy 0, policy_version 102528 (0.0011) [2024-06-15 12:50:40,276][1653645] Updated weights for policy 0, policy_version 102608 (0.0021) [2024-06-15 12:50:40,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 210173952. Throughput: 0: 10990.9. Samples: 52592640. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:40,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:50:41,379][1653645] Updated weights for policy 0, policy_version 102652 (0.0012) [2024-06-15 12:50:45,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 43690.9, 300 sec: 43542.6). Total num frames: 210239488. Throughput: 0: 11187.5. Samples: 52670976. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:45,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:50:47,288][1653645] Updated weights for policy 0, policy_version 102720 (0.0036) [2024-06-15 12:50:49,195][1653645] Updated weights for policy 0, policy_version 102802 (0.0104) [2024-06-15 12:50:50,068][1653645] Updated weights for policy 0, policy_version 102848 (0.0014) [2024-06-15 12:50:50,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 43988.7). Total num frames: 210632704. Throughput: 0: 10945.4. Samples: 52691456. 
Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:50,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 12:50:53,275][1653645] Updated weights for policy 0, policy_version 102907 (0.0015) [2024-06-15 12:50:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 210763776. Throughput: 0: 10934.0. Samples: 52761088. Policy #0 lag: (min: 15.0, avg: 88.9, max: 271.0) [2024-06-15 12:50:55,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 12:50:55,973][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000102912_210763776.pth... [2024-06-15 12:50:56,024][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000097792_200278016.pth [2024-06-15 12:50:56,030][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000102912_210763776.pth [2024-06-15 12:50:59,288][1653645] Updated weights for policy 0, policy_version 102982 (0.0013) [2024-06-15 12:51:00,657][1653645] Updated weights for policy 0, policy_version 103040 (0.0013) [2024-06-15 12:51:00,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43699.9, 300 sec: 43986.9). Total num frames: 211025920. Throughput: 0: 11275.4. Samples: 52828672. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:00,968][1648982] Avg episode reward: [(0, '36.800')] [2024-06-15 12:51:02,024][1653645] Updated weights for policy 0, policy_version 103095 (0.0012) [2024-06-15 12:51:04,543][1653645] Updated weights for policy 0, policy_version 103162 (0.0028) [2024-06-15 12:51:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 211288064. Throughput: 0: 11195.7. Samples: 52861440. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:05,958][1648982] Avg episode reward: [(0, '36.860')] [2024-06-15 12:51:10,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 44236.5, 300 sec: 43653.6). 
Total num frames: 211419136. Throughput: 0: 11286.7. Samples: 52939264. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:10,959][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 12:51:11,264][1653645] Updated weights for policy 0, policy_version 103248 (0.0015) [2024-06-15 12:51:13,063][1653645] Updated weights for policy 0, policy_version 103315 (0.0011) [2024-06-15 12:51:15,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 211681280. Throughput: 0: 10968.2. Samples: 52990464. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:15,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:51:16,606][1653645] Updated weights for policy 0, policy_version 103392 (0.0012) [2024-06-15 12:51:20,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 211812352. Throughput: 0: 10649.5. Samples: 53022720. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:51:21,919][1651596] Signal inference workers to stop experience collection... (5350 times) [2024-06-15 12:51:22,012][1653645] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-06-15 12:51:22,247][1651596] Signal inference workers to resume experience collection... (5350 times) [2024-06-15 12:51:22,249][1653645] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-06-15 12:51:22,700][1653645] Updated weights for policy 0, policy_version 103456 (0.0043) [2024-06-15 12:51:24,235][1653645] Updated weights for policy 0, policy_version 103529 (0.0015) [2024-06-15 12:51:25,867][1653645] Updated weights for policy 0, policy_version 103616 (0.0014) [2024-06-15 12:51:25,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.2, 300 sec: 44209.0). Total num frames: 212205568. Throughput: 0: 11070.6. Samples: 53090816. 
Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:25,960][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:51:29,179][1653645] Updated weights for policy 0, policy_version 103678 (0.0012) [2024-06-15 12:51:30,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 212336640. Throughput: 0: 10797.5. Samples: 53156864. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:30,958][1648982] Avg episode reward: [(0, '36.230')] [2024-06-15 12:51:35,246][1653645] Updated weights for policy 0, policy_version 103761 (0.0012) [2024-06-15 12:51:35,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 212566016. Throughput: 0: 11229.9. Samples: 53196800. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:35,958][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 12:51:36,412][1653645] Updated weights for policy 0, policy_version 103814 (0.0011) [2024-06-15 12:51:37,524][1653645] Updated weights for policy 0, policy_version 103867 (0.0014) [2024-06-15 12:51:40,947][1653645] Updated weights for policy 0, policy_version 103907 (0.0012) [2024-06-15 12:51:40,958][1648982] Fps is (10 sec: 45872.5, 60 sec: 43690.2, 300 sec: 43986.8). Total num frames: 212795392. Throughput: 0: 11036.3. Samples: 53257728. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:40,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 12:51:41,513][1653645] Updated weights for policy 0, policy_version 103936 (0.0010) [2024-06-15 12:51:45,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 45328.8, 300 sec: 43875.7). Total num frames: 212959232. Throughput: 0: 11150.1. Samples: 53330432. 
Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:45,959][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 12:51:47,378][1653645] Updated weights for policy 0, policy_version 104050 (0.0107) [2024-06-15 12:51:48,760][1653645] Updated weights for policy 0, policy_version 104112 (0.0012) [2024-06-15 12:51:50,958][1648982] Fps is (10 sec: 45878.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 213254144. Throughput: 0: 10979.6. Samples: 53355520. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:50,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:51:52,792][1653645] Updated weights for policy 0, policy_version 104161 (0.0011) [2024-06-15 12:51:55,959][1648982] Fps is (10 sec: 42595.6, 60 sec: 43690.0, 300 sec: 43764.6). Total num frames: 213385216. Throughput: 0: 10842.9. Samples: 53427200. Policy #0 lag: (min: 31.0, avg: 95.0, max: 287.0) [2024-06-15 12:51:55,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:51:56,883][1653645] Updated weights for policy 0, policy_version 104208 (0.0012) [2024-06-15 12:51:58,221][1653645] Updated weights for policy 0, policy_version 104272 (0.0104) [2024-06-15 12:51:59,566][1651596] Signal inference workers to stop experience collection... (5400 times) [2024-06-15 12:51:59,587][1653645] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-06-15 12:51:59,728][1651596] Signal inference workers to resume experience collection... (5400 times) [2024-06-15 12:51:59,729][1653645] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-06-15 12:51:59,730][1653645] Updated weights for policy 0, policy_version 104336 (0.0150) [2024-06-15 12:52:00,797][1653645] Updated weights for policy 0, policy_version 104380 (0.0011) [2024-06-15 12:52:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.3, 300 sec: 44320.1). Total num frames: 213778432. Throughput: 0: 11059.2. Samples: 53488128. 
Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:00,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 12:52:04,834][1653645] Updated weights for policy 0, policy_version 104432 (0.0012) [2024-06-15 12:52:05,958][1648982] Fps is (10 sec: 52433.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 213909504. Throughput: 0: 11207.1. Samples: 53527040. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:52:09,813][1653645] Updated weights for policy 0, policy_version 104512 (0.0013) [2024-06-15 12:52:10,958][1648982] Fps is (10 sec: 32767.2, 60 sec: 44783.1, 300 sec: 44098.4). Total num frames: 214106112. Throughput: 0: 11184.3. Samples: 53594112. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:52:11,378][1653645] Updated weights for policy 0, policy_version 104561 (0.0014) [2024-06-15 12:52:13,134][1653645] Updated weights for policy 0, policy_version 104624 (0.0012) [2024-06-15 12:52:15,850][1653645] Updated weights for policy 0, policy_version 104656 (0.0013) [2024-06-15 12:52:15,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 214335488. Throughput: 0: 11104.7. Samples: 53656576. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:15,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:52:20,343][1653645] Updated weights for policy 0, policy_version 104707 (0.0014) [2024-06-15 12:52:20,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 44783.1, 300 sec: 43764.7). Total num frames: 214499328. Throughput: 0: 10979.6. Samples: 53690880. 
Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 12:52:21,551][1653645] Updated weights for policy 0, policy_version 104758 (0.0071) [2024-06-15 12:52:23,185][1653645] Updated weights for policy 0, policy_version 104827 (0.0016) [2024-06-15 12:52:24,566][1653645] Updated weights for policy 0, policy_version 104866 (0.0012) [2024-06-15 12:52:25,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 214827008. Throughput: 0: 10979.7. Samples: 53751808. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:25,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:52:27,717][1653645] Updated weights for policy 0, policy_version 104912 (0.0014) [2024-06-15 12:52:28,868][1653645] Updated weights for policy 0, policy_version 104960 (0.0027) [2024-06-15 12:52:30,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 43987.6). Total num frames: 214958080. Throughput: 0: 10865.9. Samples: 53819392. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:52:33,882][1653645] Updated weights for policy 0, policy_version 105009 (0.0024) [2024-06-15 12:52:35,497][1653645] Updated weights for policy 0, policy_version 105075 (0.0013) [2024-06-15 12:52:35,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 215220224. Throughput: 0: 11059.2. Samples: 53853184. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:52:36,551][1653645] Updated weights for policy 0, policy_version 105107 (0.0016) [2024-06-15 12:52:40,314][1653645] Updated weights for policy 0, policy_version 105184 (0.0026) [2024-06-15 12:52:40,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44783.3, 300 sec: 44431.2). Total num frames: 215482368. 
Throughput: 0: 10888.7. Samples: 53917184. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:40,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:52:44,866][1651596] Signal inference workers to stop experience collection... (5450 times) [2024-06-15 12:52:44,917][1653645] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-06-15 12:52:45,221][1651596] Signal inference workers to resume experience collection... (5450 times) [2024-06-15 12:52:45,222][1653645] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-06-15 12:52:45,415][1653645] Updated weights for policy 0, policy_version 105265 (0.0013) [2024-06-15 12:52:45,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 44237.1, 300 sec: 43986.9). Total num frames: 215613440. Throughput: 0: 11025.1. Samples: 53984256. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:52:47,128][1653645] Updated weights for policy 0, policy_version 105333 (0.0065) [2024-06-15 12:52:48,670][1653645] Updated weights for policy 0, policy_version 105376 (0.0036) [2024-06-15 12:52:50,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 215875584. Throughput: 0: 10729.3. Samples: 54009856. Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:52:53,221][1653645] Updated weights for policy 0, policy_version 105444 (0.0014) [2024-06-15 12:52:55,958][1648982] Fps is (10 sec: 39319.9, 60 sec: 43691.1, 300 sec: 43986.8). Total num frames: 216006656. Throughput: 0: 10763.3. Samples: 54078464. 
Policy #0 lag: (min: 94.0, avg: 162.3, max: 354.0) [2024-06-15 12:52:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:52:56,472][1653645] Updated weights for policy 0, policy_version 105504 (0.0016) [2024-06-15 12:52:56,474][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000105504_216072192.pth... [2024-06-15 12:52:56,623][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000100352_205520896.pth [2024-06-15 12:52:58,669][1653645] Updated weights for policy 0, policy_version 105569 (0.0012) [2024-06-15 12:53:00,580][1653645] Updated weights for policy 0, policy_version 105632 (0.0030) [2024-06-15 12:53:00,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 43986.9). Total num frames: 216334336. Throughput: 0: 10808.9. Samples: 54142976. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:00,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:53:04,668][1653645] Updated weights for policy 0, policy_version 105699 (0.0014) [2024-06-15 12:53:05,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 216530944. Throughput: 0: 10831.6. Samples: 54178304. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:05,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 12:53:07,874][1653645] Updated weights for policy 0, policy_version 105764 (0.0097) [2024-06-15 12:53:10,084][1653645] Updated weights for policy 0, policy_version 105812 (0.0013) [2024-06-15 12:53:10,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44237.0, 300 sec: 44098.0). Total num frames: 216760320. Throughput: 0: 11070.6. Samples: 54249984. 
Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:53:10,970][1653645] Updated weights for policy 0, policy_version 105853 (0.0020) [2024-06-15 12:53:15,128][1653645] Updated weights for policy 0, policy_version 105923 (0.0018) [2024-06-15 12:53:15,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44236.9, 300 sec: 44211.1). Total num frames: 216989696. Throughput: 0: 11070.6. Samples: 54317568. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:53:16,194][1653645] Updated weights for policy 0, policy_version 105980 (0.0044) [2024-06-15 12:53:19,178][1653645] Updated weights for policy 0, policy_version 106041 (0.0022) [2024-06-15 12:53:20,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 217186304. Throughput: 0: 11104.7. Samples: 54352896. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:53:22,075][1653645] Updated weights for policy 0, policy_version 106096 (0.0016) [2024-06-15 12:53:23,580][1653645] Updated weights for policy 0, policy_version 106131 (0.0042) [2024-06-15 12:53:25,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 217448448. Throughput: 0: 11138.8. Samples: 54418432. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:25,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:53:27,052][1653645] Updated weights for policy 0, policy_version 106192 (0.0011) [2024-06-15 12:53:28,025][1653645] Updated weights for policy 0, policy_version 106232 (0.0013) [2024-06-15 12:53:30,835][1653645] Updated weights for policy 0, policy_version 106288 (0.0013) [2024-06-15 12:53:30,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 45329.2, 300 sec: 44209.0). Total num frames: 217677824. 
Throughput: 0: 11207.1. Samples: 54488576. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:53:32,306][1651596] Signal inference workers to stop experience collection... (5500 times) [2024-06-15 12:53:32,387][1653645] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-06-15 12:53:32,594][1651596] Signal inference workers to resume experience collection... (5500 times) [2024-06-15 12:53:32,595][1653645] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-06-15 12:53:32,794][1653645] Updated weights for policy 0, policy_version 106321 (0.0012) [2024-06-15 12:53:34,472][1653645] Updated weights for policy 0, policy_version 106384 (0.0011) [2024-06-15 12:53:35,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45875.2, 300 sec: 43987.0). Total num frames: 217972736. Throughput: 0: 11446.0. Samples: 54524928. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:53:39,318][1653645] Updated weights for policy 0, policy_version 106452 (0.0123) [2024-06-15 12:53:40,958][1648982] Fps is (10 sec: 42596.6, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 218103808. Throughput: 0: 11332.3. Samples: 54588416. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:53:41,374][1653645] Updated weights for policy 0, policy_version 106501 (0.0014) [2024-06-15 12:53:42,747][1653645] Updated weights for policy 0, policy_version 106552 (0.0012) [2024-06-15 12:53:45,408][1653645] Updated weights for policy 0, policy_version 106608 (0.0013) [2024-06-15 12:53:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45875.1, 300 sec: 44097.9). Total num frames: 218365952. Throughput: 0: 11468.8. Samples: 54659072. 
Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 12:53:46,637][1653645] Updated weights for policy 0, policy_version 106643 (0.0017) [2024-06-15 12:53:50,275][1653645] Updated weights for policy 0, policy_version 106706 (0.0012) [2024-06-15 12:53:50,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 45329.0, 300 sec: 44320.2). Total num frames: 218595328. Throughput: 0: 11514.3. Samples: 54696448. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:53:52,532][1653645] Updated weights for policy 0, policy_version 106768 (0.0012) [2024-06-15 12:53:53,792][1653645] Updated weights for policy 0, policy_version 106816 (0.0011) [2024-06-15 12:53:55,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 46967.7, 300 sec: 44209.0). Total num frames: 218824704. Throughput: 0: 11320.9. Samples: 54759424. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:53:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:53:56,430][1653645] Updated weights for policy 0, policy_version 106876 (0.0015) [2024-06-15 12:54:00,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 219021312. Throughput: 0: 11389.2. Samples: 54830080. Policy #0 lag: (min: 5.0, avg: 148.0, max: 261.0) [2024-06-15 12:54:00,962][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:54:01,281][1653645] Updated weights for policy 0, policy_version 106946 (0.0013) [2024-06-15 12:54:04,524][1653645] Updated weights for policy 0, policy_version 107027 (0.0013) [2024-06-15 12:54:05,552][1653645] Updated weights for policy 0, policy_version 107071 (0.0104) [2024-06-15 12:54:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 219283456. Throughput: 0: 11298.1. Samples: 54861312. 
Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 12:54:08,072][1653645] Updated weights for policy 0, policy_version 107120 (0.0013) [2024-06-15 12:54:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 219480064. Throughput: 0: 11423.3. Samples: 54932480. Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:54:11,124][1653645] Updated weights for policy 0, policy_version 107184 (0.0013) [2024-06-15 12:54:13,892][1653645] Updated weights for policy 0, policy_version 107236 (0.0013) [2024-06-15 12:54:15,845][1653645] Updated weights for policy 0, policy_version 107296 (0.0012) [2024-06-15 12:54:15,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 44653.4). Total num frames: 219742208. Throughput: 0: 11309.5. Samples: 54997504. Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:15,991][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 12:54:18,362][1653645] Updated weights for policy 0, policy_version 107331 (0.0015) [2024-06-15 12:54:19,115][1651596] Signal inference workers to stop experience collection... (5550 times) [2024-06-15 12:54:19,165][1653645] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-06-15 12:54:19,288][1651596] Signal inference workers to resume experience collection... (5550 times) [2024-06-15 12:54:19,288][1653645] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-06-15 12:54:19,479][1653645] Updated weights for policy 0, policy_version 107384 (0.0013) [2024-06-15 12:54:20,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 45875.2, 300 sec: 44209.1). Total num frames: 219938816. Throughput: 0: 11343.6. Samples: 55035392. 
Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 12:54:22,947][1653645] Updated weights for policy 0, policy_version 107440 (0.0013) [2024-06-15 12:54:25,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 220168192. Throughput: 0: 11423.3. Samples: 55102464. Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:25,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 12:54:26,010][1653645] Updated weights for policy 0, policy_version 107512 (0.0086) [2024-06-15 12:54:27,745][1653645] Updated weights for policy 0, policy_version 107574 (0.0018) [2024-06-15 12:54:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 46421.2, 300 sec: 44764.4). Total num frames: 220463104. Throughput: 0: 11309.5. Samples: 55168000. Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:30,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 12:54:33,846][1653645] Updated weights for policy 0, policy_version 107649 (0.0015) [2024-06-15 12:54:35,106][1653645] Updated weights for policy 0, policy_version 107700 (0.0016) [2024-06-15 12:54:35,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 220594176. Throughput: 0: 11252.6. Samples: 55202816. Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 12:54:37,776][1653645] Updated weights for policy 0, policy_version 107760 (0.0015) [2024-06-15 12:54:39,405][1653645] Updated weights for policy 0, policy_version 107811 (0.0027) [2024-06-15 12:54:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 220856320. Throughput: 0: 11161.6. Samples: 55261696. 
Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:54:42,932][1653645] Updated weights for policy 0, policy_version 107874 (0.0051) [2024-06-15 12:54:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 220987392. Throughput: 0: 11093.3. Samples: 55329280. Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:54:47,375][1653645] Updated weights for policy 0, policy_version 107952 (0.0014) [2024-06-15 12:54:50,204][1653645] Updated weights for policy 0, policy_version 108000 (0.0090) [2024-06-15 12:54:50,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 221249536. Throughput: 0: 10979.6. Samples: 55355392. Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:50,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:54:51,642][1653645] Updated weights for policy 0, policy_version 108052 (0.0039) [2024-06-15 12:54:54,447][1653645] Updated weights for policy 0, policy_version 108097 (0.0014) [2024-06-15 12:54:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44782.9, 300 sec: 44433.1). Total num frames: 221511680. Throughput: 0: 11025.1. Samples: 55428608. Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:54:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:54:55,971][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000108160_221511680.pth... [2024-06-15 12:54:56,028][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000102912_210763776.pth [2024-06-15 12:54:57,518][1653645] Updated weights for policy 0, policy_version 108176 (0.0023) [2024-06-15 12:55:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 221642752. Throughput: 0: 11013.7. Samples: 55493120. 
Policy #0 lag: (min: 2.0, avg: 114.2, max: 258.0) [2024-06-15 12:55:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:55:01,504][1653645] Updated weights for policy 0, policy_version 108240 (0.0039) [2024-06-15 12:55:03,319][1653645] Updated weights for policy 0, policy_version 108304 (0.0013) [2024-06-15 12:55:05,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 221904896. Throughput: 0: 10843.0. Samples: 55523328. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 12:55:06,455][1653645] Updated weights for policy 0, policy_version 108358 (0.0016) [2024-06-15 12:55:07,612][1653645] Updated weights for policy 0, policy_version 108415 (0.0011) [2024-06-15 12:55:08,900][1651596] Signal inference workers to stop experience collection... (5600 times) [2024-06-15 12:55:08,967][1653645] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-06-15 12:55:09,161][1651596] Signal inference workers to resume experience collection... (5600 times) [2024-06-15 12:55:09,162][1653645] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-06-15 12:55:10,167][1653645] Updated weights for policy 0, policy_version 108477 (0.0012) [2024-06-15 12:55:10,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 222167040. Throughput: 0: 11013.7. Samples: 55598080. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:55:13,851][1653645] Updated weights for policy 0, policy_version 108533 (0.0123) [2024-06-15 12:55:14,929][1653645] Updated weights for policy 0, policy_version 108576 (0.0011) [2024-06-15 12:55:15,852][1653645] Updated weights for policy 0, policy_version 108608 (0.0012) [2024-06-15 12:55:15,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 44782.8, 300 sec: 44875.5). 
Total num frames: 222429184. Throughput: 0: 10945.4. Samples: 55660544. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:15,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:55:19,134][1653645] Updated weights for policy 0, policy_version 108672 (0.0030) [2024-06-15 12:55:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 222593024. Throughput: 0: 10956.8. Samples: 55695872. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:20,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 12:55:21,730][1653645] Updated weights for policy 0, policy_version 108730 (0.0012) [2024-06-15 12:55:25,173][1653645] Updated weights for policy 0, policy_version 108768 (0.0028) [2024-06-15 12:55:25,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 222789632. Throughput: 0: 11241.2. Samples: 55767552. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:25,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:55:26,399][1653645] Updated weights for policy 0, policy_version 108816 (0.0096) [2024-06-15 12:55:29,479][1653645] Updated weights for policy 0, policy_version 108880 (0.0014) [2024-06-15 12:55:30,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 223084544. Throughput: 0: 11127.5. Samples: 55830016. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:55:32,451][1653645] Updated weights for policy 0, policy_version 108935 (0.0030) [2024-06-15 12:55:33,608][1653645] Updated weights for policy 0, policy_version 108988 (0.0025) [2024-06-15 12:55:35,959][1648982] Fps is (10 sec: 42599.7, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 223215616. Throughput: 0: 11286.7. Samples: 55863296. 
Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:35,960][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 12:55:38,105][1653645] Updated weights for policy 0, policy_version 109056 (0.0013) [2024-06-15 12:55:40,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 223477760. Throughput: 0: 11002.3. Samples: 55923712. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:40,958][1648982] Avg episode reward: [(0, '37.550')] [2024-06-15 12:55:40,964][1651596] Saving new best policy, reward=37.550! [2024-06-15 12:55:41,551][1653645] Updated weights for policy 0, policy_version 109136 (0.0017) [2024-06-15 12:55:44,888][1653645] Updated weights for policy 0, policy_version 109188 (0.0014) [2024-06-15 12:55:45,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 223707136. Throughput: 0: 11150.2. Samples: 55994880. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:55:45,976][1653645] Updated weights for policy 0, policy_version 109239 (0.0013) [2024-06-15 12:55:49,805][1653645] Updated weights for policy 0, policy_version 109299 (0.0012) [2024-06-15 12:55:50,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 44782.7, 300 sec: 44653.3). Total num frames: 223936512. Throughput: 0: 11320.8. Samples: 56032768. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:55:51,367][1653645] Updated weights for policy 0, policy_version 109372 (0.0012) [2024-06-15 12:55:53,459][1651596] Signal inference workers to stop experience collection... (5650 times) [2024-06-15 12:55:53,569][1653645] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-06-15 12:55:53,755][1651596] Signal inference workers to resume experience collection... 
(5650 times) [2024-06-15 12:55:53,756][1653645] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-06-15 12:55:54,318][1653645] Updated weights for policy 0, policy_version 109429 (0.0014) [2024-06-15 12:55:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 224133120. Throughput: 0: 10990.9. Samples: 56092672. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:55:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:55:56,952][1653645] Updated weights for policy 0, policy_version 109460 (0.0011) [2024-06-15 12:56:00,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44236.6, 300 sec: 44097.9). Total num frames: 224296960. Throughput: 0: 11127.4. Samples: 56161280. Policy #0 lag: (min: 45.0, avg: 148.1, max: 301.0) [2024-06-15 12:56:00,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:56:01,001][1653645] Updated weights for policy 0, policy_version 109529 (0.0014) [2024-06-15 12:56:02,540][1653645] Updated weights for policy 0, policy_version 109600 (0.0013) [2024-06-15 12:56:03,209][1653645] Updated weights for policy 0, policy_version 109627 (0.0009) [2024-06-15 12:56:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 224591872. Throughput: 0: 11002.3. Samples: 56190976. Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:56:06,364][1653645] Updated weights for policy 0, policy_version 109686 (0.0105) [2024-06-15 12:56:09,202][1653645] Updated weights for policy 0, policy_version 109729 (0.0013) [2024-06-15 12:56:10,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 224788480. Throughput: 0: 10900.0. Samples: 56258048. 
Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:10,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:56:13,075][1653645] Updated weights for policy 0, policy_version 109780 (0.0012) [2024-06-15 12:56:15,335][1653645] Updated weights for policy 0, policy_version 109872 (0.0012) [2024-06-15 12:56:15,958][1648982] Fps is (10 sec: 45873.3, 60 sec: 43690.4, 300 sec: 44875.5). Total num frames: 225050624. Throughput: 0: 10899.8. Samples: 56320512. Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:15,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:56:17,349][1653645] Updated weights for policy 0, policy_version 109893 (0.0019) [2024-06-15 12:56:18,515][1653645] Updated weights for policy 0, policy_version 109952 (0.0015) [2024-06-15 12:56:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 225214464. Throughput: 0: 10945.4. Samples: 56355840. Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:20,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 12:56:25,346][1653645] Updated weights for policy 0, policy_version 110048 (0.0014) [2024-06-15 12:56:25,958][1648982] Fps is (10 sec: 36046.4, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 225411072. Throughput: 0: 11104.7. Samples: 56423424. Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:56:26,697][1653645] Updated weights for policy 0, policy_version 110096 (0.0012) [2024-06-15 12:56:29,570][1653645] Updated weights for policy 0, policy_version 110148 (0.0014) [2024-06-15 12:56:30,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 225673216. Throughput: 0: 10831.6. Samples: 56482304. 
Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 12:56:33,313][1653645] Updated weights for policy 0, policy_version 110214 (0.0162) [2024-06-15 12:56:34,310][1653645] Updated weights for policy 0, policy_version 110272 (0.0141) [2024-06-15 12:56:35,959][1648982] Fps is (10 sec: 42595.3, 60 sec: 43690.2, 300 sec: 44209.0). Total num frames: 225837056. Throughput: 0: 10820.2. Samples: 56519680. Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:35,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 12:56:38,116][1653645] Updated weights for policy 0, policy_version 110336 (0.0011) [2024-06-15 12:56:38,965][1651596] Signal inference workers to stop experience collection... (5700 times) [2024-06-15 12:56:39,014][1653645] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-06-15 12:56:39,140][1651596] Signal inference workers to resume experience collection... (5700 times) [2024-06-15 12:56:39,141][1653645] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-06-15 12:56:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 226099200. Throughput: 0: 10854.4. Samples: 56581120. Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:56:42,148][1653645] Updated weights for policy 0, policy_version 110432 (0.0019) [2024-06-15 12:56:45,713][1653645] Updated weights for policy 0, policy_version 110483 (0.0014) [2024-06-15 12:56:45,958][1648982] Fps is (10 sec: 45878.5, 60 sec: 43144.6, 300 sec: 44209.0). Total num frames: 226295808. Throughput: 0: 10854.5. Samples: 56649728. 
Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:56:48,516][1653645] Updated weights for policy 0, policy_version 110533 (0.0012) [2024-06-15 12:56:50,293][1653645] Updated weights for policy 0, policy_version 110608 (0.0025) [2024-06-15 12:56:50,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.9, 300 sec: 44653.5). Total num frames: 226557952. Throughput: 0: 11013.7. Samples: 56686592. Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:56:51,454][1653645] Updated weights for policy 0, policy_version 110656 (0.0012) [2024-06-15 12:56:55,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 226754560. Throughput: 0: 10808.9. Samples: 56744448. Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:56:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:56:55,987][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000110720_226754560.pth... [2024-06-15 12:56:56,032][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000105504_216072192.pth [2024-06-15 12:56:58,036][1653645] Updated weights for policy 0, policy_version 110752 (0.0033) [2024-06-15 12:57:00,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 226918400. Throughput: 0: 11025.2. Samples: 56816640. 
Policy #0 lag: (min: 95.0, avg: 202.3, max: 367.0) [2024-06-15 12:57:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:57:01,779][1653645] Updated weights for policy 0, policy_version 110848 (0.0014) [2024-06-15 12:57:03,009][1653645] Updated weights for policy 0, policy_version 110903 (0.0011) [2024-06-15 12:57:05,658][1653645] Updated weights for policy 0, policy_version 110929 (0.0011) [2024-06-15 12:57:05,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 227213312. Throughput: 0: 10854.4. Samples: 56844288. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:57:10,125][1653645] Updated weights for policy 0, policy_version 110996 (0.0014) [2024-06-15 12:57:10,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 43144.4, 300 sec: 44209.0). Total num frames: 227377152. Throughput: 0: 10922.6. Samples: 56914944. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:57:12,417][1653645] Updated weights for policy 0, policy_version 111056 (0.0012) [2024-06-15 12:57:14,951][1653645] Updated weights for policy 0, policy_version 111163 (0.0133) [2024-06-15 12:57:15,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.8, 300 sec: 44653.3). Total num frames: 227672064. Throughput: 0: 10934.0. Samples: 56974336. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 12:57:17,947][1653645] Updated weights for policy 0, policy_version 111216 (0.0012) [2024-06-15 12:57:20,960][1648982] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 227803136. Throughput: 0: 10865.9. Samples: 57008640. 
Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:20,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:57:23,991][1653645] Updated weights for policy 0, policy_version 111312 (0.0084) [2024-06-15 12:57:25,336][1651596] Signal inference workers to stop experience collection... (5750 times) [2024-06-15 12:57:25,374][1653645] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-06-15 12:57:25,735][1651596] Signal inference workers to resume experience collection... (5750 times) [2024-06-15 12:57:25,747][1653645] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-06-15 12:57:25,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 228098048. Throughput: 0: 11138.8. Samples: 57082368. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:57:26,291][1653645] Updated weights for policy 0, policy_version 111399 (0.0012) [2024-06-15 12:57:29,248][1653645] Updated weights for policy 0, policy_version 111472 (0.0018) [2024-06-15 12:57:30,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 228327424. Throughput: 0: 10968.2. Samples: 57143296. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 12:57:33,704][1653645] Updated weights for policy 0, policy_version 111506 (0.0014) [2024-06-15 12:57:35,690][1653645] Updated weights for policy 0, policy_version 111553 (0.0013) [2024-06-15 12:57:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44237.3, 300 sec: 44098.0). Total num frames: 228491264. Throughput: 0: 11059.2. Samples: 57184256. 
Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:57:37,332][1653645] Updated weights for policy 0, policy_version 111620 (0.0012) [2024-06-15 12:57:38,451][1653645] Updated weights for policy 0, policy_version 111680 (0.0071) [2024-06-15 12:57:40,959][1648982] Fps is (10 sec: 39317.4, 60 sec: 43689.9, 300 sec: 44431.0). Total num frames: 228720640. Throughput: 0: 11093.1. Samples: 57243648. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:40,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:57:42,003][1653645] Updated weights for policy 0, policy_version 111738 (0.0013) [2024-06-15 12:57:45,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 228950016. Throughput: 0: 11104.7. Samples: 57316352. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:57:46,053][1653645] Updated weights for policy 0, policy_version 111806 (0.0012) [2024-06-15 12:57:49,561][1653645] Updated weights for policy 0, policy_version 111892 (0.0013) [2024-06-15 12:57:50,966][1648982] Fps is (10 sec: 52389.1, 60 sec: 44776.4, 300 sec: 44874.2). Total num frames: 229244928. Throughput: 0: 11170.8. Samples: 57347072. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:50,967][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:57:53,313][1653645] Updated weights for policy 0, policy_version 111968 (0.0012) [2024-06-15 12:57:54,007][1653645] Updated weights for policy 0, policy_version 112000 (0.0015) [2024-06-15 12:57:55,982][1648982] Fps is (10 sec: 42494.3, 60 sec: 43672.9, 300 sec: 44205.4). Total num frames: 229376000. Throughput: 0: 10939.5. Samples: 57407488. 
Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:57:55,983][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 12:57:58,490][1653645] Updated weights for policy 0, policy_version 112057 (0.0014) [2024-06-15 12:58:00,958][1648982] Fps is (10 sec: 36076.0, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 229605376. Throughput: 0: 11161.7. Samples: 57476608. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:58:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 12:58:00,999][1653645] Updated weights for policy 0, policy_version 112128 (0.0125) [2024-06-15 12:58:02,532][1653645] Updated weights for policy 0, policy_version 112190 (0.0014) [2024-06-15 12:58:05,588][1653645] Updated weights for policy 0, policy_version 112243 (0.0087) [2024-06-15 12:58:05,958][1648982] Fps is (10 sec: 52556.4, 60 sec: 44782.8, 300 sec: 44542.2). Total num frames: 229900288. Throughput: 0: 11013.7. Samples: 57504256. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 12:58:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:58:10,429][1653645] Updated weights for policy 0, policy_version 112304 (0.0026) [2024-06-15 12:58:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 230031360. Throughput: 0: 10979.5. Samples: 57576448. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:58:12,185][1653645] Updated weights for policy 0, policy_version 112352 (0.0011) [2024-06-15 12:58:12,309][1651596] Signal inference workers to stop experience collection... (5800 times) [2024-06-15 12:58:12,342][1653645] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-06-15 12:58:12,623][1651596] Signal inference workers to resume experience collection... 
(5800 times) [2024-06-15 12:58:12,642][1653645] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-06-15 12:58:14,422][1653645] Updated weights for policy 0, policy_version 112416 (0.0012) [2024-06-15 12:58:15,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 230293504. Throughput: 0: 10877.2. Samples: 57632768. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:58:17,104][1653645] Updated weights for policy 0, policy_version 112450 (0.0012) [2024-06-15 12:58:18,629][1653645] Updated weights for policy 0, policy_version 112512 (0.0013) [2024-06-15 12:58:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 230424576. Throughput: 0: 10661.0. Samples: 57664000. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 12:58:23,010][1653645] Updated weights for policy 0, policy_version 112574 (0.0014) [2024-06-15 12:58:24,500][1653645] Updated weights for policy 0, policy_version 112626 (0.0012) [2024-06-15 12:58:25,960][1648982] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 230686720. Throughput: 0: 10763.6. Samples: 57728000. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:25,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:58:27,171][1653645] Updated weights for policy 0, policy_version 112680 (0.0012) [2024-06-15 12:58:29,189][1653645] Updated weights for policy 0, policy_version 112706 (0.0011) [2024-06-15 12:58:30,341][1653645] Updated weights for policy 0, policy_version 112761 (0.0013) [2024-06-15 12:58:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 230948864. Throughput: 0: 10774.7. Samples: 57801216. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:58:35,464][1653645] Updated weights for policy 0, policy_version 112848 (0.0011) [2024-06-15 12:58:35,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 44236.9, 300 sec: 44209.1). Total num frames: 231145472. Throughput: 0: 10947.5. Samples: 57839616. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:35,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 12:58:36,525][1653645] Updated weights for policy 0, policy_version 112896 (0.0034) [2024-06-15 12:58:39,360][1653645] Updated weights for policy 0, policy_version 112949 (0.0036) [2024-06-15 12:58:40,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43691.5, 300 sec: 43986.9). Total num frames: 231342080. Throughput: 0: 10723.7. Samples: 57889792. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:58:41,815][1653645] Updated weights for policy 0, policy_version 112992 (0.0011) [2024-06-15 12:58:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 231505920. Throughput: 0: 10888.5. Samples: 57966592. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:58:46,574][1653645] Updated weights for policy 0, policy_version 113072 (0.0011) [2024-06-15 12:58:48,350][1653645] Updated weights for policy 0, policy_version 113152 (0.0021) [2024-06-15 12:58:50,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 42604.5, 300 sec: 43986.9). Total num frames: 231800832. Throughput: 0: 10831.7. Samples: 57991680. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 12:58:51,360][1653645] Updated weights for policy 0, policy_version 113204 (0.0013) [2024-06-15 12:58:53,007][1653645] Updated weights for policy 0, policy_version 113220 (0.0010) [2024-06-15 12:58:54,281][1653645] Updated weights for policy 0, policy_version 113275 (0.0012) [2024-06-15 12:58:55,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 43708.3, 300 sec: 43986.8). Total num frames: 231997440. Throughput: 0: 10706.5. Samples: 58058240. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:58:55,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 12:58:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000113280_231997440.pth... [2024-06-15 12:58:56,011][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000108160_221511680.pth [2024-06-15 12:58:57,852][1653645] Updated weights for policy 0, policy_version 113328 (0.0015) [2024-06-15 12:58:59,563][1651596] Signal inference workers to stop experience collection... (5850 times) [2024-06-15 12:58:59,638][1653645] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-06-15 12:58:59,791][1651596] Signal inference workers to resume experience collection... (5850 times) [2024-06-15 12:58:59,793][1653645] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-06-15 12:58:59,795][1653645] Updated weights for policy 0, policy_version 113392 (0.0013) [2024-06-15 12:59:00,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 232259584. Throughput: 0: 11025.1. Samples: 58128896. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:59:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 12:59:03,180][1653645] Updated weights for policy 0, policy_version 113443 (0.0012) [2024-06-15 12:59:04,856][1653645] Updated weights for policy 0, policy_version 113504 (0.0013) [2024-06-15 12:59:05,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 232521728. Throughput: 0: 11025.1. Samples: 58160128. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 12:59:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:59:09,520][1653645] Updated weights for policy 0, policy_version 113584 (0.0013) [2024-06-15 12:59:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 232685568. Throughput: 0: 11173.0. Samples: 58230784. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 12:59:11,752][1653645] Updated weights for policy 0, policy_version 113648 (0.0105) [2024-06-15 12:59:14,300][1653645] Updated weights for policy 0, policy_version 113684 (0.0013) [2024-06-15 12:59:15,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 232914944. Throughput: 0: 11036.4. Samples: 58297856. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:59:16,257][1653645] Updated weights for policy 0, policy_version 113748 (0.0028) [2024-06-15 12:59:17,183][1653645] Updated weights for policy 0, policy_version 113788 (0.0011) [2024-06-15 12:59:20,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 233078784. Throughput: 0: 10968.2. Samples: 58333184. 
Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 12:59:21,571][1653645] Updated weights for policy 0, policy_version 113849 (0.0014) [2024-06-15 12:59:23,129][1653645] Updated weights for policy 0, policy_version 113875 (0.0020) [2024-06-15 12:59:24,119][1653645] Updated weights for policy 0, policy_version 113920 (0.0013) [2024-06-15 12:59:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 233308160. Throughput: 0: 11241.2. Samples: 58395648. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 12:59:27,140][1653645] Updated weights for policy 0, policy_version 113981 (0.0014) [2024-06-15 12:59:28,794][1653645] Updated weights for policy 0, policy_version 114044 (0.0013) [2024-06-15 12:59:30,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 233570304. Throughput: 0: 10979.5. Samples: 58460672. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:30,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 12:59:33,703][1653645] Updated weights for policy 0, policy_version 114101 (0.0014) [2024-06-15 12:59:34,970][1653645] Updated weights for policy 0, policy_version 114128 (0.0026) [2024-06-15 12:59:35,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 233799680. Throughput: 0: 11150.2. Samples: 58493440. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 12:59:38,470][1653645] Updated weights for policy 0, policy_version 114181 (0.0012) [2024-06-15 12:59:40,971][1648982] Fps is (10 sec: 42544.7, 60 sec: 44227.3, 300 sec: 44096.0). Total num frames: 233996288. Throughput: 0: 11078.9. Samples: 58556928. 
Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:40,971][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 12:59:41,014][1653645] Updated weights for policy 0, policy_version 114272 (0.0011) [2024-06-15 12:59:45,101][1653645] Updated weights for policy 0, policy_version 114322 (0.0014) [2024-06-15 12:59:45,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 234192896. Throughput: 0: 11002.3. Samples: 58624000. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 12:59:46,347][1653645] Updated weights for policy 0, policy_version 114368 (0.0014) [2024-06-15 12:59:48,334][1653645] Updated weights for policy 0, policy_version 114431 (0.0013) [2024-06-15 12:59:50,003][1651596] Signal inference workers to stop experience collection... (5900 times) [2024-06-15 12:59:50,056][1653645] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-06-15 12:59:50,212][1651596] Signal inference workers to resume experience collection... (5900 times) [2024-06-15 12:59:50,213][1653645] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-06-15 12:59:50,692][1653645] Updated weights for policy 0, policy_version 114488 (0.0013) [2024-06-15 12:59:50,958][1648982] Fps is (10 sec: 49215.5, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 234487808. Throughput: 0: 10922.7. Samples: 58651648. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:50,978][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 12:59:53,509][1653645] Updated weights for policy 0, policy_version 114554 (0.0013) [2024-06-15 12:59:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 234618880. Throughput: 0: 10911.3. Samples: 58721792. 
Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 12:59:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 12:59:57,750][1653645] Updated weights for policy 0, policy_version 114620 (0.0022) [2024-06-15 13:00:00,618][1653645] Updated weights for policy 0, policy_version 114686 (0.0014) [2024-06-15 13:00:00,959][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 234881024. Throughput: 0: 10672.4. Samples: 58778112. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 13:00:00,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:00:03,564][1653645] Updated weights for policy 0, policy_version 114740 (0.0012) [2024-06-15 13:00:05,895][1653645] Updated weights for policy 0, policy_version 114787 (0.0013) [2024-06-15 13:00:05,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 235077632. Throughput: 0: 10615.5. Samples: 58810880. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 13:00:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:00:09,433][1653645] Updated weights for policy 0, policy_version 114848 (0.0013) [2024-06-15 13:00:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 235274240. Throughput: 0: 10843.1. Samples: 58883584. Policy #0 lag: (min: 1.0, avg: 121.0, max: 257.0) [2024-06-15 13:00:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:00:12,473][1653645] Updated weights for policy 0, policy_version 114928 (0.0013) [2024-06-15 13:00:14,636][1653645] Updated weights for policy 0, policy_version 114976 (0.0019) [2024-06-15 13:00:15,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 235536384. Throughput: 0: 10774.8. Samples: 58945536. 
Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:00:17,615][1653645] Updated weights for policy 0, policy_version 115040 (0.0014) [2024-06-15 13:00:18,527][1653645] Updated weights for policy 0, policy_version 115072 (0.0012) [2024-06-15 13:00:20,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43764.8). Total num frames: 235700224. Throughput: 0: 10797.5. Samples: 58979328. Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:20,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 13:00:21,584][1653645] Updated weights for policy 0, policy_version 115130 (0.0064) [2024-06-15 13:00:23,471][1653645] Updated weights for policy 0, policy_version 115172 (0.0078) [2024-06-15 13:00:25,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.9, 300 sec: 43653.6). Total num frames: 235962368. Throughput: 0: 10982.7. Samples: 59051008. Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:00:26,659][1653645] Updated weights for policy 0, policy_version 115253 (0.0014) [2024-06-15 13:00:29,616][1653645] Updated weights for policy 0, policy_version 115319 (0.0014) [2024-06-15 13:00:30,963][1648982] Fps is (10 sec: 49125.1, 60 sec: 43686.8, 300 sec: 43986.1). Total num frames: 236191744. Throughput: 0: 10875.8. Samples: 59113472. Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:30,964][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:00:33,176][1653645] Updated weights for policy 0, policy_version 115391 (0.0013) [2024-06-15 13:00:35,539][1653645] Updated weights for policy 0, policy_version 115456 (0.0013) [2024-06-15 13:00:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 236453888. Throughput: 0: 11150.2. Samples: 59153408. 
Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:00:38,886][1653645] Updated weights for policy 0, policy_version 115520 (0.0012) [2024-06-15 13:00:40,128][1651596] Signal inference workers to stop experience collection... (5950 times) [2024-06-15 13:00:40,216][1653645] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-06-15 13:00:40,424][1651596] Signal inference workers to resume experience collection... (5950 times) [2024-06-15 13:00:40,426][1653645] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-06-15 13:00:40,958][1648982] Fps is (10 sec: 42621.6, 60 sec: 43700.0, 300 sec: 43764.7). Total num frames: 236617728. Throughput: 0: 10945.4. Samples: 59214336. Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:40,960][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 13:00:41,677][1653645] Updated weights for policy 0, policy_version 115576 (0.0162) [2024-06-15 13:00:44,218][1653645] Updated weights for policy 0, policy_version 115618 (0.0011) [2024-06-15 13:00:45,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44236.8, 300 sec: 43764.8). Total num frames: 236847104. Throughput: 0: 11355.0. Samples: 59289088. Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 13:00:45,981][1653645] Updated weights for policy 0, policy_version 115652 (0.0014) [2024-06-15 13:00:49,601][1653645] Updated weights for policy 0, policy_version 115714 (0.0014) [2024-06-15 13:00:50,813][1653645] Updated weights for policy 0, policy_version 115770 (0.0016) [2024-06-15 13:00:50,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 237109248. Throughput: 0: 11320.9. Samples: 59320320. 
Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:50,959][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 13:00:52,909][1653645] Updated weights for policy 0, policy_version 115836 (0.0012) [2024-06-15 13:00:55,506][1653645] Updated weights for policy 0, policy_version 115896 (0.0013) [2024-06-15 13:00:55,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 237371392. Throughput: 0: 11355.0. Samples: 59394560. Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:00:55,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:00:55,978][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000115904_237371392.pth... [2024-06-15 13:00:56,017][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000110720_226754560.pth [2024-06-15 13:00:58,070][1653645] Updated weights for policy 0, policy_version 115957 (0.0120) [2024-06-15 13:01:00,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 237502464. Throughput: 0: 11514.3. Samples: 59463680. Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:01:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:01:02,028][1653645] Updated weights for policy 0, policy_version 116016 (0.0014) [2024-06-15 13:01:02,977][1653645] Updated weights for policy 0, policy_version 116052 (0.0023) [2024-06-15 13:01:04,081][1653645] Updated weights for policy 0, policy_version 116095 (0.0015) [2024-06-15 13:01:05,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 237764608. Throughput: 0: 11389.2. Samples: 59491840. 
Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:01:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:01:06,830][1653645] Updated weights for policy 0, policy_version 116158 (0.0123) [2024-06-15 13:01:09,600][1653645] Updated weights for policy 0, policy_version 116218 (0.0014) [2024-06-15 13:01:10,960][1648982] Fps is (10 sec: 52416.0, 60 sec: 45873.4, 300 sec: 43986.6). Total num frames: 238026752. Throughput: 0: 11399.9. Samples: 59564032. Policy #0 lag: (min: 31.0, avg: 144.8, max: 287.0) [2024-06-15 13:01:10,961][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:01:13,252][1653645] Updated weights for policy 0, policy_version 116260 (0.0013) [2024-06-15 13:01:14,924][1653645] Updated weights for policy 0, policy_version 116320 (0.0013) [2024-06-15 13:01:15,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 238288896. Throughput: 0: 11367.8. Samples: 59624960. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:15,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:01:17,601][1653645] Updated weights for policy 0, policy_version 116361 (0.0013) [2024-06-15 13:01:18,745][1653645] Updated weights for policy 0, policy_version 116413 (0.0012) [2024-06-15 13:01:20,958][1648982] Fps is (10 sec: 42608.4, 60 sec: 45875.1, 300 sec: 44209.0). Total num frames: 238452736. Throughput: 0: 11411.9. Samples: 59666944. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:20,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:01:21,374][1653645] Updated weights for policy 0, policy_version 116464 (0.0012) [2024-06-15 13:01:24,673][1653645] Updated weights for policy 0, policy_version 116500 (0.0011) [2024-06-15 13:01:25,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 238682112. Throughput: 0: 11650.9. Samples: 59738624. 
Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:25,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 13:01:26,538][1651596] Signal inference workers to stop experience collection... (6000 times) [2024-06-15 13:01:26,559][1653645] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-06-15 13:01:26,794][1651596] Signal inference workers to resume experience collection... (6000 times) [2024-06-15 13:01:26,795][1653645] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-06-15 13:01:26,797][1653645] Updated weights for policy 0, policy_version 116576 (0.0161) [2024-06-15 13:01:30,196][1653645] Updated weights for policy 0, policy_version 116664 (0.0013) [2024-06-15 13:01:30,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45879.4, 300 sec: 44431.3). Total num frames: 238944256. Throughput: 0: 11173.0. Samples: 59791872. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:01:33,726][1653645] Updated weights for policy 0, policy_version 116710 (0.0012) [2024-06-15 13:01:35,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 239075328. Throughput: 0: 11366.4. Samples: 59831808. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:01:37,504][1653645] Updated weights for policy 0, policy_version 116784 (0.0055) [2024-06-15 13:01:38,777][1653645] Updated weights for policy 0, policy_version 116837 (0.0013) [2024-06-15 13:01:40,907][1653645] Updated weights for policy 0, policy_version 116883 (0.0012) [2024-06-15 13:01:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 239370240. Throughput: 0: 11116.1. Samples: 59894784. 
Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:40,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 13:01:44,767][1653645] Updated weights for policy 0, policy_version 116930 (0.0014) [2024-06-15 13:01:45,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 45328.9, 300 sec: 44097.9). Total num frames: 239566848. Throughput: 0: 11252.6. Samples: 59970048. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:45,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:01:45,981][1653645] Updated weights for policy 0, policy_version 116990 (0.0014) [2024-06-15 13:01:49,168][1653645] Updated weights for policy 0, policy_version 117044 (0.0012) [2024-06-15 13:01:50,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45875.4, 300 sec: 44431.2). Total num frames: 239861760. Throughput: 0: 11355.0. Samples: 60002816. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:50,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 13:01:51,811][1653645] Updated weights for policy 0, policy_version 117122 (0.0013) [2024-06-15 13:01:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 239992832. Throughput: 0: 11082.5. Samples: 60062720. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:01:55,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:01:56,581][1653645] Updated weights for policy 0, policy_version 117190 (0.0013) [2024-06-15 13:01:57,769][1653645] Updated weights for policy 0, policy_version 117244 (0.0012) [2024-06-15 13:02:00,818][1653645] Updated weights for policy 0, policy_version 117304 (0.0056) [2024-06-15 13:02:00,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 240222208. Throughput: 0: 11229.9. Samples: 60130304. 
Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:02:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:02:02,675][1653645] Updated weights for policy 0, policy_version 117348 (0.0012) [2024-06-15 13:02:04,477][1653645] Updated weights for policy 0, policy_version 117424 (0.0016) [2024-06-15 13:02:05,965][1648982] Fps is (10 sec: 52392.5, 60 sec: 45869.6, 300 sec: 44541.2). Total num frames: 240517120. Throughput: 0: 10977.8. Samples: 60161024. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:02:05,966][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:02:09,513][1653645] Updated weights for policy 0, policy_version 117458 (0.0014) [2024-06-15 13:02:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43692.4, 300 sec: 43986.9). Total num frames: 240648192. Throughput: 0: 10922.6. Samples: 60230144. Policy #0 lag: (min: 47.0, avg: 139.8, max: 303.0) [2024-06-15 13:02:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:02:11,823][1653645] Updated weights for policy 0, policy_version 117507 (0.0013) [2024-06-15 13:02:13,164][1653645] Updated weights for policy 0, policy_version 117568 (0.0016) [2024-06-15 13:02:14,284][1651596] Signal inference workers to stop experience collection... (6050 times) [2024-06-15 13:02:14,353][1653645] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-06-15 13:02:14,634][1651596] Signal inference workers to resume experience collection... (6050 times) [2024-06-15 13:02:14,636][1653645] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-06-15 13:02:15,255][1653645] Updated weights for policy 0, policy_version 117618 (0.0013) [2024-06-15 13:02:15,958][1648982] Fps is (10 sec: 42629.3, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 240943104. Throughput: 0: 11127.5. Samples: 60292608. 
Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:02:16,769][1653645] Updated weights for policy 0, policy_version 117688 (0.0012) [2024-06-15 13:02:20,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43144.4, 300 sec: 43875.8). Total num frames: 241041408. Throughput: 0: 10934.0. Samples: 60323840. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:02:22,116][1653645] Updated weights for policy 0, policy_version 117757 (0.0013) [2024-06-15 13:02:24,383][1653645] Updated weights for policy 0, policy_version 117824 (0.0013) [2024-06-15 13:02:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 241303552. Throughput: 0: 11116.1. Samples: 60395008. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:02:27,340][1653645] Updated weights for policy 0, policy_version 117890 (0.0025) [2024-06-15 13:02:28,634][1653645] Updated weights for policy 0, policy_version 117952 (0.0012) [2024-06-15 13:02:30,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 241565696. Throughput: 0: 10922.7. Samples: 60461568. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:02:33,870][1653645] Updated weights for policy 0, policy_version 118011 (0.0013) [2024-06-15 13:02:35,772][1653645] Updated weights for policy 0, policy_version 118069 (0.0011) [2024-06-15 13:02:35,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 44320.3). Total num frames: 241795072. Throughput: 0: 11059.2. Samples: 60500480. 
Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:02:37,658][1653645] Updated weights for policy 0, policy_version 118098 (0.0012) [2024-06-15 13:02:38,952][1653645] Updated weights for policy 0, policy_version 118165 (0.0013) [2024-06-15 13:02:40,038][1653645] Updated weights for policy 0, policy_version 118202 (0.0012) [2024-06-15 13:02:40,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45329.1, 300 sec: 44542.3). Total num frames: 242089984. Throughput: 0: 11184.4. Samples: 60566016. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:02:45,701][1653645] Updated weights for policy 0, policy_version 118275 (0.0012) [2024-06-15 13:02:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.1, 300 sec: 44099.2). Total num frames: 242253824. Throughput: 0: 11173.0. Samples: 60633088. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:02:47,118][1653645] Updated weights for policy 0, policy_version 118336 (0.0016) [2024-06-15 13:02:50,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 43690.4, 300 sec: 44434.8). Total num frames: 242483200. Throughput: 0: 11243.0. Samples: 60666880. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:50,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:02:51,318][1653645] Updated weights for policy 0, policy_version 118417 (0.0016) [2024-06-15 13:02:52,311][1653645] Updated weights for policy 0, policy_version 118464 (0.0014) [2024-06-15 13:02:55,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.8, 300 sec: 44097.9). Total num frames: 242614272. Throughput: 0: 11138.8. Samples: 60731392. 
Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:02:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 13:02:55,977][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000118464_242614272.pth... [2024-06-15 13:02:56,086][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000113280_231997440.pth [2024-06-15 13:02:57,951][1651596] Signal inference workers to stop experience collection... (6100 times) [2024-06-15 13:02:57,996][1653645] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-06-15 13:02:58,149][1651596] Signal inference workers to resume experience collection... (6100 times) [2024-06-15 13:02:58,149][1653645] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-06-15 13:02:58,871][1653645] Updated weights for policy 0, policy_version 118564 (0.0016) [2024-06-15 13:03:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.6, 300 sec: 43986.9). Total num frames: 242876416. Throughput: 0: 11150.1. Samples: 60794368. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:03:00,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:03:02,124][1653645] Updated weights for policy 0, policy_version 118628 (0.0015) [2024-06-15 13:03:04,134][1653645] Updated weights for policy 0, policy_version 118704 (0.0013) [2024-06-15 13:03:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43695.9, 300 sec: 44431.2). Total num frames: 243138560. Throughput: 0: 11150.3. Samples: 60825600. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:03:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:03:09,441][1653645] Updated weights for policy 0, policy_version 118736 (0.0021) [2024-06-15 13:03:10,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 243269632. Throughput: 0: 11127.5. Samples: 60895744. 
Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:03:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:03:11,227][1653645] Updated weights for policy 0, policy_version 118800 (0.0012) [2024-06-15 13:03:12,336][1653645] Updated weights for policy 0, policy_version 118848 (0.0108) [2024-06-15 13:03:15,569][1653645] Updated weights for policy 0, policy_version 118930 (0.0014) [2024-06-15 13:03:15,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 243597312. Throughput: 0: 10831.6. Samples: 60948992. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:03:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:03:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 243662848. Throughput: 0: 10717.9. Samples: 60982784. Policy #0 lag: (min: 40.0, avg: 155.9, max: 296.0) [2024-06-15 13:03:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:03:21,615][1653645] Updated weights for policy 0, policy_version 118978 (0.0012) [2024-06-15 13:03:23,154][1653645] Updated weights for policy 0, policy_version 119043 (0.0012) [2024-06-15 13:03:25,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 243924992. Throughput: 0: 10763.3. Samples: 61050368. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:03:25,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:03:26,304][1653645] Updated weights for policy 0, policy_version 119121 (0.0015) [2024-06-15 13:03:27,750][1653645] Updated weights for policy 0, policy_version 119185 (0.0025) [2024-06-15 13:03:28,811][1653645] Updated weights for policy 0, policy_version 119231 (0.0013) [2024-06-15 13:03:30,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 244187136. Throughput: 0: 10808.9. Samples: 61119488. 
Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:03:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:03:34,379][1653645] Updated weights for policy 0, policy_version 119297 (0.0012) [2024-06-15 13:03:35,409][1653645] Updated weights for policy 0, policy_version 119348 (0.0013) [2024-06-15 13:03:35,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 244449280. Throughput: 0: 11002.4. Samples: 61161984. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:03:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:03:38,086][1653645] Updated weights for policy 0, policy_version 119408 (0.0013) [2024-06-15 13:03:38,952][1651596] Signal inference workers to stop experience collection... (6150 times) [2024-06-15 13:03:39,002][1653645] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-06-15 13:03:39,216][1651596] Signal inference workers to resume experience collection... (6150 times) [2024-06-15 13:03:39,218][1653645] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-06-15 13:03:39,849][1653645] Updated weights for policy 0, policy_version 119476 (0.0014) [2024-06-15 13:03:40,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 244711424. Throughput: 0: 10854.4. Samples: 61219840. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:03:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:03:45,549][1653645] Updated weights for policy 0, policy_version 119526 (0.0024) [2024-06-15 13:03:45,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42598.4, 300 sec: 44097.9). Total num frames: 244809728. Throughput: 0: 11082.0. Samples: 61293056. 
Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:03:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:03:47,312][1653645] Updated weights for policy 0, policy_version 119600 (0.0014) [2024-06-15 13:03:49,625][1653645] Updated weights for policy 0, policy_version 119650 (0.0013) [2024-06-15 13:03:50,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44237.0, 300 sec: 44542.3). Total num frames: 245137408. Throughput: 0: 10990.9. Samples: 61320192. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:03:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:03:51,662][1653645] Updated weights for policy 0, policy_version 119734 (0.0106) [2024-06-15 13:03:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 245235712. Throughput: 0: 10820.3. Samples: 61382656. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:03:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:03:58,410][1653645] Updated weights for policy 0, policy_version 119793 (0.0011) [2024-06-15 13:04:00,095][1653645] Updated weights for policy 0, policy_version 119870 (0.0014) [2024-06-15 13:04:00,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 43690.7, 300 sec: 43986.8). Total num frames: 245497856. Throughput: 0: 11013.7. Samples: 61444608. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:04:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:04:02,967][1653645] Updated weights for policy 0, policy_version 119936 (0.0015) [2024-06-15 13:04:04,240][1653645] Updated weights for policy 0, policy_version 119993 (0.0014) [2024-06-15 13:04:05,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 245760000. Throughput: 0: 10922.6. Samples: 61474304. 
Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:04:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:04:10,775][1653645] Updated weights for policy 0, policy_version 120051 (0.0012) [2024-06-15 13:04:10,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43144.4, 300 sec: 43875.8). Total num frames: 245858304. Throughput: 0: 11082.0. Samples: 61549056. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:04:10,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 13:04:12,415][1653645] Updated weights for policy 0, policy_version 120120 (0.0013) [2024-06-15 13:04:14,350][1653645] Updated weights for policy 0, policy_version 120176 (0.0045) [2024-06-15 13:04:15,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 246251520. Throughput: 0: 10752.0. Samples: 61603328. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:04:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:04:16,018][1653645] Updated weights for policy 0, policy_version 120252 (0.0013) [2024-06-15 13:04:20,960][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 246284288. Throughput: 0: 10626.8. Samples: 61640192. Policy #0 lag: (min: 3.0, avg: 68.0, max: 243.0) [2024-06-15 13:04:20,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:04:23,404][1653645] Updated weights for policy 0, policy_version 120336 (0.0013) [2024-06-15 13:04:23,968][1651596] Signal inference workers to stop experience collection... (6200 times) [2024-06-15 13:04:24,022][1653645] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-06-15 13:04:24,223][1651596] Signal inference workers to resume experience collection... 
(6200 times) [2024-06-15 13:04:24,224][1653645] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-06-15 13:04:25,834][1653645] Updated weights for policy 0, policy_version 120390 (0.0018) [2024-06-15 13:04:25,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 246546432. Throughput: 0: 10695.1. Samples: 61701120. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:04:25,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:04:28,613][1653645] Updated weights for policy 0, policy_version 120496 (0.0108) [2024-06-15 13:04:30,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 246808576. Throughput: 0: 10570.0. Samples: 61768704. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:04:30,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:04:34,587][1653645] Updated weights for policy 0, policy_version 120560 (0.0014) [2024-06-15 13:04:35,867][1653645] Updated weights for policy 0, policy_version 120608 (0.0012) [2024-06-15 13:04:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 44099.9). Total num frames: 247005184. Throughput: 0: 10854.4. Samples: 61808640. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:04:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:04:36,492][1653645] Updated weights for policy 0, policy_version 120639 (0.0013) [2024-06-15 13:04:38,593][1653645] Updated weights for policy 0, policy_version 120688 (0.0027) [2024-06-15 13:04:39,643][1653645] Updated weights for policy 0, policy_version 120724 (0.0012) [2024-06-15 13:04:40,622][1653645] Updated weights for policy 0, policy_version 120768 (0.0013) [2024-06-15 13:04:40,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 44542.2). Total num frames: 247332864. Throughput: 0: 10843.0. Samples: 61870592. 
Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:04:40,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:04:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 43764.7). Total num frames: 247398400. Throughput: 0: 11104.8. Samples: 61944320. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:04:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:04:46,644][1653645] Updated weights for policy 0, policy_version 120837 (0.0019) [2024-06-15 13:04:49,183][1653645] Updated weights for policy 0, policy_version 120899 (0.0059) [2024-06-15 13:04:50,959][1653645] Updated weights for policy 0, policy_version 120965 (0.0117) [2024-06-15 13:04:50,963][1648982] Fps is (10 sec: 39303.3, 60 sec: 43141.1, 300 sec: 44430.5). Total num frames: 247726080. Throughput: 0: 11080.8. Samples: 61972992. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:04:50,963][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:04:52,136][1653645] Updated weights for policy 0, policy_version 121017 (0.0041) [2024-06-15 13:04:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 247857152. Throughput: 0: 10945.4. Samples: 62041600. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:04:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:04:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000121024_247857152.pth... 
[2024-06-15 13:04:56,033][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000115904_237371392.pth [2024-06-15 13:04:57,635][1653645] Updated weights for policy 0, policy_version 121059 (0.0013) [2024-06-15 13:04:58,692][1653645] Updated weights for policy 0, policy_version 121108 (0.0013) [2024-06-15 13:05:00,820][1653645] Updated weights for policy 0, policy_version 121185 (0.0016) [2024-06-15 13:05:00,958][1648982] Fps is (10 sec: 45896.0, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 248184832. Throughput: 0: 11286.7. Samples: 62111232. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:05:00,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:05:01,785][1653645] Updated weights for policy 0, policy_version 121232 (0.0013) [2024-06-15 13:05:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 248381440. Throughput: 0: 11218.5. Samples: 62145024. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:05:05,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:05:08,046][1653645] Updated weights for policy 0, policy_version 121282 (0.0012) [2024-06-15 13:05:08,850][1651596] Signal inference workers to stop experience collection... (6250 times) [2024-06-15 13:05:08,888][1653645] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-06-15 13:05:09,081][1651596] Signal inference workers to resume experience collection... (6250 times) [2024-06-15 13:05:09,082][1653645] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-06-15 13:05:09,617][1653645] Updated weights for policy 0, policy_version 121349 (0.0027) [2024-06-15 13:05:10,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 248643584. Throughput: 0: 11468.7. Samples: 62217216. 
Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:05:10,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:05:12,880][1653645] Updated weights for policy 0, policy_version 121445 (0.0015) [2024-06-15 13:05:13,663][1653645] Updated weights for policy 0, policy_version 121474 (0.0018) [2024-06-15 13:05:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 248905728. Throughput: 0: 11218.5. Samples: 62273536. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:05:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:05:19,690][1653645] Updated weights for policy 0, policy_version 121540 (0.0018) [2024-06-15 13:05:20,589][1653645] Updated weights for policy 0, policy_version 121593 (0.0012) [2024-06-15 13:05:20,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 45875.3, 300 sec: 44320.1). Total num frames: 249036800. Throughput: 0: 11241.3. Samples: 62314496. Policy #0 lag: (min: 47.0, avg: 130.8, max: 319.0) [2024-06-15 13:05:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:05:21,987][1653645] Updated weights for policy 0, policy_version 121648 (0.0013) [2024-06-15 13:05:24,376][1653645] Updated weights for policy 0, policy_version 121712 (0.0014) [2024-06-15 13:05:25,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 46421.2, 300 sec: 44543.1). Total num frames: 249331712. Throughput: 0: 11332.2. Samples: 62380544. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:05:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:05:26,718][1653645] Updated weights for policy 0, policy_version 121782 (0.0050) [2024-06-15 13:05:30,967][1648982] Fps is (10 sec: 39287.6, 60 sec: 43684.4, 300 sec: 43985.6). Total num frames: 249430016. Throughput: 0: 11193.6. Samples: 62448128. 
Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:05:30,971][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:05:32,130][1653645] Updated weights for policy 0, policy_version 121850 (0.0017) [2024-06-15 13:05:34,384][1653645] Updated weights for policy 0, policy_version 121907 (0.0066) [2024-06-15 13:05:35,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 46421.4, 300 sec: 44653.3). Total num frames: 249790464. Throughput: 0: 11401.7. Samples: 62486016. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:05:35,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:05:35,998][1653645] Updated weights for policy 0, policy_version 121976 (0.0015) [2024-06-15 13:05:38,642][1653645] Updated weights for policy 0, policy_version 122016 (0.0012) [2024-06-15 13:05:40,958][1648982] Fps is (10 sec: 52471.8, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 249954304. Throughput: 0: 11127.4. Samples: 62542336. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:05:40,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:05:43,568][1653645] Updated weights for policy 0, policy_version 122080 (0.0012) [2024-06-15 13:05:45,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 250118144. Throughput: 0: 11229.9. Samples: 62616576. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:05:45,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:05:46,091][1653645] Updated weights for policy 0, policy_version 122130 (0.0015) [2024-06-15 13:05:47,711][1653645] Updated weights for policy 0, policy_version 122197 (0.0013) [2024-06-15 13:05:49,567][1653645] Updated weights for policy 0, policy_version 122258 (0.0013) [2024-06-15 13:05:50,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 45878.6, 300 sec: 44431.2). Total num frames: 250478592. Throughput: 0: 11093.3. Samples: 62644224. 
Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:05:50,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:05:55,059][1653645] Updated weights for policy 0, policy_version 122320 (0.0013) [2024-06-15 13:05:55,181][1651596] Signal inference workers to stop experience collection... (6300 times) [2024-06-15 13:05:55,301][1653645] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-06-15 13:05:55,326][1651596] Signal inference workers to resume experience collection... (6300 times) [2024-06-15 13:05:55,328][1653645] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-06-15 13:05:55,787][1653645] Updated weights for policy 0, policy_version 122363 (0.0014) [2024-06-15 13:05:55,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 250609664. Throughput: 0: 10991.0. Samples: 62711808. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:05:55,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:05:59,913][1653645] Updated weights for policy 0, policy_version 122450 (0.0014) [2024-06-15 13:06:00,958][1648982] Fps is (10 sec: 36045.8, 60 sec: 44237.0, 300 sec: 44320.1). Total num frames: 250839040. Throughput: 0: 11013.7. Samples: 62769152. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:06:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:06:01,083][1653645] Updated weights for policy 0, policy_version 122496 (0.0137) [2024-06-15 13:06:03,702][1653645] Updated weights for policy 0, policy_version 122555 (0.0012) [2024-06-15 13:06:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43987.2). Total num frames: 251002880. Throughput: 0: 10672.3. Samples: 62794752. 
Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:06:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:06:08,663][1653645] Updated weights for policy 0, policy_version 122616 (0.0015) [2024-06-15 13:06:10,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42598.5, 300 sec: 43764.7). Total num frames: 251199488. Throughput: 0: 10843.1. Samples: 62868480. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:06:10,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:06:11,463][1653645] Updated weights for policy 0, policy_version 122672 (0.0134) [2024-06-15 13:06:13,214][1653645] Updated weights for policy 0, policy_version 122752 (0.0013) [2024-06-15 13:06:15,575][1653645] Updated weights for policy 0, policy_version 122811 (0.0012) [2024-06-15 13:06:15,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 251527168. Throughput: 0: 10617.5. Samples: 62925824. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:06:15,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:06:20,658][1653645] Updated weights for policy 0, policy_version 122870 (0.0013) [2024-06-15 13:06:20,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 251658240. Throughput: 0: 10638.2. Samples: 62964736. Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:06:20,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:06:22,876][1653645] Updated weights for policy 0, policy_version 122899 (0.0018) [2024-06-15 13:06:25,114][1653645] Updated weights for policy 0, policy_version 123005 (0.0102) [2024-06-15 13:06:25,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43144.7, 300 sec: 43986.9). Total num frames: 251920384. Throughput: 0: 10865.9. Samples: 63031296. 
Policy #0 lag: (min: 13.0, avg: 162.2, max: 269.0) [2024-06-15 13:06:25,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:06:27,391][1653645] Updated weights for policy 0, policy_version 123066 (0.0013) [2024-06-15 13:06:30,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 44243.1, 300 sec: 44098.0). Total num frames: 252084224. Throughput: 0: 10786.1. Samples: 63101952. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:06:30,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:06:31,883][1653645] Updated weights for policy 0, policy_version 123124 (0.0013) [2024-06-15 13:06:34,342][1653645] Updated weights for policy 0, policy_version 123152 (0.0011) [2024-06-15 13:06:35,832][1653645] Updated weights for policy 0, policy_version 123216 (0.0012) [2024-06-15 13:06:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 252346368. Throughput: 0: 10991.0. Samples: 63138816. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:06:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:06:37,381][1653645] Updated weights for policy 0, policy_version 123280 (0.0014) [2024-06-15 13:06:37,517][1651596] Signal inference workers to stop experience collection... (6350 times) [2024-06-15 13:06:37,573][1653645] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-06-15 13:06:37,814][1651596] Signal inference workers to resume experience collection... (6350 times) [2024-06-15 13:06:37,815][1653645] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-06-15 13:06:40,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43691.0, 300 sec: 44098.0). Total num frames: 252575744. Throughput: 0: 10774.8. Samples: 63196672. 
Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:06:40,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:06:42,904][1653645] Updated weights for policy 0, policy_version 123344 (0.0013) [2024-06-15 13:06:45,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 252706816. Throughput: 0: 11150.2. Samples: 63270912. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:06:45,960][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:06:46,500][1653645] Updated weights for policy 0, policy_version 123409 (0.0012) [2024-06-15 13:06:48,425][1653645] Updated weights for policy 0, policy_version 123490 (0.0013) [2024-06-15 13:06:49,768][1653645] Updated weights for policy 0, policy_version 123552 (0.0027) [2024-06-15 13:06:50,957][1648982] Fps is (10 sec: 52429.9, 60 sec: 43691.0, 300 sec: 44431.3). Total num frames: 253100032. Throughput: 0: 11138.9. Samples: 63296000. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:06:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:06:55,537][1653645] Updated weights for policy 0, policy_version 123632 (0.0018) [2024-06-15 13:06:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 253231104. Throughput: 0: 11264.0. Samples: 63375360. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:06:55,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:06:55,962][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000123648_253231104.pth... 
[2024-06-15 13:06:56,022][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000118464_242614272.pth [2024-06-15 13:06:57,516][1653645] Updated weights for policy 0, policy_version 123651 (0.0014) [2024-06-15 13:06:58,653][1653645] Updated weights for policy 0, policy_version 123712 (0.0017) [2024-06-15 13:07:00,951][1653645] Updated weights for policy 0, policy_version 123792 (0.0013) [2024-06-15 13:07:00,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44782.9, 300 sec: 44099.0). Total num frames: 253526016. Throughput: 0: 11207.2. Samples: 63430144. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:07:00,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 13:07:01,964][1653645] Updated weights for policy 0, policy_version 123838 (0.0012) [2024-06-15 13:07:05,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 253624320. Throughput: 0: 11161.5. Samples: 63467008. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:07:05,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:07:07,340][1653645] Updated weights for policy 0, policy_version 123888 (0.0012) [2024-06-15 13:07:09,162][1653645] Updated weights for policy 0, policy_version 123952 (0.0012) [2024-06-15 13:07:10,958][1648982] Fps is (10 sec: 42596.9, 60 sec: 45875.0, 300 sec: 44097.9). Total num frames: 253952000. Throughput: 0: 11241.2. Samples: 63537152. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:07:10,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:07:11,606][1653645] Updated weights for policy 0, policy_version 124026 (0.0014) [2024-06-15 13:07:14,521][1653645] Updated weights for policy 0, policy_version 124095 (0.0013) [2024-06-15 13:07:15,958][1648982] Fps is (10 sec: 52430.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 254148608. Throughput: 0: 11116.1. Samples: 63602176. 
Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:07:15,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:07:20,299][1653645] Updated weights for policy 0, policy_version 124167 (0.0052) [2024-06-15 13:07:20,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 254345216. Throughput: 0: 11138.8. Samples: 63640064. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:07:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:07:21,304][1653645] Updated weights for policy 0, policy_version 124218 (0.0013) [2024-06-15 13:07:23,095][1653645] Updated weights for policy 0, policy_version 124274 (0.0023) [2024-06-15 13:07:25,906][1651596] Signal inference workers to stop experience collection... (6400 times) [2024-06-15 13:07:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 254574592. Throughput: 0: 11229.9. Samples: 63702016. Policy #0 lag: (min: 63.0, avg: 208.7, max: 319.0) [2024-06-15 13:07:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:07:25,964][1653645] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-06-15 13:07:26,189][1651596] Signal inference workers to resume experience collection... (6400 times) [2024-06-15 13:07:26,190][1653645] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-06-15 13:07:26,573][1653645] Updated weights for policy 0, policy_version 124336 (0.0012) [2024-06-15 13:07:28,931][1653645] Updated weights for policy 0, policy_version 124353 (0.0012) [2024-06-15 13:07:30,196][1653645] Updated weights for policy 0, policy_version 124410 (0.0014) [2024-06-15 13:07:30,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 254803968. Throughput: 0: 11116.1. Samples: 63771136. 
Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:07:30,960][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:07:34,191][1653645] Updated weights for policy 0, policy_version 124482 (0.0014) [2024-06-15 13:07:35,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 45329.0, 300 sec: 43986.8). Total num frames: 255066112. Throughput: 0: 11320.8. Samples: 63805440. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:07:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:07:37,687][1653645] Updated weights for policy 0, policy_version 124565 (0.0121) [2024-06-15 13:07:40,199][1653645] Updated weights for policy 0, policy_version 124614 (0.0016) [2024-06-15 13:07:40,957][1648982] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 44098.0). Total num frames: 255262720. Throughput: 0: 11013.7. Samples: 63870976. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:07:40,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:07:41,238][1653645] Updated weights for policy 0, policy_version 124667 (0.0012) [2024-06-15 13:07:43,586][1653645] Updated weights for policy 0, policy_version 124726 (0.0165) [2024-06-15 13:07:45,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 255459328. Throughput: 0: 11446.0. Samples: 63945216. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:07:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:07:47,104][1653645] Updated weights for policy 0, policy_version 124784 (0.0040) [2024-06-15 13:07:49,122][1653645] Updated weights for policy 0, policy_version 124816 (0.0013) [2024-06-15 13:07:50,127][1653645] Updated weights for policy 0, policy_version 124864 (0.0013) [2024-06-15 13:07:50,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 255721472. Throughput: 0: 11298.2. Samples: 63975424. 
Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:07:50,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:07:53,055][1653645] Updated weights for policy 0, policy_version 124925 (0.0012) [2024-06-15 13:07:54,652][1653645] Updated weights for policy 0, policy_version 124981 (0.0013) [2024-06-15 13:07:55,966][1648982] Fps is (10 sec: 52384.2, 60 sec: 45868.6, 300 sec: 44429.9). Total num frames: 255983616. Throughput: 0: 11284.7. Samples: 64045056. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:07:55,967][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:07:57,843][1653645] Updated weights for policy 0, policy_version 125024 (0.0012) [2024-06-15 13:08:00,456][1653645] Updated weights for policy 0, policy_version 125074 (0.0026) [2024-06-15 13:08:00,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 256180224. Throughput: 0: 11423.3. Samples: 64116224. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:08:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:08:03,035][1653645] Updated weights for policy 0, policy_version 125136 (0.0012) [2024-06-15 13:08:05,958][1648982] Fps is (10 sec: 42635.3, 60 sec: 46421.6, 300 sec: 44542.3). Total num frames: 256409600. Throughput: 0: 11423.3. Samples: 64154112. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:08:05,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 13:08:06,021][1653645] Updated weights for policy 0, policy_version 125216 (0.0108) [2024-06-15 13:08:10,303][1653645] Updated weights for policy 0, policy_version 125311 (0.0038) [2024-06-15 13:08:10,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 256638976. Throughput: 0: 11457.4. Samples: 64217600. 
Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:08:10,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:08:13,051][1653645] Updated weights for policy 0, policy_version 125372 (0.0016) [2024-06-15 13:08:14,481][1651596] Signal inference workers to stop experience collection... (6450 times) [2024-06-15 13:08:14,535][1653645] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-06-15 13:08:14,664][1651596] Signal inference workers to resume experience collection... (6450 times) [2024-06-15 13:08:14,665][1653645] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-06-15 13:08:15,668][1653645] Updated weights for policy 0, policy_version 125440 (0.0013) [2024-06-15 13:08:15,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 256901120. Throughput: 0: 11366.4. Samples: 64282624. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:08:15,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:08:18,590][1653645] Updated weights for policy 0, policy_version 125504 (0.0027) [2024-06-15 13:08:20,959][1648982] Fps is (10 sec: 39318.2, 60 sec: 44782.3, 300 sec: 44431.1). Total num frames: 257032192. Throughput: 0: 11320.7. Samples: 64314880. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:08:20,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:08:22,474][1653645] Updated weights for policy 0, policy_version 125558 (0.0012) [2024-06-15 13:08:24,151][1653645] Updated weights for policy 0, policy_version 125586 (0.0017) [2024-06-15 13:08:25,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 257294336. Throughput: 0: 11389.1. Samples: 64383488. 
Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:08:25,959][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 13:08:26,245][1653645] Updated weights for policy 0, policy_version 125633 (0.0014) [2024-06-15 13:08:29,722][1653645] Updated weights for policy 0, policy_version 125730 (0.0018) [2024-06-15 13:08:30,958][1648982] Fps is (10 sec: 52433.6, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 257556480. Throughput: 0: 11036.5. Samples: 64441856. Policy #0 lag: (min: 28.0, avg: 123.1, max: 284.0) [2024-06-15 13:08:30,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 13:08:34,852][1653645] Updated weights for policy 0, policy_version 125824 (0.0011) [2024-06-15 13:08:35,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 257687552. Throughput: 0: 11241.2. Samples: 64481280. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:08:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:08:38,636][1653645] Updated weights for policy 0, policy_version 125890 (0.0012) [2024-06-15 13:08:39,816][1653645] Updated weights for policy 0, policy_version 125949 (0.0030) [2024-06-15 13:08:40,958][1648982] Fps is (10 sec: 42595.9, 60 sec: 45328.5, 300 sec: 44653.3). Total num frames: 257982464. Throughput: 0: 11118.1. Samples: 64545280. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:08:40,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 13:08:41,866][1653645] Updated weights for policy 0, policy_version 126016 (0.0024) [2024-06-15 13:08:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 258080768. Throughput: 0: 10979.6. Samples: 64610304. 
Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:08:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:08:47,272][1653645] Updated weights for policy 0, policy_version 126078 (0.0014) [2024-06-15 13:08:49,837][1653645] Updated weights for policy 0, policy_version 126116 (0.0010) [2024-06-15 13:08:50,958][1648982] Fps is (10 sec: 39323.9, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 258375680. Throughput: 0: 10808.9. Samples: 64640512. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:08:50,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:08:51,672][1653645] Updated weights for policy 0, policy_version 126198 (0.0082) [2024-06-15 13:08:53,809][1653645] Updated weights for policy 0, policy_version 126241 (0.0019) [2024-06-15 13:08:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43696.9, 300 sec: 44431.2). Total num frames: 258605056. Throughput: 0: 10706.5. Samples: 64699392. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:08:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:08:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000126272_258605056.pth... [2024-06-15 13:08:56,013][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000121024_247857152.pth [2024-06-15 13:08:58,367][1653645] Updated weights for policy 0, policy_version 126289 (0.0015) [2024-06-15 13:09:00,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 42598.2, 300 sec: 43986.9). Total num frames: 258736128. Throughput: 0: 10911.3. Samples: 64773632. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:09:00,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:09:01,036][1653645] Updated weights for policy 0, policy_version 126337 (0.0013) [2024-06-15 13:09:02,425][1651596] Signal inference workers to stop experience collection... 
(6500 times) [2024-06-15 13:09:02,478][1653645] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-06-15 13:09:02,648][1651596] Signal inference workers to resume experience collection... (6500 times) [2024-06-15 13:09:02,649][1653645] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-06-15 13:09:03,743][1653645] Updated weights for policy 0, policy_version 126458 (0.0019) [2024-06-15 13:09:05,890][1653645] Updated weights for policy 0, policy_version 126512 (0.0013) [2024-06-15 13:09:05,958][1648982] Fps is (10 sec: 49151.0, 60 sec: 44782.7, 300 sec: 44875.5). Total num frames: 259096576. Throughput: 0: 10763.5. Samples: 64799232. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:09:05,959][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 13:09:10,514][1653645] Updated weights for policy 0, policy_version 126564 (0.0013) [2024-06-15 13:09:10,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 259260416. Throughput: 0: 10843.1. Samples: 64871424. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:09:10,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:09:14,217][1653645] Updated weights for policy 0, policy_version 126656 (0.0012) [2024-06-15 13:09:15,823][1653645] Updated weights for policy 0, policy_version 126713 (0.0027) [2024-06-15 13:09:15,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 259522560. Throughput: 0: 10854.4. Samples: 64930304. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:09:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:09:18,302][1653645] Updated weights for policy 0, policy_version 126754 (0.0012) [2024-06-15 13:09:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43691.4, 300 sec: 44431.2). Total num frames: 259653632. Throughput: 0: 10706.5. Samples: 64963072. 
Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:09:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:09:21,454][1653645] Updated weights for policy 0, policy_version 126785 (0.0011) [2024-06-15 13:09:22,564][1653645] Updated weights for policy 0, policy_version 126840 (0.0012) [2024-06-15 13:09:25,875][1653645] Updated weights for policy 0, policy_version 126898 (0.0015) [2024-06-15 13:09:25,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.7, 300 sec: 44320.1). Total num frames: 259883008. Throughput: 0: 11025.2. Samples: 65041408. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:09:25,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:09:27,551][1653645] Updated weights for policy 0, policy_version 126976 (0.0089) [2024-06-15 13:09:30,455][1653645] Updated weights for policy 0, policy_version 127039 (0.0013) [2024-06-15 13:09:30,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 260177920. Throughput: 0: 10752.0. Samples: 65094144. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:09:30,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:09:34,471][1653645] Updated weights for policy 0, policy_version 127088 (0.0014) [2024-06-15 13:09:35,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 260308992. Throughput: 0: 11047.8. Samples: 65137664. Policy #0 lag: (min: 11.0, avg: 103.1, max: 267.0) [2024-06-15 13:09:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:09:38,014][1653645] Updated weights for policy 0, policy_version 127152 (0.0163) [2024-06-15 13:09:39,675][1653645] Updated weights for policy 0, policy_version 127229 (0.0111) [2024-06-15 13:09:40,982][1648982] Fps is (10 sec: 39225.3, 60 sec: 43127.2, 300 sec: 44649.6). Total num frames: 260571136. Throughput: 0: 11041.8. Samples: 65196544. 
Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:09:40,983][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:09:41,804][1653645] Updated weights for policy 0, policy_version 127282 (0.0013) [2024-06-15 13:09:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44209.7). Total num frames: 260767744. Throughput: 0: 11002.4. Samples: 65268736. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:09:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:09:46,227][1653645] Updated weights for policy 0, policy_version 127344 (0.0014) [2024-06-15 13:09:50,528][1651596] Signal inference workers to stop experience collection... (6550 times) [2024-06-15 13:09:50,594][1653645] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-06-15 13:09:50,598][1653645] Updated weights for policy 0, policy_version 127412 (0.0014) [2024-06-15 13:09:50,755][1651596] Signal inference workers to resume experience collection... (6550 times) [2024-06-15 13:09:50,770][1653645] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-06-15 13:09:50,958][1648982] Fps is (10 sec: 39419.2, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 260964352. Throughput: 0: 11093.4. Samples: 65298432. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:09:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:09:53,337][1653645] Updated weights for policy 0, policy_version 127504 (0.0013) [2024-06-15 13:09:55,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 261226496. Throughput: 0: 10660.9. Samples: 65351168. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:09:55,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:09:58,603][1653645] Updated weights for policy 0, policy_version 127568 (0.0102) [2024-06-15 13:10:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.9, 300 sec: 43986.9). 
Total num frames: 261357568. Throughput: 0: 10922.7. Samples: 65421824. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:10:00,960][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:10:01,771][1653645] Updated weights for policy 0, policy_version 127639 (0.0014) [2024-06-15 13:10:03,256][1653645] Updated weights for policy 0, policy_version 127696 (0.0014) [2024-06-15 13:10:05,432][1653645] Updated weights for policy 0, policy_version 127760 (0.0042) [2024-06-15 13:10:05,959][1648982] Fps is (10 sec: 45869.9, 60 sec: 43143.8, 300 sec: 44208.9). Total num frames: 261685248. Throughput: 0: 10956.5. Samples: 65456128. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:10:05,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:10:09,898][1653645] Updated weights for policy 0, policy_version 127810 (0.0012) [2024-06-15 13:10:10,960][1648982] Fps is (10 sec: 49151.4, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 261849088. Throughput: 0: 10797.5. Samples: 65527296. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:10:10,961][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:10:12,422][1653645] Updated weights for policy 0, policy_version 127875 (0.0012) [2024-06-15 13:10:13,490][1653645] Updated weights for policy 0, policy_version 127933 (0.0011) [2024-06-15 13:10:15,566][1653645] Updated weights for policy 0, policy_version 127992 (0.0106) [2024-06-15 13:10:15,958][1648982] Fps is (10 sec: 45881.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 262144000. Throughput: 0: 11093.4. Samples: 65593344. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:10:15,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:10:17,488][1653645] Updated weights for policy 0, policy_version 128056 (0.0014) [2024-06-15 13:10:20,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 262275072. Throughput: 0: 10854.4. Samples: 65626112. 
Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:10:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:10:22,779][1653645] Updated weights for policy 0, policy_version 128112 (0.0012) [2024-06-15 13:10:23,717][1653645] Updated weights for policy 0, policy_version 128160 (0.0053) [2024-06-15 13:10:25,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 44236.5, 300 sec: 44432.4). Total num frames: 262537216. Throughput: 0: 11053.8. Samples: 65693696. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:10:25,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:10:26,978][1653645] Updated weights for policy 0, policy_version 128229 (0.0014) [2024-06-15 13:10:28,878][1653645] Updated weights for policy 0, policy_version 128289 (0.0013) [2024-06-15 13:10:30,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 262799360. Throughput: 0: 10922.7. Samples: 65760256. Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:10:30,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:10:33,916][1653645] Updated weights for policy 0, policy_version 128357 (0.0013) [2024-06-15 13:10:34,856][1653645] Updated weights for policy 0, policy_version 128408 (0.0071) [2024-06-15 13:10:35,033][1651596] Signal inference workers to stop experience collection... (6600 times) [2024-06-15 13:10:35,159][1653645] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-06-15 13:10:35,232][1651596] Signal inference workers to resume experience collection... (6600 times) [2024-06-15 13:10:35,233][1653645] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-06-15 13:10:35,462][1653645] Updated weights for policy 0, policy_version 128446 (0.0072) [2024-06-15 13:10:35,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 263061504. Throughput: 0: 11229.8. Samples: 65803776. 
Policy #0 lag: (min: 13.0, avg: 97.5, max: 269.0) [2024-06-15 13:10:35,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:10:38,734][1653645] Updated weights for policy 0, policy_version 128496 (0.0012) [2024-06-15 13:10:40,342][1653645] Updated weights for policy 0, policy_version 128560 (0.0013) [2024-06-15 13:10:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45894.1, 300 sec: 44764.4). Total num frames: 263323648. Throughput: 0: 11537.1. Samples: 65870336. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:10:40,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:10:45,355][1653645] Updated weights for policy 0, policy_version 128628 (0.0017) [2024-06-15 13:10:45,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 263487488. Throughput: 0: 11559.8. Samples: 65942016. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:10:45,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:10:46,393][1653645] Updated weights for policy 0, policy_version 128688 (0.0013) [2024-06-15 13:10:50,865][1653645] Updated weights for policy 0, policy_version 128768 (0.0014) [2024-06-15 13:10:50,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 263716864. Throughput: 0: 11617.1. Samples: 65978880. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:10:50,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:10:55,958][1648982] Fps is (10 sec: 36043.2, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 263847936. Throughput: 0: 11298.1. Samples: 66035712. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:10:55,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:10:56,313][1653645] Updated weights for policy 0, policy_version 128851 (0.0013) [2024-06-15 13:10:56,450][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000128864_263913472.pth... 
[2024-06-15 13:10:56,591][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000123648_253231104.pth [2024-06-15 13:10:56,597][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000128864_263913472.pth [2024-06-15 13:10:58,288][1653645] Updated weights for policy 0, policy_version 128947 (0.0014) [2024-06-15 13:11:00,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 264110080. Throughput: 0: 11525.7. Samples: 66112000. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:00,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:11:01,281][1653645] Updated weights for policy 0, policy_version 128976 (0.0012) [2024-06-15 13:11:03,350][1653645] Updated weights for policy 0, policy_version 129072 (0.0012) [2024-06-15 13:11:05,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 44784.0, 300 sec: 44653.4). Total num frames: 264372224. Throughput: 0: 11355.0. Samples: 66137088. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:05,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:11:08,588][1653645] Updated weights for policy 0, policy_version 129120 (0.0036) [2024-06-15 13:11:10,486][1653645] Updated weights for policy 0, policy_version 129189 (0.0013) [2024-06-15 13:11:10,958][1648982] Fps is (10 sec: 49150.2, 60 sec: 45875.0, 300 sec: 44320.1). Total num frames: 264601600. Throughput: 0: 11457.4. Samples: 66209280. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:10,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:11:13,217][1653645] Updated weights for policy 0, policy_version 129217 (0.0012) [2024-06-15 13:11:14,932][1653645] Updated weights for policy 0, policy_version 129284 (0.0013) [2024-06-15 13:11:15,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 264830976. Throughput: 0: 11309.5. Samples: 66269184. 
Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:11:16,316][1653645] Updated weights for policy 0, policy_version 129342 (0.0116) [2024-06-15 13:11:20,928][1651596] Signal inference workers to stop experience collection... (6650 times) [2024-06-15 13:11:20,958][1648982] Fps is (10 sec: 32768.6, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 264929280. Throughput: 0: 11150.2. Samples: 66305536. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:20,960][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:11:21,011][1653645] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-06-15 13:11:21,264][1651596] Signal inference workers to resume experience collection... (6650 times) [2024-06-15 13:11:21,266][1653645] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-06-15 13:11:21,728][1653645] Updated weights for policy 0, policy_version 129392 (0.0011) [2024-06-15 13:11:23,265][1653645] Updated weights for policy 0, policy_version 129456 (0.0023) [2024-06-15 13:11:25,840][1653645] Updated weights for policy 0, policy_version 129504 (0.0012) [2024-06-15 13:11:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44783.2, 300 sec: 44542.3). Total num frames: 265224192. Throughput: 0: 11116.1. Samples: 66370560. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:25,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 13:11:27,566][1653645] Updated weights for policy 0, policy_version 129568 (0.0013) [2024-06-15 13:11:30,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 265420800. Throughput: 0: 10820.2. Samples: 66428928. 
Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:30,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:11:33,445][1653645] Updated weights for policy 0, policy_version 129632 (0.0012) [2024-06-15 13:11:34,994][1653645] Updated weights for policy 0, policy_version 129700 (0.0012) [2024-06-15 13:11:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 265682944. Throughput: 0: 10911.3. Samples: 66469888. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:35,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:11:38,098][1653645] Updated weights for policy 0, policy_version 129760 (0.0013) [2024-06-15 13:11:39,572][1653645] Updated weights for policy 0, policy_version 129824 (0.0012) [2024-06-15 13:11:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 265945088. Throughput: 0: 10934.1. Samples: 66527744. Policy #0 lag: (min: 58.0, avg: 162.1, max: 266.0) [2024-06-15 13:11:40,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:11:44,759][1653645] Updated weights for policy 0, policy_version 129872 (0.0021) [2024-06-15 13:11:45,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 43986.8). Total num frames: 266076160. Throughput: 0: 11002.3. Samples: 66607104. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:11:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:11:46,031][1653645] Updated weights for policy 0, policy_version 129921 (0.0012) [2024-06-15 13:11:47,578][1653645] Updated weights for policy 0, policy_version 129984 (0.0015) [2024-06-15 13:11:50,318][1653645] Updated weights for policy 0, policy_version 130080 (0.0014) [2024-06-15 13:11:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 266469376. Throughput: 0: 11093.3. Samples: 66636288. 
Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:11:50,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:11:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 266469376. Throughput: 0: 10877.2. Samples: 66698752. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:11:55,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 13:11:57,105][1653645] Updated weights for policy 0, policy_version 130160 (0.0015) [2024-06-15 13:11:58,299][1653645] Updated weights for policy 0, policy_version 130195 (0.0010) [2024-06-15 13:12:00,958][1648982] Fps is (10 sec: 29490.8, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 266764288. Throughput: 0: 10990.9. Samples: 66763776. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:12:01,382][1651596] Signal inference workers to stop experience collection... (6700 times) [2024-06-15 13:12:01,436][1653645] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-06-15 13:12:01,566][1651596] Signal inference workers to resume experience collection... (6700 times) [2024-06-15 13:12:01,567][1653645] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-06-15 13:12:01,569][1653645] Updated weights for policy 0, policy_version 130288 (0.0014) [2024-06-15 13:12:03,219][1653645] Updated weights for policy 0, policy_version 130365 (0.0015) [2024-06-15 13:12:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.6, 300 sec: 44209.1). Total num frames: 266993664. Throughput: 0: 10717.9. Samples: 66787840. 
Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:12:09,237][1653645] Updated weights for policy 0, policy_version 130422 (0.0012) [2024-06-15 13:12:10,396][1653645] Updated weights for policy 0, policy_version 130464 (0.0013) [2024-06-15 13:12:10,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 267223040. Throughput: 0: 11002.3. Samples: 66865664. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:10,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:12:13,276][1653645] Updated weights for policy 0, policy_version 130544 (0.0125) [2024-06-15 13:12:14,624][1653645] Updated weights for policy 0, policy_version 130578 (0.0013) [2024-06-15 13:12:15,457][1653645] Updated weights for policy 0, policy_version 130621 (0.0015) [2024-06-15 13:12:15,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 44782.7, 300 sec: 44653.3). Total num frames: 267517952. Throughput: 0: 10979.5. Samples: 66923008. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:15,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:12:20,642][1653645] Updated weights for policy 0, policy_version 130683 (0.0020) [2024-06-15 13:12:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.2, 300 sec: 44320.1). Total num frames: 267649024. Throughput: 0: 11070.6. Samples: 66968064. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:20,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:12:22,277][1653645] Updated weights for policy 0, policy_version 130747 (0.0015) [2024-06-15 13:12:24,635][1653645] Updated weights for policy 0, policy_version 130811 (0.0013) [2024-06-15 13:12:25,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 45329.0, 300 sec: 44542.3). Total num frames: 267943936. Throughput: 0: 11207.1. Samples: 67032064. 
Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:25,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:12:26,700][1653645] Updated weights for policy 0, policy_version 130864 (0.0012) [2024-06-15 13:12:30,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 268075008. Throughput: 0: 11229.8. Samples: 67112448. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:12:31,045][1653645] Updated weights for policy 0, policy_version 130898 (0.0041) [2024-06-15 13:12:32,290][1653645] Updated weights for policy 0, policy_version 130960 (0.0014) [2024-06-15 13:12:33,160][1653645] Updated weights for policy 0, policy_version 131002 (0.0014) [2024-06-15 13:12:35,399][1653645] Updated weights for policy 0, policy_version 131041 (0.0016) [2024-06-15 13:12:35,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45329.0, 300 sec: 44542.2). Total num frames: 268402688. Throughput: 0: 11286.8. Samples: 67144192. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:35,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:12:37,292][1653645] Updated weights for policy 0, policy_version 131123 (0.0014) [2024-06-15 13:12:40,974][1648982] Fps is (10 sec: 49074.5, 60 sec: 43679.1, 300 sec: 44428.8). Total num frames: 268566528. Throughput: 0: 11362.4. Samples: 67210240. Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:40,975][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:12:42,273][1653645] Updated weights for policy 0, policy_version 131152 (0.0066) [2024-06-15 13:12:44,478][1653645] Updated weights for policy 0, policy_version 131248 (0.0014) [2024-06-15 13:12:45,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 45875.1, 300 sec: 44431.1). Total num frames: 268828672. Throughput: 0: 11434.7. Samples: 67278336. 
Policy #0 lag: (min: 12.0, avg: 85.6, max: 268.0) [2024-06-15 13:12:45,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:12:46,723][1651596] Signal inference workers to stop experience collection... (6750 times) [2024-06-15 13:12:46,793][1653645] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-06-15 13:12:46,968][1651596] Signal inference workers to resume experience collection... (6750 times) [2024-06-15 13:12:46,969][1653645] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-06-15 13:12:47,591][1653645] Updated weights for policy 0, policy_version 131316 (0.0017) [2024-06-15 13:12:49,401][1653645] Updated weights for policy 0, policy_version 131385 (0.0026) [2024-06-15 13:12:50,958][1648982] Fps is (10 sec: 52512.6, 60 sec: 43690.7, 300 sec: 44432.5). Total num frames: 269090816. Throughput: 0: 11491.6. Samples: 67304960. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:12:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:12:55,244][1653645] Updated weights for policy 0, policy_version 131440 (0.0015) [2024-06-15 13:12:55,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 46421.0, 300 sec: 44320.0). Total num frames: 269254656. Throughput: 0: 11468.7. Samples: 67381760. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:12:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:12:56,432][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000131504_269320192.pth... [2024-06-15 13:12:56,491][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000126272_258605056.pth [2024-06-15 13:12:56,536][1653645] Updated weights for policy 0, policy_version 131505 (0.0013) [2024-06-15 13:12:59,139][1653645] Updated weights for policy 0, policy_version 131552 (0.0012) [2024-06-15 13:13:00,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 269516800. 
Throughput: 0: 11468.9. Samples: 67439104. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:00,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:13:01,728][1653645] Updated weights for policy 0, policy_version 131648 (0.0015) [2024-06-15 13:13:05,958][1648982] Fps is (10 sec: 36046.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 269615104. Throughput: 0: 11116.1. Samples: 67468288. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:13:07,746][1653645] Updated weights for policy 0, policy_version 131719 (0.0013) [2024-06-15 13:13:10,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 269877248. Throughput: 0: 11207.1. Samples: 67536384. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:10,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:13:11,138][1653645] Updated weights for policy 0, policy_version 131794 (0.0030) [2024-06-15 13:13:12,501][1653645] Updated weights for policy 0, policy_version 131856 (0.0011) [2024-06-15 13:13:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.9, 300 sec: 44431.3). Total num frames: 270139392. Throughput: 0: 10854.4. Samples: 67600896. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:15,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:13:18,290][1653645] Updated weights for policy 0, policy_version 131910 (0.0015) [2024-06-15 13:13:19,448][1653645] Updated weights for policy 0, policy_version 131968 (0.0012) [2024-06-15 13:13:20,959][1648982] Fps is (10 sec: 49147.1, 60 sec: 45328.2, 300 sec: 44320.0). Total num frames: 270368768. Throughput: 0: 10990.7. Samples: 67638784. 
Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:20,962][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:13:21,166][1653645] Updated weights for policy 0, policy_version 132026 (0.0012) [2024-06-15 13:13:23,324][1653645] Updated weights for policy 0, policy_version 132065 (0.0012) [2024-06-15 13:13:24,630][1653645] Updated weights for policy 0, policy_version 132117 (0.0011) [2024-06-15 13:13:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 270663680. Throughput: 0: 10937.9. Samples: 67702272. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:13:30,958][1648982] Fps is (10 sec: 29494.2, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 270663680. Throughput: 0: 10956.8. Samples: 67771392. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:30,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:13:31,304][1653645] Updated weights for policy 0, policy_version 132181 (0.0014) [2024-06-15 13:13:32,006][1651596] Signal inference workers to stop experience collection... (6800 times) [2024-06-15 13:13:32,065][1653645] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-06-15 13:13:32,272][1651596] Signal inference workers to resume experience collection... (6800 times) [2024-06-15 13:13:32,272][1653645] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-06-15 13:13:32,670][1653645] Updated weights for policy 0, policy_version 132240 (0.0026) [2024-06-15 13:13:33,744][1653645] Updated weights for policy 0, policy_version 132282 (0.0011) [2024-06-15 13:13:35,195][1653645] Updated weights for policy 0, policy_version 132323 (0.0017) [2024-06-15 13:13:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 44320.2). Total num frames: 271056896. Throughput: 0: 11116.1. Samples: 67805184. 
Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:35,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:13:37,219][1653645] Updated weights for policy 0, policy_version 132414 (0.0013) [2024-06-15 13:13:40,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43702.3, 300 sec: 44431.2). Total num frames: 271187968. Throughput: 0: 10774.9. Samples: 67866624. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:40,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:13:43,330][1653645] Updated weights for policy 0, policy_version 132464 (0.0012) [2024-06-15 13:13:45,265][1653645] Updated weights for policy 0, policy_version 132532 (0.0013) [2024-06-15 13:13:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 271450112. Throughput: 0: 10990.9. Samples: 67933696. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:13:47,288][1653645] Updated weights for policy 0, policy_version 132576 (0.0013) [2024-06-15 13:13:49,546][1653645] Updated weights for policy 0, policy_version 132672 (0.0018) [2024-06-15 13:13:50,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 271712256. Throughput: 0: 11013.7. Samples: 67963904. Policy #0 lag: (min: 47.0, avg: 164.5, max: 303.0) [2024-06-15 13:13:50,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:13:55,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42598.8, 300 sec: 44320.1). Total num frames: 271810560. Throughput: 0: 11059.2. Samples: 68034048. 
Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:13:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:13:56,800][1653645] Updated weights for policy 0, policy_version 132768 (0.0013) [2024-06-15 13:13:57,480][1653645] Updated weights for policy 0, policy_version 132798 (0.0011) [2024-06-15 13:13:59,840][1653645] Updated weights for policy 0, policy_version 132864 (0.0040) [2024-06-15 13:14:00,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 272171008. Throughput: 0: 10945.4. Samples: 68093440. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:14:01,180][1653645] Updated weights for policy 0, policy_version 132922 (0.0013) [2024-06-15 13:14:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 272236544. Throughput: 0: 10911.5. Samples: 68129792. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:05,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:14:07,545][1653645] Updated weights for policy 0, policy_version 132992 (0.0012) [2024-06-15 13:14:09,078][1653645] Updated weights for policy 0, policy_version 133045 (0.0012) [2024-06-15 13:14:10,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 272498688. Throughput: 0: 10911.3. Samples: 68193280. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:14:11,732][1653645] Updated weights for policy 0, policy_version 133097 (0.0013) [2024-06-15 13:14:13,153][1653645] Updated weights for policy 0, policy_version 133152 (0.0013) [2024-06-15 13:14:13,258][1651596] Signal inference workers to stop experience collection... 
(6850 times) [2024-06-15 13:14:13,295][1653645] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-06-15 13:14:13,440][1651596] Signal inference workers to resume experience collection... (6850 times) [2024-06-15 13:14:13,440][1653645] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-06-15 13:14:15,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 272760832. Throughput: 0: 10831.6. Samples: 68258816. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:15,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:14:18,374][1653645] Updated weights for policy 0, policy_version 133216 (0.0014) [2024-06-15 13:14:20,143][1653645] Updated weights for policy 0, policy_version 133280 (0.0015) [2024-06-15 13:14:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 44237.6, 300 sec: 44542.3). Total num frames: 273022976. Throughput: 0: 10865.8. Samples: 68294144. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:14:23,792][1653645] Updated weights for policy 0, policy_version 133348 (0.0035) [2024-06-15 13:14:25,958][1648982] Fps is (10 sec: 45876.7, 60 sec: 42598.4, 300 sec: 44209.1). Total num frames: 273219584. Throughput: 0: 10843.0. Samples: 68354560. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:25,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:14:26,061][1653645] Updated weights for policy 0, policy_version 133424 (0.0013) [2024-06-15 13:14:30,958][1648982] Fps is (10 sec: 29490.8, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 273317888. Throughput: 0: 10865.8. Samples: 68422656. 
Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:30,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:14:31,171][1653645] Updated weights for policy 0, policy_version 133473 (0.0011) [2024-06-15 13:14:32,306][1653645] Updated weights for policy 0, policy_version 133520 (0.0013) [2024-06-15 13:14:35,316][1653645] Updated weights for policy 0, policy_version 133574 (0.0106) [2024-06-15 13:14:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 44212.7). Total num frames: 273612800. Throughput: 0: 10706.5. Samples: 68445696. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:14:38,089][1653645] Updated weights for policy 0, policy_version 133648 (0.0013) [2024-06-15 13:14:39,105][1653645] Updated weights for policy 0, policy_version 133693 (0.0012) [2024-06-15 13:14:40,958][1648982] Fps is (10 sec: 49150.7, 60 sec: 43690.4, 300 sec: 44209.0). Total num frames: 273809408. Throughput: 0: 10615.4. Samples: 68511744. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:14:43,492][1653645] Updated weights for policy 0, policy_version 133755 (0.0010) [2024-06-15 13:14:44,888][1653645] Updated weights for policy 0, policy_version 133808 (0.0015) [2024-06-15 13:14:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 274071552. Throughput: 0: 10786.1. Samples: 68578816. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:14:47,475][1653645] Updated weights for policy 0, policy_version 133840 (0.0016) [2024-06-15 13:14:48,597][1653645] Updated weights for policy 0, policy_version 133888 (0.0011) [2024-06-15 13:14:50,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 274268160. Throughput: 0: 10683.7. 
Samples: 68610560. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:50,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:14:51,346][1653645] Updated weights for policy 0, policy_version 133952 (0.0016) [2024-06-15 13:14:55,031][1653645] Updated weights for policy 0, policy_version 134011 (0.0038) [2024-06-15 13:14:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 44542.2). Total num frames: 274497536. Throughput: 0: 10808.9. Samples: 68679680. Policy #0 lag: (min: 2.0, avg: 78.6, max: 258.0) [2024-06-15 13:14:55,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:14:56,411][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000134064_274563072.pth... [2024-06-15 13:14:56,457][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000128864_263913472.pth [2024-06-15 13:14:56,651][1653645] Updated weights for policy 0, policy_version 134073 (0.0012) [2024-06-15 13:15:00,158][1653645] Updated weights for policy 0, policy_version 134128 (0.0132) [2024-06-15 13:15:00,958][1648982] Fps is (10 sec: 45876.7, 60 sec: 42598.5, 300 sec: 44209.2). Total num frames: 274726912. Throughput: 0: 10740.7. Samples: 68742144. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:00,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:15:02,154][1653645] Updated weights for policy 0, policy_version 134176 (0.0012) [2024-06-15 13:15:02,230][1651596] Signal inference workers to stop experience collection... (6900 times) [2024-06-15 13:15:02,269][1653645] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-06-15 13:15:02,534][1651596] Signal inference workers to resume experience collection... (6900 times) [2024-06-15 13:15:02,535][1653645] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-06-15 13:15:05,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.7, 300 sec: 44098.0). 
Total num frames: 274857984. Throughput: 0: 10729.2. Samples: 68776960. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:05,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:15:06,166][1653645] Updated weights for policy 0, policy_version 134224 (0.0016) [2024-06-15 13:15:08,512][1653645] Updated weights for policy 0, policy_version 134307 (0.0108) [2024-06-15 13:15:10,964][1648982] Fps is (10 sec: 39296.1, 60 sec: 43686.1, 300 sec: 43985.9). Total num frames: 275120128. Throughput: 0: 10807.3. Samples: 68840960. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:10,965][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:15:11,755][1653645] Updated weights for policy 0, policy_version 134356 (0.0011) [2024-06-15 13:15:12,712][1653645] Updated weights for policy 0, policy_version 134399 (0.0018) [2024-06-15 13:15:14,966][1653645] Updated weights for policy 0, policy_version 134462 (0.0014) [2024-06-15 13:15:15,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 275382272. Throughput: 0: 10831.7. Samples: 68910080. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:15:18,228][1653645] Updated weights for policy 0, policy_version 134516 (0.0102) [2024-06-15 13:15:20,041][1653645] Updated weights for policy 0, policy_version 134561 (0.0011) [2024-06-15 13:15:20,958][1648982] Fps is (10 sec: 52461.3, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 275644416. Throughput: 0: 11104.7. Samples: 68945408. 
Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:15:22,907][1653645] Updated weights for policy 0, policy_version 134594 (0.0019) [2024-06-15 13:15:24,220][1653645] Updated weights for policy 0, policy_version 134650 (0.0011) [2024-06-15 13:15:25,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 275841024. Throughput: 0: 11230.0. Samples: 69017088. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:25,958][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 13:15:28,728][1653645] Updated weights for policy 0, policy_version 134726 (0.0100) [2024-06-15 13:15:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 276037632. Throughput: 0: 11172.9. Samples: 69081600. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:30,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:15:31,209][1653645] Updated weights for policy 0, policy_version 134803 (0.0013) [2024-06-15 13:15:32,189][1653645] Updated weights for policy 0, policy_version 134848 (0.0011) [2024-06-15 13:15:35,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 276299776. Throughput: 0: 11195.8. Samples: 69114368. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:15:37,231][1653645] Updated weights for policy 0, policy_version 134928 (0.0014) [2024-06-15 13:15:40,746][1653645] Updated weights for policy 0, policy_version 135008 (0.0045) [2024-06-15 13:15:40,978][1648982] Fps is (10 sec: 45782.1, 60 sec: 44767.8, 300 sec: 44094.9). Total num frames: 276496384. Throughput: 0: 11179.3. Samples: 69182976. 
Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:40,979][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:15:43,551][1653645] Updated weights for policy 0, policy_version 135056 (0.0013) [2024-06-15 13:15:44,488][1653645] Updated weights for policy 0, policy_version 135102 (0.0113) [2024-06-15 13:15:45,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 276692992. Throughput: 0: 11184.3. Samples: 69245440. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:45,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:15:47,283][1653645] Updated weights for policy 0, policy_version 135164 (0.0032) [2024-06-15 13:15:50,102][1653645] Updated weights for policy 0, policy_version 135232 (0.0091) [2024-06-15 13:15:50,958][1648982] Fps is (10 sec: 45969.6, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 276955136. Throughput: 0: 11218.5. Samples: 69281792. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:50,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 13:15:51,929][1651596] Signal inference workers to stop experience collection... (6950 times) [2024-06-15 13:15:51,988][1653645] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-06-15 13:15:52,191][1651596] Signal inference workers to resume experience collection... (6950 times) [2024-06-15 13:15:52,192][1653645] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-06-15 13:15:52,871][1653645] Updated weights for policy 0, policy_version 135292 (0.0012) [2024-06-15 13:15:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 277151744. Throughput: 0: 11288.3. Samples: 69348864. 
Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:15:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:15:58,257][1653645] Updated weights for policy 0, policy_version 135376 (0.0012) [2024-06-15 13:16:00,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 44236.5, 300 sec: 44097.9). Total num frames: 277381120. Throughput: 0: 11263.9. Samples: 69416960. Policy #0 lag: (min: 15.0, avg: 124.7, max: 271.0) [2024-06-15 13:16:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:16:00,979][1653645] Updated weights for policy 0, policy_version 135444 (0.0012) [2024-06-15 13:16:03,226][1653645] Updated weights for policy 0, policy_version 135490 (0.0077) [2024-06-15 13:16:04,562][1653645] Updated weights for policy 0, policy_version 135545 (0.0017) [2024-06-15 13:16:05,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 44098.0). Total num frames: 277610496. Throughput: 0: 11241.3. Samples: 69451264. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:16:07,940][1653645] Updated weights for policy 0, policy_version 135584 (0.0017) [2024-06-15 13:16:09,835][1653645] Updated weights for policy 0, policy_version 135618 (0.0013) [2024-06-15 13:16:10,958][1648982] Fps is (10 sec: 45876.7, 60 sec: 45333.9, 300 sec: 44097.9). Total num frames: 277839872. Throughput: 0: 11195.7. Samples: 69520896. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:10,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:16:11,237][1653645] Updated weights for policy 0, policy_version 135680 (0.0032) [2024-06-15 13:16:15,301][1653645] Updated weights for policy 0, policy_version 135776 (0.0027) [2024-06-15 13:16:15,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 45328.9, 300 sec: 44653.3). Total num frames: 278102016. Throughput: 0: 11036.5. Samples: 69578240. 
Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:15,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:16:19,959][1653645] Updated weights for policy 0, policy_version 135829 (0.0013) [2024-06-15 13:16:20,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.9, 300 sec: 44209.0). Total num frames: 278265856. Throughput: 0: 11059.2. Samples: 69612032. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:16:22,098][1653645] Updated weights for policy 0, policy_version 135888 (0.0011) [2024-06-15 13:16:23,700][1653645] Updated weights for policy 0, policy_version 135952 (0.0011) [2024-06-15 13:16:24,843][1653645] Updated weights for policy 0, policy_version 136000 (0.0014) [2024-06-15 13:16:25,959][1648982] Fps is (10 sec: 42592.8, 60 sec: 44781.8, 300 sec: 44431.0). Total num frames: 278528000. Throughput: 0: 11041.1. Samples: 69679616. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:25,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:16:30,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 278659072. Throughput: 0: 11116.1. Samples: 69745664. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:16:31,846][1653645] Updated weights for policy 0, policy_version 136066 (0.0013) [2024-06-15 13:16:32,864][1653645] Updated weights for policy 0, policy_version 136120 (0.0009) [2024-06-15 13:16:34,211][1653645] Updated weights for policy 0, policy_version 136167 (0.0012) [2024-06-15 13:16:35,736][1653645] Updated weights for policy 0, policy_version 136225 (0.0011) [2024-06-15 13:16:35,958][1648982] Fps is (10 sec: 45880.9, 60 sec: 44782.7, 300 sec: 44209.0). Total num frames: 278986752. Throughput: 0: 11252.6. Samples: 69788160. 
Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:35,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:16:38,222][1651596] Signal inference workers to stop experience collection... (7000 times) [2024-06-15 13:16:38,357][1653645] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-06-15 13:16:38,433][1651596] Signal inference workers to resume experience collection... (7000 times) [2024-06-15 13:16:38,434][1653645] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-06-15 13:16:38,604][1653645] Updated weights for policy 0, policy_version 136275 (0.0014) [2024-06-15 13:16:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44798.2, 300 sec: 44431.2). Total num frames: 279183360. Throughput: 0: 11082.0. Samples: 69847552. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:16:44,322][1653645] Updated weights for policy 0, policy_version 136352 (0.0012) [2024-06-15 13:16:45,750][1653645] Updated weights for policy 0, policy_version 136416 (0.0012) [2024-06-15 13:16:45,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 279379968. Throughput: 0: 11138.9. Samples: 69918208. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:16:47,231][1653645] Updated weights for policy 0, policy_version 136468 (0.0013) [2024-06-15 13:16:50,569][1653645] Updated weights for policy 0, policy_version 136528 (0.0014) [2024-06-15 13:16:50,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 279642112. Throughput: 0: 11002.3. Samples: 69946368. 
Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:16:54,973][1653645] Updated weights for policy 0, policy_version 136580 (0.0014) [2024-06-15 13:16:55,963][1648982] Fps is (10 sec: 42575.1, 60 sec: 44232.8, 300 sec: 44208.2). Total num frames: 279805952. Throughput: 0: 11171.6. Samples: 70023680. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:16:55,964][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 13:16:56,072][1653645] Updated weights for policy 0, policy_version 136638 (0.0012) [2024-06-15 13:16:56,109][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000136640_279838720.pth... [2024-06-15 13:16:56,249][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000131504_269320192.pth [2024-06-15 13:16:57,619][1653645] Updated weights for policy 0, policy_version 136703 (0.0075) [2024-06-15 13:16:59,449][1653645] Updated weights for policy 0, policy_version 136759 (0.0076) [2024-06-15 13:17:00,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45329.3, 300 sec: 44431.2). Total num frames: 280100864. Throughput: 0: 11298.2. Samples: 70086656. Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:17:00,958][1648982] Avg episode reward: [(0, '37.000')] [2024-06-15 13:17:02,862][1653645] Updated weights for policy 0, policy_version 136805 (0.0012) [2024-06-15 13:17:05,958][1648982] Fps is (10 sec: 42621.6, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 280231936. Throughput: 0: 11309.5. Samples: 70120960. 
Policy #0 lag: (min: 5.0, avg: 124.6, max: 261.0) [2024-06-15 13:17:05,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:17:06,654][1653645] Updated weights for policy 0, policy_version 136833 (0.0015) [2024-06-15 13:17:08,340][1653645] Updated weights for policy 0, policy_version 136904 (0.0014) [2024-06-15 13:17:09,537][1653645] Updated weights for policy 0, policy_version 136954 (0.0011) [2024-06-15 13:17:10,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 45874.9, 300 sec: 44320.1). Total num frames: 280592384. Throughput: 0: 11344.0. Samples: 70190080. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:10,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:17:11,076][1653645] Updated weights for policy 0, policy_version 137021 (0.0011) [2024-06-15 13:17:14,970][1653645] Updated weights for policy 0, policy_version 137088 (0.0015) [2024-06-15 13:17:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44237.0, 300 sec: 44431.2). Total num frames: 280756224. Throughput: 0: 11264.0. Samples: 70252544. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:17:20,217][1653645] Updated weights for policy 0, policy_version 137168 (0.0202) [2024-06-15 13:17:20,701][1651596] Signal inference workers to stop experience collection... (7050 times) [2024-06-15 13:17:20,745][1653645] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-06-15 13:17:20,958][1648982] Fps is (10 sec: 36046.1, 60 sec: 44782.9, 300 sec: 44098.0). Total num frames: 280952832. Throughput: 0: 11264.1. Samples: 70295040. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:17:20,977][1651596] Signal inference workers to resume experience collection... 
(7050 times) [2024-06-15 13:17:20,977][1653645] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-06-15 13:17:21,267][1653645] Updated weights for policy 0, policy_version 137216 (0.0011) [2024-06-15 13:17:25,447][1653645] Updated weights for policy 0, policy_version 137303 (0.0021) [2024-06-15 13:17:25,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 44784.0, 300 sec: 44542.3). Total num frames: 281214976. Throughput: 0: 11252.6. Samples: 70353920. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:17:30,679][1653645] Updated weights for policy 0, policy_version 137345 (0.0144) [2024-06-15 13:17:30,960][1648982] Fps is (10 sec: 36036.6, 60 sec: 44235.1, 300 sec: 43764.4). Total num frames: 281313280. Throughput: 0: 11297.6. Samples: 70426624. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:30,963][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:17:32,497][1653645] Updated weights for policy 0, policy_version 137411 (0.0114) [2024-06-15 13:17:34,579][1653645] Updated weights for policy 0, policy_version 137488 (0.0145) [2024-06-15 13:17:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44783.1, 300 sec: 44433.6). Total num frames: 281673728. Throughput: 0: 11207.1. Samples: 70450688. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:35,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:17:37,615][1653645] Updated weights for policy 0, policy_version 137552 (0.0014) [2024-06-15 13:17:40,958][1648982] Fps is (10 sec: 49163.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 281804800. Throughput: 0: 10833.0. Samples: 70511104. 
Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:17:43,013][1653645] Updated weights for policy 0, policy_version 137616 (0.0013) [2024-06-15 13:17:45,275][1653645] Updated weights for policy 0, policy_version 137724 (0.0095) [2024-06-15 13:17:45,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 44782.7, 300 sec: 43986.8). Total num frames: 282066944. Throughput: 0: 10899.8. Samples: 70577152. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:45,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:17:47,632][1653645] Updated weights for policy 0, policy_version 137776 (0.0019) [2024-06-15 13:17:49,522][1653645] Updated weights for policy 0, policy_version 137827 (0.0014) [2024-06-15 13:17:50,177][1653645] Updated weights for policy 0, policy_version 137854 (0.0012) [2024-06-15 13:17:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44783.0, 300 sec: 44320.2). Total num frames: 282329088. Throughput: 0: 10956.8. Samples: 70614016. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:17:55,410][1653645] Updated weights for policy 0, policy_version 137914 (0.0094) [2024-06-15 13:17:55,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44786.8, 300 sec: 43986.8). Total num frames: 282492928. Throughput: 0: 11104.7. Samples: 70689792. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:17:55,959][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 13:17:56,895][1653645] Updated weights for policy 0, policy_version 137973 (0.0027) [2024-06-15 13:17:59,236][1653645] Updated weights for policy 0, policy_version 138041 (0.0019) [2024-06-15 13:18:00,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 282820608. Throughput: 0: 11047.8. Samples: 70749696. 
Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:18:00,958][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 13:18:01,119][1653645] Updated weights for policy 0, policy_version 138108 (0.0013) [2024-06-15 13:18:05,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 282886144. Throughput: 0: 10854.4. Samples: 70783488. Policy #0 lag: (min: 15.0, avg: 97.1, max: 271.0) [2024-06-15 13:18:05,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:18:06,780][1653645] Updated weights for policy 0, policy_version 138171 (0.0012) [2024-06-15 13:18:08,046][1651596] Signal inference workers to stop experience collection... (7100 times) [2024-06-15 13:18:08,092][1653645] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-06-15 13:18:08,228][1651596] Signal inference workers to resume experience collection... (7100 times) [2024-06-15 13:18:08,230][1653645] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-06-15 13:18:08,232][1653645] Updated weights for policy 0, policy_version 138208 (0.0013) [2024-06-15 13:18:09,918][1653645] Updated weights for policy 0, policy_version 138246 (0.0012) [2024-06-15 13:18:10,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 283213824. Throughput: 0: 11138.8. Samples: 70855168. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:18:11,267][1653645] Updated weights for policy 0, policy_version 138299 (0.0014) [2024-06-15 13:18:13,272][1653645] Updated weights for policy 0, policy_version 138363 (0.0015) [2024-06-15 13:18:15,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.7, 300 sec: 44098.1). Total num frames: 283377664. Throughput: 0: 10923.2. Samples: 70918144. 
Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:15,960][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 13:18:19,897][1653645] Updated weights for policy 0, policy_version 138448 (0.0088) [2024-06-15 13:18:20,923][1653645] Updated weights for policy 0, policy_version 138496 (0.0020) [2024-06-15 13:18:20,958][1648982] Fps is (10 sec: 42599.7, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 283639808. Throughput: 0: 11138.9. Samples: 70951936. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:18:23,477][1653645] Updated weights for policy 0, policy_version 138561 (0.0021) [2024-06-15 13:18:24,876][1653645] Updated weights for policy 0, policy_version 138622 (0.0012) [2024-06-15 13:18:25,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 283901952. Throughput: 0: 11116.1. Samples: 71011328. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:18:30,154][1653645] Updated weights for policy 0, policy_version 138680 (0.0025) [2024-06-15 13:18:30,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 45330.6, 300 sec: 43986.8). Total num frames: 284033024. Throughput: 0: 11332.3. Samples: 71087104. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:30,959][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 13:18:32,335][1653645] Updated weights for policy 0, policy_version 138723 (0.0014) [2024-06-15 13:18:34,222][1653645] Updated weights for policy 0, policy_version 138760 (0.0012) [2024-06-15 13:18:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 284295168. Throughput: 0: 11252.6. Samples: 71120384. 
Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:35,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 13:18:36,160][1653645] Updated weights for policy 0, policy_version 138834 (0.0013) [2024-06-15 13:18:40,444][1653645] Updated weights for policy 0, policy_version 138881 (0.0013) [2024-06-15 13:18:40,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 284491776. Throughput: 0: 11047.8. Samples: 71186944. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:40,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:18:41,489][1653645] Updated weights for policy 0, policy_version 138938 (0.0014) [2024-06-15 13:18:43,100][1653645] Updated weights for policy 0, policy_version 138963 (0.0012) [2024-06-15 13:18:44,040][1653645] Updated weights for policy 0, policy_version 139007 (0.0012) [2024-06-15 13:18:45,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44237.0, 300 sec: 44097.9). Total num frames: 284721152. Throughput: 0: 11389.1. Samples: 71262208. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:45,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:18:46,506][1653645] Updated weights for policy 0, policy_version 139042 (0.0017) [2024-06-15 13:18:48,292][1653645] Updated weights for policy 0, policy_version 139110 (0.0014) [2024-06-15 13:18:50,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 284950528. Throughput: 0: 11173.0. Samples: 71286272. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:18:51,960][1653645] Updated weights for policy 0, policy_version 139153 (0.0013) [2024-06-15 13:18:53,663][1651596] Signal inference workers to stop experience collection... 
(7150 times) [2024-06-15 13:18:53,761][1653645] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-06-15 13:18:53,800][1653645] Updated weights for policy 0, policy_version 139206 (0.0019) [2024-06-15 13:18:53,987][1651596] Signal inference workers to resume experience collection... (7150 times) [2024-06-15 13:18:53,989][1653645] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-06-15 13:18:55,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45329.2, 300 sec: 44209.0). Total num frames: 285212672. Throughput: 0: 11309.5. Samples: 71364096. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:18:55,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:18:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000139264_285212672.pth... [2024-06-15 13:18:56,011][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000134064_274563072.pth [2024-06-15 13:18:56,931][1653645] Updated weights for policy 0, policy_version 139265 (0.0010) [2024-06-15 13:18:58,133][1653645] Updated weights for policy 0, policy_version 139317 (0.0014) [2024-06-15 13:18:59,838][1653645] Updated weights for policy 0, policy_version 139360 (0.0013) [2024-06-15 13:19:00,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 44236.6, 300 sec: 44875.5). Total num frames: 285474816. Throughput: 0: 11184.3. Samples: 71421440. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:19:00,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:19:04,126][1653645] Updated weights for policy 0, policy_version 139440 (0.0026) [2024-06-15 13:19:05,742][1653645] Updated weights for policy 0, policy_version 139472 (0.0027) [2024-06-15 13:19:05,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 285638656. Throughput: 0: 11343.6. Samples: 71462400. 
Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:19:05,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 13:19:06,933][1653645] Updated weights for policy 0, policy_version 139514 (0.0013) [2024-06-15 13:19:08,618][1653645] Updated weights for policy 0, policy_version 139552 (0.0013) [2024-06-15 13:19:10,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 285900800. Throughput: 0: 11502.9. Samples: 71528960. Policy #0 lag: (min: 15.0, avg: 128.8, max: 271.0) [2024-06-15 13:19:10,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:19:11,114][1653645] Updated weights for policy 0, policy_version 139608 (0.0013) [2024-06-15 13:19:14,534][1653645] Updated weights for policy 0, policy_version 139680 (0.0084) [2024-06-15 13:19:15,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 286130176. Throughput: 0: 11400.6. Samples: 71600128. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:19:17,238][1653645] Updated weights for policy 0, policy_version 139728 (0.0016) [2024-06-15 13:19:19,721][1653645] Updated weights for policy 0, policy_version 139778 (0.0013) [2024-06-15 13:19:20,960][1648982] Fps is (10 sec: 45866.9, 60 sec: 45327.4, 300 sec: 44541.9). Total num frames: 286359552. Throughput: 0: 11468.3. Samples: 71636480. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:20,961][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:19:21,270][1653645] Updated weights for policy 0, policy_version 139837 (0.0030) [2024-06-15 13:19:23,318][1653645] Updated weights for policy 0, policy_version 139894 (0.0015) [2024-06-15 13:19:25,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 286556160. Throughput: 0: 11355.1. Samples: 71697920. 
Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:25,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:19:26,156][1653645] Updated weights for policy 0, policy_version 139937 (0.0012) [2024-06-15 13:19:28,340][1653645] Updated weights for policy 0, policy_version 139974 (0.0046) [2024-06-15 13:19:29,728][1653645] Updated weights for policy 0, policy_version 140032 (0.0011) [2024-06-15 13:19:30,967][1648982] Fps is (10 sec: 45840.6, 60 sec: 46414.0, 300 sec: 44763.0). Total num frames: 286818304. Throughput: 0: 11489.1. Samples: 71779328. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:30,968][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:19:31,790][1653645] Updated weights for policy 0, policy_version 140080 (0.0016) [2024-06-15 13:19:34,337][1653645] Updated weights for policy 0, policy_version 140144 (0.0102) [2024-06-15 13:19:35,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 287047680. Throughput: 0: 11662.2. Samples: 71811072. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:35,958][1648982] Avg episode reward: [(0, '37.060')] [2024-06-15 13:19:37,774][1653645] Updated weights for policy 0, policy_version 140215 (0.0013) [2024-06-15 13:19:40,046][1651596] Signal inference workers to stop experience collection... (7200 times) [2024-06-15 13:19:40,081][1653645] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-06-15 13:19:40,341][1651596] Signal inference workers to resume experience collection... (7200 times) [2024-06-15 13:19:40,344][1653645] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-06-15 13:19:40,347][1653645] Updated weights for policy 0, policy_version 140272 (0.0013) [2024-06-15 13:19:40,958][1648982] Fps is (10 sec: 49199.7, 60 sec: 46967.7, 300 sec: 44875.5). Total num frames: 287309824. Throughput: 0: 11480.2. Samples: 71880704. 
Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:40,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:19:42,183][1653645] Updated weights for policy 0, policy_version 140295 (0.0012) [2024-06-15 13:19:45,704][1653645] Updated weights for policy 0, policy_version 140389 (0.0115) [2024-06-15 13:19:45,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 46967.5, 300 sec: 44986.6). Total num frames: 287539200. Throughput: 0: 11616.7. Samples: 71944192. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:19:48,843][1653645] Updated weights for policy 0, policy_version 140434 (0.0013) [2024-06-15 13:19:49,738][1653645] Updated weights for policy 0, policy_version 140476 (0.0011) [2024-06-15 13:19:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 287703040. Throughput: 0: 11480.2. Samples: 71979008. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:50,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:19:52,161][1653645] Updated weights for policy 0, policy_version 140534 (0.0012) [2024-06-15 13:19:54,701][1653645] Updated weights for policy 0, policy_version 140578 (0.0012) [2024-06-15 13:19:55,959][1648982] Fps is (10 sec: 42595.1, 60 sec: 45874.7, 300 sec: 44875.4). Total num frames: 287965184. Throughput: 0: 11605.2. Samples: 72051200. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:19:55,959][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 13:19:56,962][1653645] Updated weights for policy 0, policy_version 140659 (0.0013) [2024-06-15 13:20:00,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45329.2, 300 sec: 45208.7). Total num frames: 288194560. Throughput: 0: 11446.1. Samples: 72115200. 
Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:20:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:20:01,011][1653645] Updated weights for policy 0, policy_version 140734 (0.0013) [2024-06-15 13:20:05,958][1648982] Fps is (10 sec: 39325.0, 60 sec: 45329.1, 300 sec: 44876.5). Total num frames: 288358400. Throughput: 0: 11423.8. Samples: 72150528. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:20:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:20:06,389][1653645] Updated weights for policy 0, policy_version 140816 (0.0103) [2024-06-15 13:20:08,139][1653645] Updated weights for policy 0, policy_version 140871 (0.0014) [2024-06-15 13:20:09,265][1653645] Updated weights for policy 0, policy_version 140922 (0.0110) [2024-06-15 13:20:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45329.3, 300 sec: 44875.5). Total num frames: 288620544. Throughput: 0: 11423.3. Samples: 72211968. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:20:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:20:12,830][1653645] Updated weights for policy 0, policy_version 140987 (0.0012) [2024-06-15 13:20:15,862][1653645] Updated weights for policy 0, policy_version 141040 (0.0030) [2024-06-15 13:20:15,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45329.2, 300 sec: 44764.5). Total num frames: 288849920. Throughput: 0: 11073.0. Samples: 72277504. Policy #0 lag: (min: 15.0, avg: 120.2, max: 271.0) [2024-06-15 13:20:15,962][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:20:18,886][1653645] Updated weights for policy 0, policy_version 141072 (0.0014) [2024-06-15 13:20:20,302][1653645] Updated weights for policy 0, policy_version 141125 (0.0014) [2024-06-15 13:20:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45330.6, 300 sec: 44875.5). Total num frames: 289079296. Throughput: 0: 11184.3. Samples: 72314368. 
Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:20:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:20:21,389][1653645] Updated weights for policy 0, policy_version 141184 (0.0096) [2024-06-15 13:20:24,590][1653645] Updated weights for policy 0, policy_version 141232 (0.0012) [2024-06-15 13:20:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 289275904. Throughput: 0: 11036.4. Samples: 72377344. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:20:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:20:27,034][1653645] Updated weights for policy 0, policy_version 141280 (0.0127) [2024-06-15 13:20:30,503][1651596] Signal inference workers to stop experience collection... (7250 times) [2024-06-15 13:20:30,548][1653645] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-06-15 13:20:30,715][1651596] Signal inference workers to resume experience collection... (7250 times) [2024-06-15 13:20:30,716][1653645] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-06-15 13:20:30,925][1653645] Updated weights for policy 0, policy_version 141332 (0.0012) [2024-06-15 13:20:30,974][1648982] Fps is (10 sec: 36020.2, 60 sec: 43692.7, 300 sec: 44541.2). Total num frames: 289439744. Throughput: 0: 11148.5. Samples: 72445952. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:20:30,975][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:20:32,120][1653645] Updated weights for policy 0, policy_version 141377 (0.0012) [2024-06-15 13:20:33,208][1653645] Updated weights for policy 0, policy_version 141437 (0.0013) [2024-06-15 13:20:35,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 44878.6). Total num frames: 289734656. Throughput: 0: 11036.4. Samples: 72475648. 
Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:20:35,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:20:36,181][1653645] Updated weights for policy 0, policy_version 141494 (0.0027) [2024-06-15 13:20:39,185][1653645] Updated weights for policy 0, policy_version 141537 (0.0012) [2024-06-15 13:20:40,960][1648982] Fps is (10 sec: 49185.4, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 289931264. Throughput: 0: 10979.7. Samples: 72545280. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:20:40,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:20:43,035][1653645] Updated weights for policy 0, policy_version 141617 (0.0015) [2024-06-15 13:20:45,346][1653645] Updated weights for policy 0, policy_version 141687 (0.0019) [2024-06-15 13:20:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 290193408. Throughput: 0: 10911.3. Samples: 72606208. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:20:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:20:47,481][1653645] Updated weights for policy 0, policy_version 141717 (0.0012) [2024-06-15 13:20:50,338][1653645] Updated weights for policy 0, policy_version 141764 (0.0012) [2024-06-15 13:20:50,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 290390016. Throughput: 0: 10877.1. Samples: 72640000. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:20:50,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:20:51,393][1653645] Updated weights for policy 0, policy_version 141820 (0.0011) [2024-06-15 13:20:54,645][1653645] Updated weights for policy 0, policy_version 141872 (0.0015) [2024-06-15 13:20:55,958][1648982] Fps is (10 sec: 42596.7, 60 sec: 44237.1, 300 sec: 44875.5). Total num frames: 290619392. Throughput: 0: 11172.9. Samples: 72714752. 
Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:20:55,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:20:56,222][1653645] Updated weights for policy 0, policy_version 141921 (0.0039) [2024-06-15 13:20:56,482][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000141936_290684928.pth... [2024-06-15 13:20:56,523][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000136640_279838720.pth [2024-06-15 13:20:57,925][1653645] Updated weights for policy 0, policy_version 141954 (0.0012) [2024-06-15 13:20:59,348][1653645] Updated weights for policy 0, policy_version 142016 (0.0013) [2024-06-15 13:21:00,960][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 290848768. Throughput: 0: 11173.0. Samples: 72780288. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:21:00,961][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:21:03,167][1653645] Updated weights for policy 0, policy_version 142077 (0.0073) [2024-06-15 13:21:05,928][1653645] Updated weights for policy 0, policy_version 142144 (0.0013) [2024-06-15 13:21:05,958][1648982] Fps is (10 sec: 49154.1, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 291110912. Throughput: 0: 11138.9. Samples: 72815616. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:21:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:21:08,778][1653645] Updated weights for policy 0, policy_version 142203 (0.0013) [2024-06-15 13:21:10,311][1653645] Updated weights for policy 0, policy_version 142243 (0.0013) [2024-06-15 13:21:10,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 291373056. Throughput: 0: 11184.3. Samples: 72880640. 
Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:21:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:21:14,098][1653645] Updated weights for policy 0, policy_version 142293 (0.0013) [2024-06-15 13:21:14,856][1653645] Updated weights for policy 0, policy_version 142336 (0.0012) [2024-06-15 13:21:15,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44236.6, 300 sec: 44875.5). Total num frames: 291504128. Throughput: 0: 11345.3. Samples: 72956416. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:21:15,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:21:17,012][1651596] Signal inference workers to stop experience collection... (7300 times) [2024-06-15 13:21:17,052][1653645] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-06-15 13:21:17,309][1651596] Signal inference workers to resume experience collection... (7300 times) [2024-06-15 13:21:17,310][1653645] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-06-15 13:21:17,521][1653645] Updated weights for policy 0, policy_version 142393 (0.0015) [2024-06-15 13:21:19,848][1653645] Updated weights for policy 0, policy_version 142457 (0.0017) [2024-06-15 13:21:20,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44875.7). Total num frames: 291766272. Throughput: 0: 11411.9. Samples: 72989184. Policy #0 lag: (min: 15.0, avg: 108.7, max: 271.0) [2024-06-15 13:21:20,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:21:22,196][1653645] Updated weights for policy 0, policy_version 142512 (0.0012) [2024-06-15 13:21:25,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 291897344. Throughput: 0: 11218.5. Samples: 73050112. 
Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:21:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:21:26,383][1653645] Updated weights for policy 0, policy_version 142547 (0.0012) [2024-06-15 13:21:27,189][1653645] Updated weights for policy 0, policy_version 142589 (0.0013) [2024-06-15 13:21:29,410][1653645] Updated weights for policy 0, policy_version 142656 (0.0014) [2024-06-15 13:21:30,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45334.3, 300 sec: 44653.4). Total num frames: 292159488. Throughput: 0: 11400.5. Samples: 73119232. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:21:30,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 13:21:32,371][1653645] Updated weights for policy 0, policy_version 142715 (0.0013) [2024-06-15 13:21:34,005][1653645] Updated weights for policy 0, policy_version 142768 (0.0015) [2024-06-15 13:21:35,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 292421632. Throughput: 0: 11252.6. Samples: 73146368. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:21:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:21:38,816][1653645] Updated weights for policy 0, policy_version 142804 (0.0013) [2024-06-15 13:21:40,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 292618240. Throughput: 0: 11093.4. Samples: 73213952. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:21:40,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:21:40,974][1653645] Updated weights for policy 0, policy_version 142896 (0.0121) [2024-06-15 13:21:44,627][1653645] Updated weights for policy 0, policy_version 142960 (0.0012) [2024-06-15 13:21:45,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 292880384. Throughput: 0: 11059.2. Samples: 73277952. 
Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:21:45,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:21:45,976][1653645] Updated weights for policy 0, policy_version 143009 (0.0014) [2024-06-15 13:21:50,789][1653645] Updated weights for policy 0, policy_version 143073 (0.0029) [2024-06-15 13:21:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44765.2). Total num frames: 293011456. Throughput: 0: 11059.2. Samples: 73313280. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:21:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:21:53,154][1653645] Updated weights for policy 0, policy_version 143160 (0.0013) [2024-06-15 13:21:55,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 43690.9, 300 sec: 44542.3). Total num frames: 293240832. Throughput: 0: 11036.4. Samples: 73377280. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:21:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:21:56,169][1653645] Updated weights for policy 0, policy_version 143201 (0.0013) [2024-06-15 13:21:57,839][1653645] Updated weights for policy 0, policy_version 143285 (0.0013) [2024-06-15 13:22:00,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 43690.5, 300 sec: 44875.4). Total num frames: 293470208. Throughput: 0: 10945.4. Samples: 73448960. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:22:00,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:22:03,007][1653645] Updated weights for policy 0, policy_version 143355 (0.0185) [2024-06-15 13:22:05,065][1651596] Signal inference workers to stop experience collection... (7350 times) [2024-06-15 13:22:05,114][1653645] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-06-15 13:22:05,459][1651596] Signal inference workers to resume experience collection... 
(7350 times) [2024-06-15 13:22:05,460][1653645] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-06-15 13:22:05,822][1653645] Updated weights for policy 0, policy_version 143422 (0.0132) [2024-06-15 13:22:05,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 293732352. Throughput: 0: 10843.0. Samples: 73477120. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:22:05,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:22:08,892][1653645] Updated weights for policy 0, policy_version 143490 (0.0012) [2024-06-15 13:22:10,960][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 293994496. Throughput: 0: 11036.4. Samples: 73546752. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:22:10,961][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:22:13,995][1653645] Updated weights for policy 0, policy_version 143554 (0.0012) [2024-06-15 13:22:15,524][1653645] Updated weights for policy 0, policy_version 143613 (0.0020) [2024-06-15 13:22:15,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 294125568. Throughput: 0: 10808.9. Samples: 73605632. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:22:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:22:17,762][1653645] Updated weights for policy 0, policy_version 143677 (0.0014) [2024-06-15 13:22:20,359][1653645] Updated weights for policy 0, policy_version 143715 (0.0012) [2024-06-15 13:22:20,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 294387712. Throughput: 0: 10877.1. Samples: 73635840. 
Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:22:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:22:22,526][1653645] Updated weights for policy 0, policy_version 143801 (0.0068) [2024-06-15 13:22:25,959][1648982] Fps is (10 sec: 39315.2, 60 sec: 43689.5, 300 sec: 44764.5). Total num frames: 294518784. Throughput: 0: 10876.7. Samples: 73703424. Policy #0 lag: (min: 47.0, avg: 185.2, max: 303.0) [2024-06-15 13:22:25,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:22:28,512][1653645] Updated weights for policy 0, policy_version 143874 (0.0013) [2024-06-15 13:22:29,630][1653645] Updated weights for policy 0, policy_version 143931 (0.0103) [2024-06-15 13:22:30,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 294780928. Throughput: 0: 10956.9. Samples: 73771008. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:22:30,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:22:32,688][1653645] Updated weights for policy 0, policy_version 143973 (0.0015) [2024-06-15 13:22:34,500][1653645] Updated weights for policy 0, policy_version 144058 (0.0108) [2024-06-15 13:22:35,958][1648982] Fps is (10 sec: 52437.6, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 295043072. Throughput: 0: 10945.4. Samples: 73805824. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:22:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:22:39,173][1653645] Updated weights for policy 0, policy_version 144112 (0.0014) [2024-06-15 13:22:40,924][1653645] Updated weights for policy 0, policy_version 144176 (0.0013) [2024-06-15 13:22:40,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44236.8, 300 sec: 44764.5). Total num frames: 295272448. Throughput: 0: 10968.2. Samples: 73870848. 
Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:22:40,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:22:44,751][1653645] Updated weights for policy 0, policy_version 144241 (0.0014) [2024-06-15 13:22:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 44653.3). Total num frames: 295501824. Throughput: 0: 10706.6. Samples: 73930752. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:22:45,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:22:46,083][1653645] Updated weights for policy 0, policy_version 144309 (0.0091) [2024-06-15 13:22:50,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 295632896. Throughput: 0: 10899.9. Samples: 73967616. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:22:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:22:50,969][1651596] Signal inference workers to stop experience collection... (7400 times) [2024-06-15 13:22:51,012][1653645] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-06-15 13:22:51,307][1651596] Signal inference workers to resume experience collection... (7400 times) [2024-06-15 13:22:51,309][1653645] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-06-15 13:22:51,313][1653645] Updated weights for policy 0, policy_version 144368 (0.0013) [2024-06-15 13:22:53,064][1653645] Updated weights for policy 0, policy_version 144432 (0.0012) [2024-06-15 13:22:55,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 295862272. Throughput: 0: 10843.0. Samples: 74034688. 
Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:22:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:22:56,190][1653645] Updated weights for policy 0, policy_version 144482 (0.0012) [2024-06-15 13:22:56,368][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000144496_295927808.pth... [2024-06-15 13:22:56,508][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000139264_285212672.pth [2024-06-15 13:22:57,456][1653645] Updated weights for policy 0, policy_version 144546 (0.0012) [2024-06-15 13:23:00,958][1648982] Fps is (10 sec: 45873.9, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 296091648. Throughput: 0: 11093.3. Samples: 74104832. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:23:00,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:23:02,137][1653645] Updated weights for policy 0, policy_version 144579 (0.0013) [2024-06-15 13:23:04,031][1653645] Updated weights for policy 0, policy_version 144672 (0.0011) [2024-06-15 13:23:05,958][1648982] Fps is (10 sec: 49153.1, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 296353792. Throughput: 0: 11138.9. Samples: 74137088. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:23:05,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:23:07,380][1653645] Updated weights for policy 0, policy_version 144722 (0.0012) [2024-06-15 13:23:09,398][1653645] Updated weights for policy 0, policy_version 144807 (0.0147) [2024-06-15 13:23:10,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 296615936. Throughput: 0: 10923.1. Samples: 74194944. 
Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:23:10,960][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 13:23:14,996][1653645] Updated weights for policy 0, policy_version 144853 (0.0013) [2024-06-15 13:23:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 296747008. Throughput: 0: 11036.4. Samples: 74267648. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:23:15,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:23:16,590][1653645] Updated weights for policy 0, policy_version 144916 (0.0012) [2024-06-15 13:23:20,425][1653645] Updated weights for policy 0, policy_version 144993 (0.0014) [2024-06-15 13:23:20,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 296976384. Throughput: 0: 10899.9. Samples: 74296320. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:23:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:23:21,952][1653645] Updated weights for policy 0, policy_version 145056 (0.0013) [2024-06-15 13:23:25,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43691.8, 300 sec: 44431.2). Total num frames: 297140224. Throughput: 0: 10820.2. Samples: 74357760. Policy #0 lag: (min: 4.0, avg: 99.0, max: 260.0) [2024-06-15 13:23:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:23:27,044][1653645] Updated weights for policy 0, policy_version 145089 (0.0015) [2024-06-15 13:23:28,303][1653645] Updated weights for policy 0, policy_version 145152 (0.0013) [2024-06-15 13:23:30,960][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 297402368. Throughput: 0: 10945.4. Samples: 74423296. 
Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:23:30,963][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:23:31,689][1653645] Updated weights for policy 0, policy_version 145218 (0.0011) [2024-06-15 13:23:33,453][1651596] Signal inference workers to stop experience collection... (7450 times) [2024-06-15 13:23:33,485][1653645] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-06-15 13:23:33,758][1651596] Signal inference workers to resume experience collection... (7450 times) [2024-06-15 13:23:33,759][1653645] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-06-15 13:23:33,760][1653645] Updated weights for policy 0, policy_version 145296 (0.0072) [2024-06-15 13:23:35,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44653.4). Total num frames: 297664512. Throughput: 0: 10763.3. Samples: 74451968. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:23:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:23:39,893][1653645] Updated weights for policy 0, policy_version 145345 (0.0015) [2024-06-15 13:23:40,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 41506.2, 300 sec: 44209.0). Total num frames: 297762816. Throughput: 0: 10752.0. Samples: 74518528. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:23:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:23:41,711][1653645] Updated weights for policy 0, policy_version 145424 (0.0012) [2024-06-15 13:23:42,905][1653645] Updated weights for policy 0, policy_version 145472 (0.0014) [2024-06-15 13:23:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 44431.2). Total num frames: 298057728. Throughput: 0: 10513.1. Samples: 74577920. 
Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:23:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:23:46,650][1653645] Updated weights for policy 0, policy_version 145568 (0.0099) [2024-06-15 13:23:50,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 43986.9). Total num frames: 298188800. Throughput: 0: 10444.8. Samples: 74607104. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:23:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:23:52,095][1653645] Updated weights for policy 0, policy_version 145603 (0.0014) [2024-06-15 13:23:53,921][1653645] Updated weights for policy 0, policy_version 145681 (0.0012) [2024-06-15 13:23:55,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 298450944. Throughput: 0: 10535.8. Samples: 74669056. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:23:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:23:57,058][1653645] Updated weights for policy 0, policy_version 145731 (0.0013) [2024-06-15 13:23:58,658][1653645] Updated weights for policy 0, policy_version 145795 (0.0012) [2024-06-15 13:23:59,758][1653645] Updated weights for policy 0, policy_version 145849 (0.0013) [2024-06-15 13:24:00,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 298713088. Throughput: 0: 10433.4. Samples: 74737152. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:24:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:24:05,039][1653645] Updated weights for policy 0, policy_version 145904 (0.0012) [2024-06-15 13:24:05,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 42052.0, 300 sec: 43986.9). Total num frames: 298876928. Throughput: 0: 10592.6. Samples: 74772992. 
Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:24:05,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:24:06,590][1653645] Updated weights for policy 0, policy_version 145954 (0.0011) [2024-06-15 13:24:09,021][1653645] Updated weights for policy 0, policy_version 146000 (0.0011) [2024-06-15 13:24:10,847][1653645] Updated weights for policy 0, policy_version 146070 (0.0012) [2024-06-15 13:24:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 44098.0). Total num frames: 299139072. Throughput: 0: 10615.5. Samples: 74835456. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:24:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:24:11,546][1653645] Updated weights for policy 0, policy_version 146112 (0.0014) [2024-06-15 13:24:15,958][1648982] Fps is (10 sec: 36045.7, 60 sec: 41506.1, 300 sec: 43653.9). Total num frames: 299237376. Throughput: 0: 10706.5. Samples: 74905088. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:24:15,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:24:17,481][1653645] Updated weights for policy 0, policy_version 146162 (0.0016) [2024-06-15 13:24:18,632][1651596] Signal inference workers to stop experience collection... (7500 times) [2024-06-15 13:24:18,789][1653645] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-06-15 13:24:18,804][1653645] Updated weights for policy 0, policy_version 146218 (0.0012) [2024-06-15 13:24:18,965][1651596] Signal inference workers to resume experience collection... (7500 times) [2024-06-15 13:24:18,966][1653645] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-06-15 13:24:20,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 42052.3, 300 sec: 43875.8). Total num frames: 299499520. Throughput: 0: 10683.8. Samples: 74932736. 
Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:24:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:24:21,415][1653645] Updated weights for policy 0, policy_version 146272 (0.0013) [2024-06-15 13:24:23,148][1653645] Updated weights for policy 0, policy_version 146352 (0.0013) [2024-06-15 13:24:25,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.8, 300 sec: 43877.2). Total num frames: 299761664. Throughput: 0: 10683.7. Samples: 74999296. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:24:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:24:29,274][1653645] Updated weights for policy 0, policy_version 146429 (0.0079) [2024-06-15 13:24:30,179][1653645] Updated weights for policy 0, policy_version 146467 (0.0013) [2024-06-15 13:24:30,958][1648982] Fps is (10 sec: 52426.2, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 300023808. Throughput: 0: 10831.6. Samples: 75065344. Policy #0 lag: (min: 91.0, avg: 183.3, max: 366.0) [2024-06-15 13:24:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:24:33,630][1653645] Updated weights for policy 0, policy_version 146546 (0.0092) [2024-06-15 13:24:34,873][1653645] Updated weights for policy 0, policy_version 146609 (0.0012) [2024-06-15 13:24:35,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 300285952. Throughput: 0: 11036.5. Samples: 75103744. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:24:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:24:39,939][1653645] Updated weights for policy 0, policy_version 146656 (0.0122) [2024-06-15 13:24:40,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 300417024. Throughput: 0: 11229.9. Samples: 75174400. 
Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:24:40,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:24:41,491][1653645] Updated weights for policy 0, policy_version 146707 (0.0081) [2024-06-15 13:24:44,613][1653645] Updated weights for policy 0, policy_version 146756 (0.0029) [2024-06-15 13:24:45,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 300679168. Throughput: 0: 11002.3. Samples: 75232256. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:24:45,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:24:46,109][1653645] Updated weights for policy 0, policy_version 146818 (0.0012) [2024-06-15 13:24:47,333][1653645] Updated weights for policy 0, policy_version 146875 (0.0013) [2024-06-15 13:24:50,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43542.7). Total num frames: 300810240. Throughput: 0: 10922.8. Samples: 75264512. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:24:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:24:51,904][1653645] Updated weights for policy 0, policy_version 146915 (0.0013) [2024-06-15 13:24:53,364][1653645] Updated weights for policy 0, policy_version 146976 (0.0012) [2024-06-15 13:24:55,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 43690.5, 300 sec: 43653.6). Total num frames: 301072384. Throughput: 0: 10968.1. Samples: 75329024. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:24:55,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:24:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000147008_301072384.pth... 
[2024-06-15 13:24:56,021][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000141936_290684928.pth [2024-06-15 13:24:56,952][1653645] Updated weights for policy 0, policy_version 147040 (0.0014) [2024-06-15 13:24:58,964][1653645] Updated weights for policy 0, policy_version 147131 (0.0015) [2024-06-15 13:25:00,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 301334528. Throughput: 0: 10911.3. Samples: 75396096. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:25:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:25:03,566][1651596] Signal inference workers to stop experience collection... (7550 times) [2024-06-15 13:25:03,604][1653645] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-06-15 13:25:03,785][1651596] Signal inference workers to resume experience collection... (7550 times) [2024-06-15 13:25:03,787][1653645] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-06-15 13:25:04,312][1653645] Updated weights for policy 0, policy_version 147189 (0.0120) [2024-06-15 13:25:05,276][1653645] Updated weights for policy 0, policy_version 147217 (0.0027) [2024-06-15 13:25:05,958][1648982] Fps is (10 sec: 49153.8, 60 sec: 44783.2, 300 sec: 43875.8). Total num frames: 301563904. Throughput: 0: 11195.7. Samples: 75436544. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:25:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:25:08,261][1653645] Updated weights for policy 0, policy_version 147265 (0.0013) [2024-06-15 13:25:09,865][1653645] Updated weights for policy 0, policy_version 147344 (0.0013) [2024-06-15 13:25:10,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 301826048. Throughput: 0: 11195.7. Samples: 75503104. 
Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:25:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:25:11,001][1653645] Updated weights for policy 0, policy_version 147390 (0.0015) [2024-06-15 13:25:15,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44783.0, 300 sec: 43542.6). Total num frames: 301924352. Throughput: 0: 11184.5. Samples: 75568640. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:25:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:25:17,204][1653645] Updated weights for policy 0, policy_version 147488 (0.0117) [2024-06-15 13:25:20,666][1653645] Updated weights for policy 0, policy_version 147552 (0.0015) [2024-06-15 13:25:20,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44782.8, 300 sec: 43764.7). Total num frames: 302186496. Throughput: 0: 10945.4. Samples: 75596288. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:25:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:25:22,857][1653645] Updated weights for policy 0, policy_version 147632 (0.0012) [2024-06-15 13:25:25,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 43876.8). Total num frames: 302383104. Throughput: 0: 10763.4. Samples: 75658752. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:25:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:25:28,089][1653645] Updated weights for policy 0, policy_version 147664 (0.0013) [2024-06-15 13:25:29,849][1653645] Updated weights for policy 0, policy_version 147744 (0.0013) [2024-06-15 13:25:30,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43691.0, 300 sec: 43764.7). Total num frames: 302645248. Throughput: 0: 10899.9. Samples: 75722752. 
Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:25:30,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:25:32,794][1653645] Updated weights for policy 0, policy_version 147808 (0.0011) [2024-06-15 13:25:34,945][1653645] Updated weights for policy 0, policy_version 147897 (0.0011) [2024-06-15 13:25:35,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 302907392. Throughput: 0: 11025.1. Samples: 75760640. Policy #0 lag: (min: 15.0, avg: 123.4, max: 271.0) [2024-06-15 13:25:35,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:25:40,722][1653645] Updated weights for policy 0, policy_version 147957 (0.0013) [2024-06-15 13:25:40,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 303038464. Throughput: 0: 11047.9. Samples: 75826176. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:25:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:25:42,255][1653645] Updated weights for policy 0, policy_version 148016 (0.0012) [2024-06-15 13:25:44,876][1653645] Updated weights for policy 0, policy_version 148065 (0.0060) [2024-06-15 13:25:45,295][1651596] Signal inference workers to stop experience collection... (7600 times) [2024-06-15 13:25:45,327][1653645] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-06-15 13:25:45,602][1651596] Signal inference workers to resume experience collection... (7600 times) [2024-06-15 13:25:45,604][1653645] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-06-15 13:25:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 303333376. Throughput: 0: 11002.3. Samples: 75891200. 
Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:25:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:25:46,686][1653645] Updated weights for policy 0, policy_version 148144 (0.0013) [2024-06-15 13:25:50,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 303431680. Throughput: 0: 10854.4. Samples: 75924992. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:25:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:25:51,687][1653645] Updated weights for policy 0, policy_version 148197 (0.0126) [2024-06-15 13:25:53,905][1653645] Updated weights for policy 0, policy_version 148272 (0.0014) [2024-06-15 13:25:55,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.8, 300 sec: 43542.5). Total num frames: 303693824. Throughput: 0: 10774.7. Samples: 75987968. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:25:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:25:57,014][1653645] Updated weights for policy 0, policy_version 148324 (0.0013) [2024-06-15 13:25:58,494][1653645] Updated weights for policy 0, policy_version 148388 (0.0013) [2024-06-15 13:26:00,959][1648982] Fps is (10 sec: 52423.4, 60 sec: 43689.9, 300 sec: 43542.4). Total num frames: 303955968. Throughput: 0: 10888.3. Samples: 76058624. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:26:02,845][1653645] Updated weights for policy 0, policy_version 148436 (0.0014) [2024-06-15 13:26:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 43209.3). Total num frames: 304119808. Throughput: 0: 10945.4. Samples: 76088832. 
Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:05,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:26:06,659][1653645] Updated weights for policy 0, policy_version 148539 (0.0013) [2024-06-15 13:26:09,195][1653645] Updated weights for policy 0, policy_version 148592 (0.0177) [2024-06-15 13:26:10,958][1648982] Fps is (10 sec: 49156.8, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 304447488. Throughput: 0: 10922.7. Samples: 76150272. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:26:11,100][1653645] Updated weights for policy 0, policy_version 148666 (0.0012) [2024-06-15 13:26:15,805][1653645] Updated weights for policy 0, policy_version 148732 (0.0042) [2024-06-15 13:26:15,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 304611328. Throughput: 0: 11070.6. Samples: 76220928. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:26:19,698][1653645] Updated weights for policy 0, policy_version 148803 (0.0012) [2024-06-15 13:26:20,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 304840704. Throughput: 0: 10979.6. Samples: 76254720. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:20,960][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:26:21,139][1653645] Updated weights for policy 0, policy_version 148860 (0.0013) [2024-06-15 13:26:25,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 305004544. Throughput: 0: 10877.2. Samples: 76315648. 
Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:26:26,299][1653645] Updated weights for policy 0, policy_version 148932 (0.0013) [2024-06-15 13:26:27,393][1653645] Updated weights for policy 0, policy_version 148992 (0.0016) [2024-06-15 13:26:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 305233920. Throughput: 0: 11047.8. Samples: 76388352. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:30,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:26:31,542][1653645] Updated weights for policy 0, policy_version 149059 (0.0013) [2024-06-15 13:26:32,889][1651596] Signal inference workers to stop experience collection... (7650 times) [2024-06-15 13:26:32,931][1651596] Signal inference workers to resume experience collection... (7650 times) [2024-06-15 13:26:32,945][1653645] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-06-15 13:26:32,958][1653645] Updated weights for policy 0, policy_version 149120 (0.0022) [2024-06-15 13:26:32,977][1653645] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-06-15 13:26:35,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 43764.7). Total num frames: 305528832. Throughput: 0: 10990.9. Samples: 76419584. Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:35,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:26:37,081][1653645] Updated weights for policy 0, policy_version 149186 (0.0041) [2024-06-15 13:26:37,970][1653645] Updated weights for policy 0, policy_version 149238 (0.0095) [2024-06-15 13:26:40,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 305659904. Throughput: 0: 11286.8. Samples: 76495872. 
Policy #0 lag: (min: 15.0, avg: 98.0, max: 271.0) [2024-06-15 13:26:40,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:26:42,012][1653645] Updated weights for policy 0, policy_version 149296 (0.0015) [2024-06-15 13:26:43,672][1653645] Updated weights for policy 0, policy_version 149344 (0.0013) [2024-06-15 13:26:45,341][1653645] Updated weights for policy 0, policy_version 149408 (0.0096) [2024-06-15 13:26:45,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 306020352. Throughput: 0: 11013.9. Samples: 76554240. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:26:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:26:48,799][1653645] Updated weights for policy 0, policy_version 149444 (0.0012) [2024-06-15 13:26:49,902][1653645] Updated weights for policy 0, policy_version 149496 (0.0012) [2024-06-15 13:26:50,960][1648982] Fps is (10 sec: 52423.4, 60 sec: 45874.4, 300 sec: 43875.6). Total num frames: 306184192. Throughput: 0: 11115.9. Samples: 76589056. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:26:50,962][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:26:54,097][1653645] Updated weights for policy 0, policy_version 149553 (0.0016) [2024-06-15 13:26:55,816][1653645] Updated weights for policy 0, policy_version 149616 (0.0013) [2024-06-15 13:26:55,993][1648982] Fps is (10 sec: 39182.2, 60 sec: 45302.3, 300 sec: 43870.6). Total num frames: 306413568. Throughput: 0: 11312.0. Samples: 76659712. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:26:55,994][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:26:56,131][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000149632_306446336.pth... 
[2024-06-15 13:26:56,164][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000144496_295927808.pth [2024-06-15 13:26:57,563][1653645] Updated weights for policy 0, policy_version 149651 (0.0014) [2024-06-15 13:27:00,097][1653645] Updated weights for policy 0, policy_version 149716 (0.0014) [2024-06-15 13:27:00,958][1648982] Fps is (10 sec: 49156.8, 60 sec: 45329.8, 300 sec: 43875.8). Total num frames: 306675712. Throughput: 0: 11252.6. Samples: 76727296. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:00,959][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 13:27:05,352][1653645] Updated weights for policy 0, policy_version 149762 (0.0047) [2024-06-15 13:27:05,958][1648982] Fps is (10 sec: 32885.3, 60 sec: 43690.9, 300 sec: 43209.3). Total num frames: 306741248. Throughput: 0: 11309.5. Samples: 76763648. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:27:07,950][1653645] Updated weights for policy 0, policy_version 149856 (0.0296) [2024-06-15 13:27:10,032][1653645] Updated weights for policy 0, policy_version 149904 (0.0012) [2024-06-15 13:27:10,962][1648982] Fps is (10 sec: 39307.8, 60 sec: 43688.1, 300 sec: 43875.3). Total num frames: 307068928. Throughput: 0: 11126.6. Samples: 76816384. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:10,964][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:27:12,403][1653645] Updated weights for policy 0, policy_version 149953 (0.0015) [2024-06-15 13:27:13,724][1653645] Updated weights for policy 0, policy_version 150007 (0.0014) [2024-06-15 13:27:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 307232768. Throughput: 0: 10956.8. Samples: 76881408. 
Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:27:19,517][1653645] Updated weights for policy 0, policy_version 150064 (0.0013) [2024-06-15 13:27:20,058][1651596] Signal inference workers to stop experience collection... (7700 times) [2024-06-15 13:27:20,109][1653645] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-06-15 13:27:20,361][1651596] Signal inference workers to resume experience collection... (7700 times) [2024-06-15 13:27:20,362][1653645] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-06-15 13:27:20,958][1648982] Fps is (10 sec: 36057.7, 60 sec: 43144.4, 300 sec: 43765.0). Total num frames: 307429376. Throughput: 0: 11138.9. Samples: 76920832. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:20,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 13:27:21,218][1653645] Updated weights for policy 0, policy_version 150135 (0.0097) [2024-06-15 13:27:22,407][1653645] Updated weights for policy 0, policy_version 150176 (0.0027) [2024-06-15 13:27:25,320][1653645] Updated weights for policy 0, policy_version 150256 (0.0012) [2024-06-15 13:27:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 307757056. Throughput: 0: 10695.1. Samples: 76977152. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:25,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:27:30,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.3, 300 sec: 43209.3). Total num frames: 307789824. Throughput: 0: 11025.1. Samples: 77050368. 
Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:27:31,361][1653645] Updated weights for policy 0, policy_version 150305 (0.0109) [2024-06-15 13:27:33,300][1653645] Updated weights for policy 0, policy_version 150400 (0.0013) [2024-06-15 13:27:35,066][1653645] Updated weights for policy 0, policy_version 150458 (0.0015) [2024-06-15 13:27:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.8, 300 sec: 43653.6). Total num frames: 308150272. Throughput: 0: 10809.1. Samples: 77075456. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:35,959][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 13:27:36,893][1653645] Updated weights for policy 0, policy_version 150503 (0.0026) [2024-06-15 13:27:40,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 308281344. Throughput: 0: 10828.8. Samples: 77146624. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:40,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:27:42,774][1653645] Updated weights for policy 0, policy_version 150560 (0.0015) [2024-06-15 13:27:43,988][1653645] Updated weights for policy 0, policy_version 150612 (0.0060) [2024-06-15 13:27:45,709][1653645] Updated weights for policy 0, policy_version 150676 (0.0013) [2024-06-15 13:27:45,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 43875.8). Total num frames: 308576256. Throughput: 0: 10717.9. Samples: 77209600. Policy #0 lag: (min: 31.0, avg: 147.6, max: 287.0) [2024-06-15 13:27:45,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:27:48,156][1653645] Updated weights for policy 0, policy_version 150737 (0.0014) [2024-06-15 13:27:50,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43691.5, 300 sec: 43875.8). Total num frames: 308805632. Throughput: 0: 10547.2. Samples: 77238272. 
Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:27:50,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 13:27:53,801][1653645] Updated weights for policy 0, policy_version 150787 (0.0020) [2024-06-15 13:27:55,713][1653645] Updated weights for policy 0, policy_version 150864 (0.0013) [2024-06-15 13:27:55,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 42623.5, 300 sec: 43653.6). Total num frames: 308969472. Throughput: 0: 11185.2. Samples: 77319680. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:27:55,959][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 13:27:58,193][1653645] Updated weights for policy 0, policy_version 150967 (0.0192) [2024-06-15 13:28:00,935][1653645] Updated weights for policy 0, policy_version 151027 (0.0013) [2024-06-15 13:28:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 309297152. Throughput: 0: 10877.2. Samples: 77370880. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:00,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:28:05,734][1651596] Signal inference workers to stop experience collection... (7750 times) [2024-06-15 13:28:05,784][1653645] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-06-15 13:28:05,964][1648982] Fps is (10 sec: 36045.6, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 309329920. Throughput: 0: 11047.8. Samples: 77417984. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:05,965][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:28:06,029][1651596] Signal inference workers to resume experience collection... 
(7750 times) [2024-06-15 13:28:06,030][1653645] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-06-15 13:28:06,609][1653645] Updated weights for policy 0, policy_version 151077 (0.0098) [2024-06-15 13:28:08,726][1653645] Updated weights for policy 0, policy_version 151158 (0.0013) [2024-06-15 13:28:10,395][1653645] Updated weights for policy 0, policy_version 151227 (0.0011) [2024-06-15 13:28:10,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44239.4, 300 sec: 43986.9). Total num frames: 309723136. Throughput: 0: 11116.0. Samples: 77477376. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:10,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:28:11,530][1653645] Updated weights for policy 0, policy_version 151264 (0.0021) [2024-06-15 13:28:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 309854208. Throughput: 0: 11161.6. Samples: 77552640. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:15,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:28:17,296][1653645] Updated weights for policy 0, policy_version 151328 (0.0014) [2024-06-15 13:28:19,104][1653645] Updated weights for policy 0, policy_version 151397 (0.0016) [2024-06-15 13:28:20,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 45875.3, 300 sec: 44209.1). Total num frames: 310181888. Throughput: 0: 11468.8. Samples: 77591552. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:28:21,318][1653645] Updated weights for policy 0, policy_version 151474 (0.0014) [2024-06-15 13:28:22,579][1653645] Updated weights for policy 0, policy_version 151525 (0.0025) [2024-06-15 13:28:25,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 310378496. Throughput: 0: 11229.9. Samples: 77651968. 
Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:25,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 13:28:28,964][1653645] Updated weights for policy 0, policy_version 151584 (0.0015) [2024-06-15 13:28:30,850][1653645] Updated weights for policy 0, policy_version 151664 (0.0014) [2024-06-15 13:28:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 46967.5, 300 sec: 43875.8). Total num frames: 310607872. Throughput: 0: 11434.7. Samples: 77724160. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:30,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 13:28:32,740][1653645] Updated weights for policy 0, policy_version 151712 (0.0013) [2024-06-15 13:28:34,972][1653645] Updated weights for policy 0, policy_version 151797 (0.0259) [2024-06-15 13:28:35,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45875.4, 300 sec: 44542.3). Total num frames: 310902784. Throughput: 0: 11457.4. Samples: 77753856. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:35,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 13:28:40,745][1653645] Updated weights for policy 0, policy_version 151827 (0.0040) [2024-06-15 13:28:40,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44236.9, 300 sec: 43653.7). Total num frames: 310935552. Throughput: 0: 11218.6. Samples: 77824512. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:28:42,581][1653645] Updated weights for policy 0, policy_version 151910 (0.0013) [2024-06-15 13:28:44,034][1653645] Updated weights for policy 0, policy_version 151957 (0.0014) [2024-06-15 13:28:44,710][1651596] Signal inference workers to stop experience collection... (7800 times) [2024-06-15 13:28:44,759][1653645] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-06-15 13:28:45,011][1651596] Signal inference workers to resume experience collection... 
(7800 times) [2024-06-15 13:28:45,012][1653645] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-06-15 13:28:45,959][1648982] Fps is (10 sec: 45870.7, 60 sec: 46420.7, 300 sec: 44653.2). Total num frames: 311361536. Throughput: 0: 11377.6. Samples: 77882880. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:45,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:28:46,553][1653645] Updated weights for policy 0, policy_version 152055 (0.0099) [2024-06-15 13:28:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 311427072. Throughput: 0: 11127.5. Samples: 77918720. Policy #0 lag: (min: 3.0, avg: 157.6, max: 259.0) [2024-06-15 13:28:50,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:28:52,593][1653645] Updated weights for policy 0, policy_version 152100 (0.0012) [2024-06-15 13:28:54,124][1653645] Updated weights for policy 0, policy_version 152160 (0.0042) [2024-06-15 13:28:55,398][1653645] Updated weights for policy 0, policy_version 152208 (0.0013) [2024-06-15 13:28:55,962][1648982] Fps is (10 sec: 39324.5, 60 sec: 46421.5, 300 sec: 44209.0). Total num frames: 311754752. Throughput: 0: 11446.1. Samples: 77992448. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:28:55,963][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:28:56,568][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000152256_311820288.pth... [2024-06-15 13:28:56,673][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000147008_301072384.pth [2024-06-15 13:28:57,168][1653645] Updated weights for policy 0, policy_version 152273 (0.0027) [2024-06-15 13:29:00,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44236.8, 300 sec: 44320.2). Total num frames: 311951360. Throughput: 0: 11161.6. Samples: 78054912. 
Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:29:02,994][1653645] Updated weights for policy 0, policy_version 152321 (0.0012) [2024-06-15 13:29:04,094][1653645] Updated weights for policy 0, policy_version 152382 (0.0023) [2024-06-15 13:29:05,958][1648982] Fps is (10 sec: 42596.7, 60 sec: 47513.2, 300 sec: 44209.0). Total num frames: 312180736. Throughput: 0: 11229.7. Samples: 78096896. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:05,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:29:06,162][1653645] Updated weights for policy 0, policy_version 152445 (0.0013) [2024-06-15 13:29:08,764][1653645] Updated weights for policy 0, policy_version 152514 (0.0099) [2024-06-15 13:29:10,424][1653645] Updated weights for policy 0, policy_version 152576 (0.0013) [2024-06-15 13:29:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 312475648. Throughput: 0: 11093.3. Samples: 78151168. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:10,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:29:15,958][1648982] Fps is (10 sec: 36046.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 312541184. Throughput: 0: 11150.2. Samples: 78225920. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:15,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:29:17,970][1653645] Updated weights for policy 0, policy_version 152644 (0.0081) [2024-06-15 13:29:19,439][1653645] Updated weights for policy 0, policy_version 152704 (0.0013) [2024-06-15 13:29:20,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 312836096. Throughput: 0: 11184.3. Samples: 78257152. 
Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:20,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:29:21,453][1653645] Updated weights for policy 0, policy_version 152768 (0.0014) [2024-06-15 13:29:22,791][1653645] Updated weights for policy 0, policy_version 152828 (0.0012) [2024-06-15 13:29:25,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 312999936. Throughput: 0: 11002.3. Samples: 78319616. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:25,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:29:28,035][1653645] Updated weights for policy 0, policy_version 152888 (0.0013) [2024-06-15 13:29:30,769][1651596] Signal inference workers to stop experience collection... (7850 times) [2024-06-15 13:29:30,788][1653645] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-06-15 13:29:30,811][1653645] Updated weights for policy 0, policy_version 152946 (0.0013) [2024-06-15 13:29:30,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43690.4, 300 sec: 43875.8). Total num frames: 313229312. Throughput: 0: 11332.4. Samples: 78392832. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:30,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:29:31,142][1651596] Signal inference workers to resume experience collection... (7850 times) [2024-06-15 13:29:31,158][1653645] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-06-15 13:29:32,950][1653645] Updated weights for policy 0, policy_version 153024 (0.0012) [2024-06-15 13:29:35,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 313524224. Throughput: 0: 10899.9. Samples: 78409216. 
Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:35,978][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:29:39,754][1653645] Updated weights for policy 0, policy_version 153104 (0.0033) [2024-06-15 13:29:40,971][1648982] Fps is (10 sec: 39271.8, 60 sec: 44773.2, 300 sec: 43873.9). Total num frames: 313622528. Throughput: 0: 10965.0. Samples: 78486016. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:40,972][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:29:42,901][1653645] Updated weights for policy 0, policy_version 153168 (0.0041) [2024-06-15 13:29:44,673][1653645] Updated weights for policy 0, policy_version 153239 (0.0014) [2024-06-15 13:29:45,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43145.1, 300 sec: 44542.3). Total num frames: 313950208. Throughput: 0: 10786.1. Samples: 78540288. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:29:46,320][1653645] Updated weights for policy 0, policy_version 153312 (0.0013) [2024-06-15 13:29:50,891][1653645] Updated weights for policy 0, policy_version 153361 (0.0024) [2024-06-15 13:29:50,958][1648982] Fps is (10 sec: 45934.0, 60 sec: 44236.6, 300 sec: 44098.0). Total num frames: 314081280. Throughput: 0: 10683.8. Samples: 78577664. Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:50,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:29:51,772][1653645] Updated weights for policy 0, policy_version 153401 (0.0015) [2024-06-15 13:29:55,397][1653645] Updated weights for policy 0, policy_version 153456 (0.0012) [2024-06-15 13:29:55,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 314310656. Throughput: 0: 11116.1. Samples: 78651392. 
Policy #0 lag: (min: 15.0, avg: 75.4, max: 271.0) [2024-06-15 13:29:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:29:56,883][1653645] Updated weights for policy 0, policy_version 153520 (0.0012) [2024-06-15 13:29:58,317][1653645] Updated weights for policy 0, policy_version 153584 (0.0012) [2024-06-15 13:30:00,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 314572800. Throughput: 0: 10808.9. Samples: 78712320. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:00,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:30:02,999][1653645] Updated weights for policy 0, policy_version 153632 (0.0032) [2024-06-15 13:30:05,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 42052.5, 300 sec: 43653.6). Total num frames: 314703872. Throughput: 0: 10934.0. Samples: 78749184. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:05,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:30:06,749][1653645] Updated weights for policy 0, policy_version 153696 (0.0012) [2024-06-15 13:30:08,126][1653645] Updated weights for policy 0, policy_version 153765 (0.0014) [2024-06-15 13:30:09,606][1653645] Updated weights for policy 0, policy_version 153824 (0.0012) [2024-06-15 13:30:10,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 315097088. Throughput: 0: 10820.3. Samples: 78806528. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:10,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:30:14,950][1651596] Signal inference workers to stop experience collection... (7900 times) [2024-06-15 13:30:14,975][1653645] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-06-15 13:30:15,277][1651596] Signal inference workers to resume experience collection... 
(7900 times) [2024-06-15 13:30:15,278][1653645] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-06-15 13:30:15,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 315162624. Throughput: 0: 10740.7. Samples: 78876160. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:15,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:30:16,285][1653645] Updated weights for policy 0, policy_version 153911 (0.0015) [2024-06-15 13:30:19,510][1653645] Updated weights for policy 0, policy_version 153958 (0.0014) [2024-06-15 13:30:20,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 42598.4, 300 sec: 44098.0). Total num frames: 315392000. Throughput: 0: 11116.1. Samples: 78909440. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:30:20,966][1653645] Updated weights for policy 0, policy_version 154016 (0.0011) [2024-06-15 13:30:22,445][1653645] Updated weights for policy 0, policy_version 154068 (0.0012) [2024-06-15 13:30:25,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 315621376. Throughput: 0: 10709.5. Samples: 78967808. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:25,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:30:27,079][1653645] Updated weights for policy 0, policy_version 154118 (0.0014) [2024-06-15 13:30:28,205][1653645] Updated weights for policy 0, policy_version 154167 (0.0013) [2024-06-15 13:30:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42598.6, 300 sec: 43653.6). Total num frames: 315785216. Throughput: 0: 11093.3. Samples: 79039488. 
Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:30,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:30:31,072][1653645] Updated weights for policy 0, policy_version 154208 (0.0013) [2024-06-15 13:30:33,232][1653645] Updated weights for policy 0, policy_version 154288 (0.0012) [2024-06-15 13:30:34,232][1653645] Updated weights for policy 0, policy_version 154320 (0.0011) [2024-06-15 13:30:35,337][1653645] Updated weights for policy 0, policy_version 154368 (0.0014) [2024-06-15 13:30:35,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 316145664. Throughput: 0: 10752.0. Samples: 79061504. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:35,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:30:40,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44246.4, 300 sec: 43875.8). Total num frames: 316276736. Throughput: 0: 10672.4. Samples: 79131648. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:40,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 13:30:43,080][1653645] Updated weights for policy 0, policy_version 154448 (0.0015) [2024-06-15 13:30:45,071][1653645] Updated weights for policy 0, policy_version 154516 (0.0033) [2024-06-15 13:30:45,958][1648982] Fps is (10 sec: 36045.9, 60 sec: 42598.5, 300 sec: 44320.1). Total num frames: 316506112. Throughput: 0: 10672.4. Samples: 79192576. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:45,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 13:30:46,077][1653645] Updated weights for policy 0, policy_version 154560 (0.0010) [2024-06-15 13:30:47,935][1653645] Updated weights for policy 0, policy_version 154617 (0.0013) [2024-06-15 13:30:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 316702720. Throughput: 0: 10581.4. Samples: 79225344. 
Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:50,960][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:30:51,808][1653645] Updated weights for policy 0, policy_version 154684 (0.0012) [2024-06-15 13:30:55,958][1648982] Fps is (10 sec: 32766.4, 60 sec: 42052.0, 300 sec: 43653.7). Total num frames: 316833792. Throughput: 0: 10842.9. Samples: 79294464. Policy #0 lag: (min: 79.0, avg: 156.4, max: 335.0) [2024-06-15 13:30:55,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 13:30:56,517][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000154736_316899328.pth... [2024-06-15 13:30:56,714][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000149632_306446336.pth [2024-06-15 13:30:56,717][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000154736_316899328.pth [2024-06-15 13:30:57,139][1653645] Updated weights for policy 0, policy_version 154753 (0.0029) [2024-06-15 13:30:58,617][1653645] Updated weights for policy 0, policy_version 154816 (0.0014) [2024-06-15 13:30:59,030][1651596] Signal inference workers to stop experience collection... (7950 times) [2024-06-15 13:30:59,072][1653645] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-06-15 13:30:59,213][1651596] Signal inference workers to resume experience collection... (7950 times) [2024-06-15 13:30:59,213][1653645] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-06-15 13:31:00,369][1653645] Updated weights for policy 0, policy_version 154878 (0.0013) [2024-06-15 13:31:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 317194240. Throughput: 0: 10444.8. Samples: 79346176. 
Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:00,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:31:04,216][1653645] Updated weights for policy 0, policy_version 154935 (0.0013) [2024-06-15 13:31:05,958][1648982] Fps is (10 sec: 49154.6, 60 sec: 43690.8, 300 sec: 43653.7). Total num frames: 317325312. Throughput: 0: 10558.6. Samples: 79384576. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:31:08,711][1653645] Updated weights for policy 0, policy_version 154980 (0.0021) [2024-06-15 13:31:10,531][1653645] Updated weights for policy 0, policy_version 155056 (0.0013) [2024-06-15 13:31:10,959][1648982] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 317587456. Throughput: 0: 10786.2. Samples: 79453184. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:10,961][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:31:12,343][1653645] Updated weights for policy 0, policy_version 155128 (0.0091) [2024-06-15 13:31:15,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 317784064. Throughput: 0: 10513.1. Samples: 79512576. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:15,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:31:16,202][1653645] Updated weights for policy 0, policy_version 155184 (0.0014) [2024-06-15 13:31:20,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 42052.3, 300 sec: 43764.7). Total num frames: 317915136. Throughput: 0: 10797.6. Samples: 79547392. 
Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:20,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:31:20,972][1653645] Updated weights for policy 0, policy_version 155244 (0.0020) [2024-06-15 13:31:22,547][1653645] Updated weights for policy 0, policy_version 155312 (0.0014) [2024-06-15 13:31:23,835][1653645] Updated weights for policy 0, policy_version 155360 (0.0032) [2024-06-15 13:31:25,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 318242816. Throughput: 0: 10535.8. Samples: 79605760. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:31:28,038][1653645] Updated weights for policy 0, policy_version 155426 (0.0017) [2024-06-15 13:31:30,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 318373888. Throughput: 0: 10808.9. Samples: 79678976. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:31:31,765][1653645] Updated weights for policy 0, policy_version 155457 (0.0013) [2024-06-15 13:31:33,004][1653645] Updated weights for policy 0, policy_version 155524 (0.0013) [2024-06-15 13:31:34,095][1653645] Updated weights for policy 0, policy_version 155572 (0.0013) [2024-06-15 13:31:34,613][1653645] Updated weights for policy 0, policy_version 155587 (0.0011) [2024-06-15 13:31:35,871][1653645] Updated weights for policy 0, policy_version 155647 (0.0018) [2024-06-15 13:31:35,959][1648982] Fps is (10 sec: 52422.6, 60 sec: 43690.0, 300 sec: 44431.0). Total num frames: 318767104. Throughput: 0: 10933.8. Samples: 79717376. 
Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:35,960][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 13:31:39,809][1653645] Updated weights for policy 0, policy_version 155710 (0.0012) [2024-06-15 13:31:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 318898176. Throughput: 0: 10843.1. Samples: 79782400. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:31:43,883][1651596] Signal inference workers to stop experience collection... (8000 times) [2024-06-15 13:31:43,929][1653645] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-06-15 13:31:44,122][1651596] Signal inference workers to resume experience collection... (8000 times) [2024-06-15 13:31:44,123][1653645] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-06-15 13:31:44,327][1653645] Updated weights for policy 0, policy_version 155768 (0.0011) [2024-06-15 13:31:45,958][1648982] Fps is (10 sec: 36049.1, 60 sec: 43690.7, 300 sec: 43876.0). Total num frames: 319127552. Throughput: 0: 11218.5. Samples: 79851008. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:45,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:31:45,964][1653645] Updated weights for policy 0, policy_version 155831 (0.0030) [2024-06-15 13:31:47,575][1653645] Updated weights for policy 0, policy_version 155875 (0.0013) [2024-06-15 13:31:50,540][1653645] Updated weights for policy 0, policy_version 155908 (0.0011) [2024-06-15 13:31:50,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43770.0). Total num frames: 319324160. Throughput: 0: 10968.2. Samples: 79878144. 
Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:50,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:31:55,871][1653645] Updated weights for policy 0, policy_version 156000 (0.0013) [2024-06-15 13:31:55,970][1648982] Fps is (10 sec: 35999.1, 60 sec: 44227.8, 300 sec: 43429.6). Total num frames: 319488000. Throughput: 0: 10987.8. Samples: 79947776. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:31:55,971][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:31:57,337][1653645] Updated weights for policy 0, policy_version 156051 (0.0132) [2024-06-15 13:31:58,417][1653645] Updated weights for policy 0, policy_version 156096 (0.0013) [2024-06-15 13:32:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 319815680. Throughput: 0: 10945.4. Samples: 80005120. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:32:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:32:02,473][1653645] Updated weights for policy 0, policy_version 156161 (0.0013) [2024-06-15 13:32:05,958][1648982] Fps is (10 sec: 45933.6, 60 sec: 43690.6, 300 sec: 43654.2). Total num frames: 319946752. Throughput: 0: 10979.6. Samples: 80041472. Policy #0 lag: (min: 63.0, avg: 205.5, max: 319.0) [2024-06-15 13:32:05,960][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 13:32:07,483][1653645] Updated weights for policy 0, policy_version 156226 (0.0012) [2024-06-15 13:32:08,689][1653645] Updated weights for policy 0, policy_version 156282 (0.0118) [2024-06-15 13:32:10,420][1653645] Updated weights for policy 0, policy_version 156343 (0.0111) [2024-06-15 13:32:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 320208896. Throughput: 0: 11195.7. Samples: 80109568. 
Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:10,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:32:11,927][1653645] Updated weights for policy 0, policy_version 156400 (0.0012) [2024-06-15 13:32:14,501][1653645] Updated weights for policy 0, policy_version 156423 (0.0039) [2024-06-15 13:32:15,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 320471040. Throughput: 0: 10934.0. Samples: 80171008. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:32:19,788][1653645] Updated weights for policy 0, policy_version 156481 (0.0014) [2024-06-15 13:32:20,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 320536576. Throughput: 0: 10911.6. Samples: 80208384. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:20,960][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 13:32:21,809][1653645] Updated weights for policy 0, policy_version 156560 (0.0154) [2024-06-15 13:32:24,390][1653645] Updated weights for policy 0, policy_version 156641 (0.0013) [2024-06-15 13:32:25,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 320864256. Throughput: 0: 10683.8. Samples: 80263168. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:32:27,091][1653645] Updated weights for policy 0, policy_version 156689 (0.0012) [2024-06-15 13:32:30,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 320995328. Throughput: 0: 10717.9. Samples: 80333312. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:30,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:32:31,718][1651596] Signal inference workers to stop experience collection... 
(8050 times) [2024-06-15 13:32:31,756][1653645] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-06-15 13:32:32,044][1651596] Signal inference workers to resume experience collection... (8050 times) [2024-06-15 13:32:32,045][1653645] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-06-15 13:32:32,047][1653645] Updated weights for policy 0, policy_version 156752 (0.0014) [2024-06-15 13:32:33,162][1653645] Updated weights for policy 0, policy_version 156798 (0.0014) [2024-06-15 13:32:34,911][1653645] Updated weights for policy 0, policy_version 156851 (0.0142) [2024-06-15 13:32:35,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 41506.7, 300 sec: 43986.8). Total num frames: 321257472. Throughput: 0: 10865.7. Samples: 80367104. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:35,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:32:36,644][1653645] Updated weights for policy 0, policy_version 156897 (0.0011) [2024-06-15 13:32:38,827][1653645] Updated weights for policy 0, policy_version 156946 (0.0049) [2024-06-15 13:32:39,705][1653645] Updated weights for policy 0, policy_version 156990 (0.0013) [2024-06-15 13:32:40,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 321519616. Throughput: 0: 10584.3. Samples: 80423936. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:40,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 13:32:44,970][1653645] Updated weights for policy 0, policy_version 157048 (0.0011) [2024-06-15 13:32:45,958][1648982] Fps is (10 sec: 42599.8, 60 sec: 42598.4, 300 sec: 43653.6). Total num frames: 321683456. Throughput: 0: 10945.4. Samples: 80497664. 
Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:45,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:32:46,487][1653645] Updated weights for policy 0, policy_version 157092 (0.0012) [2024-06-15 13:32:48,644][1653645] Updated weights for policy 0, policy_version 157153 (0.0030) [2024-06-15 13:32:50,171][1653645] Updated weights for policy 0, policy_version 157200 (0.0015) [2024-06-15 13:32:50,906][1653645] Updated weights for policy 0, policy_version 157244 (0.0014) [2024-06-15 13:32:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 322043904. Throughput: 0: 10854.4. Samples: 80529920. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:32:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43153.6, 300 sec: 43320.4). Total num frames: 322076672. Throughput: 0: 10945.4. Samples: 80602112. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:32:55,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:32:56,392][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000157296_322142208.pth... [2024-06-15 13:32:56,455][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000152256_311820288.pth [2024-06-15 13:32:56,855][1653645] Updated weights for policy 0, policy_version 157312 (0.0013) [2024-06-15 13:32:58,359][1653645] Updated weights for policy 0, policy_version 157369 (0.0086) [2024-06-15 13:33:00,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 322371584. Throughput: 0: 11013.7. Samples: 80666624. 
Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:33:00,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:33:01,032][1653645] Updated weights for policy 0, policy_version 157413 (0.0011) [2024-06-15 13:33:02,280][1653645] Updated weights for policy 0, policy_version 157474 (0.0012) [2024-06-15 13:33:05,962][1648982] Fps is (10 sec: 49130.1, 60 sec: 43687.3, 300 sec: 43541.9). Total num frames: 322568192. Throughput: 0: 10830.5. Samples: 80695808. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:33:05,963][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:33:08,777][1653645] Updated weights for policy 0, policy_version 157536 (0.0012) [2024-06-15 13:33:10,601][1653645] Updated weights for policy 0, policy_version 157600 (0.0014) [2024-06-15 13:33:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 322764800. Throughput: 0: 11047.8. Samples: 80760320. Policy #0 lag: (min: 15.0, avg: 92.0, max: 271.0) [2024-06-15 13:33:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:33:12,593][1653645] Updated weights for policy 0, policy_version 157636 (0.0014) [2024-06-15 13:33:14,391][1651596] Signal inference workers to stop experience collection... (8100 times) [2024-06-15 13:33:14,444][1653645] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-06-15 13:33:14,569][1651596] Signal inference workers to resume experience collection... (8100 times) [2024-06-15 13:33:14,583][1653645] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-06-15 13:33:14,588][1653645] Updated weights for policy 0, policy_version 157728 (0.0020) [2024-06-15 13:33:15,958][1648982] Fps is (10 sec: 52452.7, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 323092480. Throughput: 0: 10808.9. Samples: 80819712. 
Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:15,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:33:20,905][1653645] Updated weights for policy 0, policy_version 157777 (0.0013) [2024-06-15 13:33:20,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 323125248. Throughput: 0: 10911.4. Samples: 80858112. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:33:22,975][1653645] Updated weights for policy 0, policy_version 157858 (0.0105) [2024-06-15 13:33:25,142][1653645] Updated weights for policy 0, policy_version 157904 (0.0012) [2024-06-15 13:33:25,959][1648982] Fps is (10 sec: 36044.5, 60 sec: 43144.4, 300 sec: 43542.6). Total num frames: 323452928. Throughput: 0: 11036.4. Samples: 80920576. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:25,964][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:33:26,550][1653645] Updated weights for policy 0, policy_version 157971 (0.0114) [2024-06-15 13:33:30,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 323616768. Throughput: 0: 11002.3. Samples: 80992768. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:33:31,991][1653645] Updated weights for policy 0, policy_version 158019 (0.0124) [2024-06-15 13:33:34,082][1653645] Updated weights for policy 0, policy_version 158096 (0.0014) [2024-06-15 13:33:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 323878912. Throughput: 0: 10945.4. Samples: 81022464. 
Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:33:36,774][1653645] Updated weights for policy 0, policy_version 158150 (0.0017) [2024-06-15 13:33:37,963][1653645] Updated weights for policy 0, policy_version 158208 (0.0017) [2024-06-15 13:33:39,264][1653645] Updated weights for policy 0, policy_version 158270 (0.0015) [2024-06-15 13:33:40,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.5, 300 sec: 43320.5). Total num frames: 324141056. Throughput: 0: 10774.7. Samples: 81086976. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:33:44,410][1653645] Updated weights for policy 0, policy_version 158321 (0.0012) [2024-06-15 13:33:45,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 324370432. Throughput: 0: 10990.9. Samples: 81161216. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:33:46,040][1653645] Updated weights for policy 0, policy_version 158400 (0.0024) [2024-06-15 13:33:48,118][1653645] Updated weights for policy 0, policy_version 158448 (0.0044) [2024-06-15 13:33:49,476][1653645] Updated weights for policy 0, policy_version 158514 (0.0094) [2024-06-15 13:33:50,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 324665344. Throughput: 0: 11139.9. Samples: 81197056. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:50,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:33:55,138][1653645] Updated weights for policy 0, policy_version 158549 (0.0012) [2024-06-15 13:33:55,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44782.9, 300 sec: 43431.5). Total num frames: 324763648. Throughput: 0: 11468.8. Samples: 81276416. 
Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:33:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:33:56,783][1653645] Updated weights for policy 0, policy_version 158612 (0.0013) [2024-06-15 13:33:57,058][1651596] Signal inference workers to stop experience collection... (8150 times) [2024-06-15 13:33:57,123][1653645] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-06-15 13:33:57,276][1651596] Signal inference workers to resume experience collection... (8150 times) [2024-06-15 13:33:57,278][1653645] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-06-15 13:33:58,595][1653645] Updated weights for policy 0, policy_version 158662 (0.0092) [2024-06-15 13:33:59,715][1653645] Updated weights for policy 0, policy_version 158720 (0.0012) [2024-06-15 13:34:00,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 43875.9). Total num frames: 325124096. Throughput: 0: 11468.8. Samples: 81335808. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:34:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:34:01,212][1653645] Updated weights for policy 0, policy_version 158777 (0.0012) [2024-06-15 13:34:05,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44240.0, 300 sec: 43209.3). Total num frames: 325222400. Throughput: 0: 11434.6. Samples: 81372672. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:34:05,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:34:06,476][1653645] Updated weights for policy 0, policy_version 158820 (0.0148) [2024-06-15 13:34:08,403][1653645] Updated weights for policy 0, policy_version 158906 (0.0012) [2024-06-15 13:34:10,379][1653645] Updated weights for policy 0, policy_version 158944 (0.0012) [2024-06-15 13:34:10,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 46967.4, 300 sec: 44209.0). Total num frames: 325582848. Throughput: 0: 11480.2. Samples: 81437184. 
Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:34:10,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:34:12,044][1653645] Updated weights for policy 0, policy_version 158992 (0.0101) [2024-06-15 13:34:13,155][1653645] Updated weights for policy 0, policy_version 159040 (0.0045) [2024-06-15 13:34:15,958][1648982] Fps is (10 sec: 49153.2, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 325713920. Throughput: 0: 11491.6. Samples: 81509888. Policy #0 lag: (min: 5.0, avg: 122.9, max: 261.0) [2024-06-15 13:34:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:34:19,733][1653645] Updated weights for policy 0, policy_version 159141 (0.0027) [2024-06-15 13:34:20,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 47513.4, 300 sec: 43986.8). Total num frames: 325976064. Throughput: 0: 11639.4. Samples: 81546240. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:34:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:34:21,715][1653645] Updated weights for policy 0, policy_version 159190 (0.0014) [2024-06-15 13:34:23,848][1653645] Updated weights for policy 0, policy_version 159264 (0.0013) [2024-06-15 13:34:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 46421.4, 300 sec: 44098.0). Total num frames: 326238208. Throughput: 0: 11514.4. Samples: 81605120. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:34:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:34:28,351][1653645] Updated weights for policy 0, policy_version 159298 (0.0019) [2024-06-15 13:34:29,968][1653645] Updated weights for policy 0, policy_version 159376 (0.0012) [2024-06-15 13:34:30,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 47513.7, 300 sec: 43875.8). Total num frames: 326467584. Throughput: 0: 11525.7. Samples: 81679872. 
Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:34:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:34:32,853][1653645] Updated weights for policy 0, policy_version 159440 (0.0013) [2024-06-15 13:34:34,365][1653645] Updated weights for policy 0, policy_version 159489 (0.0018) [2024-06-15 13:34:35,816][1653645] Updated weights for policy 0, policy_version 159549 (0.0012) [2024-06-15 13:34:35,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 48059.7, 300 sec: 44544.2). Total num frames: 326762496. Throughput: 0: 11514.3. Samples: 81715200. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:34:35,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:34:40,723][1653645] Updated weights for policy 0, policy_version 159606 (0.0012) [2024-06-15 13:34:40,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45875.4, 300 sec: 43875.8). Total num frames: 326893568. Throughput: 0: 11457.4. Samples: 81792000. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:34:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:34:41,040][1651596] Signal inference workers to stop experience collection... (8200 times) [2024-06-15 13:34:41,108][1653645] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-06-15 13:34:41,289][1651596] Signal inference workers to resume experience collection... (8200 times) [2024-06-15 13:34:41,290][1653645] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-06-15 13:34:41,813][1653645] Updated weights for policy 0, policy_version 159649 (0.0014) [2024-06-15 13:34:44,229][1653645] Updated weights for policy 0, policy_version 159702 (0.0014) [2024-06-15 13:34:45,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 46421.3, 300 sec: 44320.1). Total num frames: 327155712. Throughput: 0: 11548.4. Samples: 81855488. 
Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:34:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 13:34:46,160][1653645] Updated weights for policy 0, policy_version 159747 (0.0012) [2024-06-15 13:34:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 327286784. Throughput: 0: 11377.8. Samples: 81884672. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:34:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:34:51,356][1653645] Updated weights for policy 0, policy_version 159824 (0.0014) [2024-06-15 13:34:53,133][1653645] Updated weights for policy 0, policy_version 159894 (0.0012) [2024-06-15 13:34:55,975][1648982] Fps is (10 sec: 42525.6, 60 sec: 46954.2, 300 sec: 44095.4). Total num frames: 327581696. Throughput: 0: 11441.7. Samples: 81952256. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:34:55,976][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:34:56,444][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000159968_327614464.pth... [2024-06-15 13:34:56,616][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000154736_316899328.pth [2024-06-15 13:34:57,024][1653645] Updated weights for policy 0, policy_version 159995 (0.0013) [2024-06-15 13:34:59,537][1653645] Updated weights for policy 0, policy_version 160064 (0.0014) [2024-06-15 13:35:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 327811072. Throughput: 0: 11252.6. Samples: 82016256. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:35:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:35:04,711][1653645] Updated weights for policy 0, policy_version 160144 (0.0013) [2024-06-15 13:35:05,958][1648982] Fps is (10 sec: 49236.4, 60 sec: 47513.7, 300 sec: 43986.9). Total num frames: 328073216. Throughput: 0: 11343.7. Samples: 82056704. 
Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:35:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:35:08,224][1653645] Updated weights for policy 0, policy_version 160194 (0.0014) [2024-06-15 13:35:10,060][1653645] Updated weights for policy 0, policy_version 160259 (0.0013) [2024-06-15 13:35:10,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 328269824. Throughput: 0: 11423.2. Samples: 82119168. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:35:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:35:15,327][1653645] Updated weights for policy 0, policy_version 160352 (0.0014) [2024-06-15 13:35:15,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 328433664. Throughput: 0: 11241.3. Samples: 82185728. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:35:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:35:20,240][1653645] Updated weights for policy 0, policy_version 160451 (0.0013) [2024-06-15 13:35:20,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 44783.2, 300 sec: 44209.1). Total num frames: 328663040. Throughput: 0: 11070.6. Samples: 82213376. Policy #0 lag: (min: 15.0, avg: 81.3, max: 271.0) [2024-06-15 13:35:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:35:21,419][1653645] Updated weights for policy 0, policy_version 160506 (0.0011) [2024-06-15 13:35:23,184][1653645] Updated weights for policy 0, policy_version 160546 (0.0013) [2024-06-15 13:35:25,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 328859648. Throughput: 0: 10831.7. Samples: 82279424. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:35:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:35:27,657][1651596] Signal inference workers to stop experience collection... 
(8250 times) [2024-06-15 13:35:27,744][1653645] Updated weights for policy 0, policy_version 160614 (0.0013) [2024-06-15 13:35:27,755][1653645] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-06-15 13:35:27,859][1651596] Signal inference workers to resume experience collection... (8250 times) [2024-06-15 13:35:27,859][1653645] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-06-15 13:35:29,254][1653645] Updated weights for policy 0, policy_version 160688 (0.0013) [2024-06-15 13:35:30,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 44236.6, 300 sec: 43986.9). Total num frames: 329121792. Throughput: 0: 10990.9. Samples: 82350080. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:35:30,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:35:32,209][1653645] Updated weights for policy 0, policy_version 160736 (0.0017) [2024-06-15 13:35:34,388][1653645] Updated weights for policy 0, policy_version 160769 (0.0024) [2024-06-15 13:35:35,863][1653645] Updated weights for policy 0, policy_version 160827 (0.0013) [2024-06-15 13:35:35,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 329351168. Throughput: 0: 11070.6. Samples: 82382848. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:35:35,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:35:39,533][1653645] Updated weights for policy 0, policy_version 160880 (0.0015) [2024-06-15 13:35:40,958][1648982] Fps is (10 sec: 49153.4, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 329613312. Throughput: 0: 11131.7. Samples: 82452992. 
Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:35:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:35:41,004][1653645] Updated weights for policy 0, policy_version 160950 (0.0012) [2024-06-15 13:35:43,269][1653645] Updated weights for policy 0, policy_version 160978 (0.0014) [2024-06-15 13:35:44,340][1653645] Updated weights for policy 0, policy_version 161024 (0.0012) [2024-06-15 13:35:45,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 329777152. Throughput: 0: 11309.4. Samples: 82525184. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:35:45,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:35:47,165][1653645] Updated weights for policy 0, policy_version 161086 (0.0157) [2024-06-15 13:35:50,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 329973760. Throughput: 0: 11070.6. Samples: 82554880. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:35:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:35:51,119][1653645] Updated weights for policy 0, policy_version 161136 (0.0012) [2024-06-15 13:35:52,470][1653645] Updated weights for policy 0, policy_version 161213 (0.0013) [2024-06-15 13:35:55,485][1653645] Updated weights for policy 0, policy_version 161273 (0.0013) [2024-06-15 13:35:55,958][1648982] Fps is (10 sec: 52430.8, 60 sec: 45342.1, 300 sec: 44431.2). Total num frames: 330301440. Throughput: 0: 11218.6. Samples: 82624000. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:35:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:35:58,040][1653645] Updated weights for policy 0, policy_version 161317 (0.0013) [2024-06-15 13:36:00,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 330432512. Throughput: 0: 11229.9. Samples: 82691072. 
Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:36:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:36:02,341][1653645] Updated weights for policy 0, policy_version 161363 (0.0013) [2024-06-15 13:36:03,643][1653645] Updated weights for policy 0, policy_version 161426 (0.0011) [2024-06-15 13:36:04,289][1653645] Updated weights for policy 0, policy_version 161472 (0.0013) [2024-06-15 13:36:05,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 330727424. Throughput: 0: 11423.3. Samples: 82727424. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:36:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:36:06,911][1653645] Updated weights for policy 0, policy_version 161525 (0.0012) [2024-06-15 13:36:08,631][1653645] Updated weights for policy 0, policy_version 161557 (0.0012) [2024-06-15 13:36:10,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 330956800. Throughput: 0: 11434.7. Samples: 82793984. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:36:10,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:36:13,495][1651596] Signal inference workers to stop experience collection... (8300 times) [2024-06-15 13:36:13,526][1653645] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-06-15 13:36:13,726][1651596] Signal inference workers to resume experience collection... (8300 times) [2024-06-15 13:36:13,727][1653645] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-06-15 13:36:13,845][1653645] Updated weights for policy 0, policy_version 161617 (0.0038) [2024-06-15 13:36:14,977][1653645] Updated weights for policy 0, policy_version 161680 (0.0012) [2024-06-15 13:36:15,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 45875.0, 300 sec: 44986.6). Total num frames: 331186176. Throughput: 0: 11400.6. Samples: 82863104. 
Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:36:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:36:16,022][1653645] Updated weights for policy 0, policy_version 161726 (0.0013) [2024-06-15 13:36:18,280][1653645] Updated weights for policy 0, policy_version 161790 (0.0014) [2024-06-15 13:36:20,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 331415552. Throughput: 0: 11343.7. Samples: 82893312. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:36:20,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:36:21,467][1653645] Updated weights for policy 0, policy_version 161848 (0.0022) [2024-06-15 13:36:25,834][1653645] Updated weights for policy 0, policy_version 161876 (0.0012) [2024-06-15 13:36:25,958][1648982] Fps is (10 sec: 36045.7, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 331546624. Throughput: 0: 11332.3. Samples: 82962944. Policy #0 lag: (min: 7.0, avg: 134.8, max: 263.0) [2024-06-15 13:36:25,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:36:27,594][1653645] Updated weights for policy 0, policy_version 161957 (0.0101) [2024-06-15 13:36:29,956][1653645] Updated weights for policy 0, policy_version 162045 (0.0092) [2024-06-15 13:36:30,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 45875.4, 300 sec: 44431.4). Total num frames: 331874304. Throughput: 0: 11104.8. Samples: 83024896. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:36:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:36:33,574][1653645] Updated weights for policy 0, policy_version 162106 (0.0013) [2024-06-15 13:36:35,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 332005376. Throughput: 0: 11104.7. Samples: 83054592. 
Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:36:35,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:36:38,461][1653645] Updated weights for policy 0, policy_version 162161 (0.0013) [2024-06-15 13:36:39,820][1653645] Updated weights for policy 0, policy_version 162214 (0.0010) [2024-06-15 13:36:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44236.7, 300 sec: 44542.2). Total num frames: 332267520. Throughput: 0: 11081.9. Samples: 83122688. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:36:40,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:36:41,738][1653645] Updated weights for policy 0, policy_version 162272 (0.0011) [2024-06-15 13:36:44,481][1653645] Updated weights for policy 0, policy_version 162305 (0.0028) [2024-06-15 13:36:45,786][1653645] Updated weights for policy 0, policy_version 162368 (0.0050) [2024-06-15 13:36:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.3, 300 sec: 44764.4). Total num frames: 332529664. Throughput: 0: 11070.5. Samples: 83189248. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:36:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:36:50,475][1653645] Updated weights for policy 0, policy_version 162430 (0.0022) [2024-06-15 13:36:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.7, 300 sec: 44655.2). Total num frames: 332660736. Throughput: 0: 11013.6. Samples: 83223040. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:36:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:36:52,513][1653645] Updated weights for policy 0, policy_version 162499 (0.0011) [2024-06-15 13:36:53,785][1653645] Updated weights for policy 0, policy_version 162560 (0.0012) [2024-06-15 13:36:55,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.3, 300 sec: 44431.1). Total num frames: 332922880. Throughput: 0: 10877.1. Samples: 83283456. 
Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:36:55,959][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 13:36:55,969][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000162560_332922880.pth... [2024-06-15 13:36:56,039][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000157296_322142208.pth [2024-06-15 13:36:56,440][1651596] Signal inference workers to stop experience collection... (8350 times) [2024-06-15 13:36:56,503][1653645] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-06-15 13:36:56,691][1651596] Signal inference workers to resume experience collection... (8350 times) [2024-06-15 13:36:56,692][1653645] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-06-15 13:36:57,849][1653645] Updated weights for policy 0, policy_version 162624 (0.0123) [2024-06-15 13:37:00,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 333053952. Throughput: 0: 10968.2. Samples: 83356672. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:37:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:37:02,865][1653645] Updated weights for policy 0, policy_version 162689 (0.0020) [2024-06-15 13:37:04,150][1653645] Updated weights for policy 0, policy_version 162740 (0.0017) [2024-06-15 13:37:05,786][1653645] Updated weights for policy 0, policy_version 162809 (0.0110) [2024-06-15 13:37:05,958][1648982] Fps is (10 sec: 52431.7, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 333447168. Throughput: 0: 11002.3. Samples: 83388416. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:37:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:37:09,268][1653645] Updated weights for policy 0, policy_version 162836 (0.0019) [2024-06-15 13:37:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 333578240. 
Throughput: 0: 10854.4. Samples: 83451392. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:37:10,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:37:13,266][1653645] Updated weights for policy 0, policy_version 162898 (0.0015) [2024-06-15 13:37:14,397][1653645] Updated weights for policy 0, policy_version 162939 (0.0143) [2024-06-15 13:37:15,715][1653645] Updated weights for policy 0, policy_version 162995 (0.0013) [2024-06-15 13:37:15,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.9, 300 sec: 45097.6). Total num frames: 333840384. Throughput: 0: 11070.6. Samples: 83523072. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:37:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:37:17,573][1653645] Updated weights for policy 0, policy_version 163072 (0.0013) [2024-06-15 13:37:20,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 334069760. Throughput: 0: 11047.8. Samples: 83551744. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:37:20,960][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:37:21,289][1653645] Updated weights for policy 0, policy_version 163136 (0.0015) [2024-06-15 13:37:25,699][1653645] Updated weights for policy 0, policy_version 163194 (0.0028) [2024-06-15 13:37:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 334233600. Throughput: 0: 11207.1. Samples: 83627008. Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:37:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:37:27,608][1653645] Updated weights for policy 0, policy_version 163249 (0.0015) [2024-06-15 13:37:29,186][1653645] Updated weights for policy 0, policy_version 163328 (0.0014) [2024-06-15 13:37:30,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 334495744. Throughput: 0: 11093.3. Samples: 83688448. 
Policy #0 lag: (min: 79.0, avg: 155.5, max: 335.0) [2024-06-15 13:37:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:37:32,902][1653645] Updated weights for policy 0, policy_version 163390 (0.0035) [2024-06-15 13:37:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 334626816. Throughput: 0: 11059.2. Samples: 83720704. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:37:35,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:37:37,936][1653645] Updated weights for policy 0, policy_version 163456 (0.0017) [2024-06-15 13:37:39,288][1651596] Signal inference workers to stop experience collection... (8400 times) [2024-06-15 13:37:39,331][1653645] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-06-15 13:37:39,333][1653645] Updated weights for policy 0, policy_version 163514 (0.0014) [2024-06-15 13:37:39,408][1651596] Signal inference workers to resume experience collection... (8400 times) [2024-06-15 13:37:39,410][1653645] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-06-15 13:37:40,783][1653645] Updated weights for policy 0, policy_version 163568 (0.0012) [2024-06-15 13:37:40,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 334987264. Throughput: 0: 11321.0. Samples: 83792896. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:37:40,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 13:37:43,329][1653645] Updated weights for policy 0, policy_version 163603 (0.0016) [2024-06-15 13:37:45,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 335151104. Throughput: 0: 11104.7. Samples: 83856384. 
Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:37:45,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:37:48,263][1653645] Updated weights for policy 0, policy_version 163652 (0.0013) [2024-06-15 13:37:49,945][1653645] Updated weights for policy 0, policy_version 163728 (0.0014) [2024-06-15 13:37:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 335380480. Throughput: 0: 11389.1. Samples: 83900928. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:37:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:37:51,976][1653645] Updated weights for policy 0, policy_version 163808 (0.0012) [2024-06-15 13:37:55,383][1653645] Updated weights for policy 0, policy_version 163857 (0.0013) [2024-06-15 13:37:55,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 44782.9, 300 sec: 44875.4). Total num frames: 335609856. Throughput: 0: 11172.9. Samples: 83954176. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:37:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:38:00,643][1653645] Updated weights for policy 0, policy_version 163936 (0.0013) [2024-06-15 13:38:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 44765.1). Total num frames: 335773696. Throughput: 0: 11252.6. Samples: 84029440. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:38:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:38:01,992][1653645] Updated weights for policy 0, policy_version 163987 (0.0136) [2024-06-15 13:38:03,623][1653645] Updated weights for policy 0, policy_version 164048 (0.0012) [2024-06-15 13:38:05,958][1648982] Fps is (10 sec: 45878.0, 60 sec: 43690.6, 300 sec: 45097.7). Total num frames: 336068608. Throughput: 0: 11229.9. Samples: 84057088. 
Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:38:05,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:38:07,076][1653645] Updated weights for policy 0, policy_version 164100 (0.0015) [2024-06-15 13:38:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 336199680. Throughput: 0: 10922.7. Samples: 84118528. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:38:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:38:12,888][1653645] Updated weights for policy 0, policy_version 164179 (0.0018) [2024-06-15 13:38:14,928][1653645] Updated weights for policy 0, policy_version 164256 (0.0014) [2024-06-15 13:38:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 336461824. Throughput: 0: 10922.7. Samples: 84179968. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:38:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:38:16,820][1653645] Updated weights for policy 0, policy_version 164336 (0.0016) [2024-06-15 13:38:20,847][1653645] Updated weights for policy 0, policy_version 164414 (0.0014) [2024-06-15 13:38:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 336723968. Throughput: 0: 10865.8. Samples: 84209664. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:38:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:38:25,091][1651596] Signal inference workers to stop experience collection... (8450 times) [2024-06-15 13:38:25,157][1653645] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-06-15 13:38:25,397][1651596] Signal inference workers to resume experience collection... 
(8450 times) [2024-06-15 13:38:25,398][1653645] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-06-15 13:38:25,592][1653645] Updated weights for policy 0, policy_version 164467 (0.0015) [2024-06-15 13:38:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 336855040. Throughput: 0: 10968.2. Samples: 84286464. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:38:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:38:26,996][1653645] Updated weights for policy 0, policy_version 164528 (0.0013) [2024-06-15 13:38:28,942][1653645] Updated weights for policy 0, policy_version 164608 (0.0013) [2024-06-15 13:38:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 337117184. Throughput: 0: 10865.8. Samples: 84345344. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:38:30,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:38:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 337248256. Throughput: 0: 10558.6. Samples: 84376064. Policy #0 lag: (min: 5.0, avg: 135.0, max: 261.0) [2024-06-15 13:38:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:38:36,203][1653645] Updated weights for policy 0, policy_version 164675 (0.0014) [2024-06-15 13:38:37,558][1653645] Updated weights for policy 0, policy_version 164736 (0.0012) [2024-06-15 13:38:39,792][1653645] Updated weights for policy 0, policy_version 164819 (0.0013) [2024-06-15 13:38:40,960][1648982] Fps is (10 sec: 52428.4, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 337641472. Throughput: 0: 10922.8. Samples: 84445696. 
Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:38:40,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:38:43,184][1653645] Updated weights for policy 0, policy_version 164880 (0.0013) [2024-06-15 13:38:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 337772544. Throughput: 0: 10854.4. Samples: 84517888. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:38:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:38:47,316][1653645] Updated weights for policy 0, policy_version 164945 (0.0013) [2024-06-15 13:38:49,550][1653645] Updated weights for policy 0, policy_version 165008 (0.0014) [2024-06-15 13:38:50,817][1653645] Updated weights for policy 0, policy_version 165061 (0.0012) [2024-06-15 13:38:50,968][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 338034688. Throughput: 0: 11070.5. Samples: 84555264. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:38:50,981][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:38:51,872][1653645] Updated weights for policy 0, policy_version 165112 (0.0010) [2024-06-15 13:38:54,696][1653645] Updated weights for policy 0, policy_version 165137 (0.0039) [2024-06-15 13:38:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44783.3, 300 sec: 44653.3). Total num frames: 338296832. Throughput: 0: 11218.5. Samples: 84623360. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:38:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:38:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000165184_338296832.pth... 
[2024-06-15 13:38:56,042][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000159968_327614464.pth [2024-06-15 13:38:57,942][1653645] Updated weights for policy 0, policy_version 165202 (0.0014) [2024-06-15 13:39:00,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44236.6, 300 sec: 44764.4). Total num frames: 338427904. Throughput: 0: 11514.3. Samples: 84698112. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:00,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 13:39:01,341][1653645] Updated weights for policy 0, policy_version 165265 (0.0013) [2024-06-15 13:39:02,865][1653645] Updated weights for policy 0, policy_version 165331 (0.0012) [2024-06-15 13:39:05,677][1653645] Updated weights for policy 0, policy_version 165378 (0.0013) [2024-06-15 13:39:05,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 338722816. Throughput: 0: 11446.0. Samples: 84724736. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:39:06,518][1651596] Signal inference workers to stop experience collection... (8500 times) [2024-06-15 13:39:06,604][1653645] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-06-15 13:39:06,853][1651596] Signal inference workers to resume experience collection... (8500 times) [2024-06-15 13:39:06,854][1653645] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-06-15 13:39:07,166][1653645] Updated weights for policy 0, policy_version 165437 (0.0014) [2024-06-15 13:39:09,479][1653645] Updated weights for policy 0, policy_version 165488 (0.0013) [2024-06-15 13:39:10,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.0, 300 sec: 44875.4). Total num frames: 338952192. Throughput: 0: 11480.1. Samples: 84803072. 
Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:10,963][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:39:12,610][1653645] Updated weights for policy 0, policy_version 165525 (0.0012) [2024-06-15 13:39:14,073][1653645] Updated weights for policy 0, policy_version 165587 (0.0013) [2024-06-15 13:39:15,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 339214336. Throughput: 0: 11559.8. Samples: 84865536. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:15,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:39:17,628][1653645] Updated weights for policy 0, policy_version 165648 (0.0064) [2024-06-15 13:39:18,954][1653645] Updated weights for policy 0, policy_version 165695 (0.0013) [2024-06-15 13:39:20,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.7, 300 sec: 44653.3). Total num frames: 339410944. Throughput: 0: 11684.9. Samples: 84901888. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:20,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:39:23,937][1653645] Updated weights for policy 0, policy_version 165763 (0.0014) [2024-06-15 13:39:25,861][1653645] Updated weights for policy 0, policy_version 165856 (0.0254) [2024-06-15 13:39:25,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 46967.5, 300 sec: 44764.4). Total num frames: 339673088. Throughput: 0: 11559.8. Samples: 84965888. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:39:29,795][1653645] Updated weights for policy 0, policy_version 165904 (0.0014) [2024-06-15 13:39:30,958][1648982] Fps is (10 sec: 42600.2, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 339836928. Throughput: 0: 11309.5. Samples: 85026816. 
Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:30,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:39:31,108][1653645] Updated weights for policy 0, policy_version 165952 (0.0011) [2024-06-15 13:39:33,597][1653645] Updated weights for policy 0, policy_version 166015 (0.0013) [2024-06-15 13:39:35,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 340000768. Throughput: 0: 11264.0. Samples: 85062144. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 13:39:36,767][1653645] Updated weights for policy 0, policy_version 166077 (0.0012) [2024-06-15 13:39:38,099][1653645] Updated weights for policy 0, policy_version 166139 (0.0013) [2024-06-15 13:39:40,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 340262912. Throughput: 0: 11298.2. Samples: 85131776. Policy #0 lag: (min: 7.0, avg: 79.7, max: 263.0) [2024-06-15 13:39:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:39:42,687][1653645] Updated weights for policy 0, policy_version 166208 (0.0013) [2024-06-15 13:39:45,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 45875.0, 300 sec: 44875.5). Total num frames: 340525056. Throughput: 0: 11127.4. Samples: 85198848. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:39:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:39:47,273][1653645] Updated weights for policy 0, policy_version 166280 (0.0014) [2024-06-15 13:39:48,729][1653645] Updated weights for policy 0, policy_version 166340 (0.0013) [2024-06-15 13:39:49,997][1653645] Updated weights for policy 0, policy_version 166400 (0.0110) [2024-06-15 13:39:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 44767.0). Total num frames: 340787200. Throughput: 0: 11366.4. Samples: 85236224. 
Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:39:50,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 13:39:53,283][1651596] Signal inference workers to stop experience collection... (8550 times) [2024-06-15 13:39:53,332][1653645] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-06-15 13:39:53,582][1651596] Signal inference workers to resume experience collection... (8550 times) [2024-06-15 13:39:53,583][1653645] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-06-15 13:39:53,711][1653645] Updated weights for policy 0, policy_version 166452 (0.0014) [2024-06-15 13:39:55,277][1653645] Updated weights for policy 0, policy_version 166499 (0.0012) [2024-06-15 13:39:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.0, 300 sec: 44875.4). Total num frames: 341049344. Throughput: 0: 11207.1. Samples: 85307392. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:39:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:39:59,327][1653645] Updated weights for policy 0, policy_version 166576 (0.0014) [2024-06-15 13:40:00,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 46967.6, 300 sec: 44653.3). Total num frames: 341245952. Throughput: 0: 11355.0. Samples: 85376512. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:40:01,454][1653645] Updated weights for policy 0, policy_version 166644 (0.0014) [2024-06-15 13:40:05,243][1653645] Updated weights for policy 0, policy_version 166708 (0.0013) [2024-06-15 13:40:05,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 45329.0, 300 sec: 44653.4). Total num frames: 341442560. Throughput: 0: 11286.8. Samples: 85409792. 
Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:05,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:40:07,013][1653645] Updated weights for policy 0, policy_version 166768 (0.0013) [2024-06-15 13:40:10,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 341639168. Throughput: 0: 11229.9. Samples: 85471232. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:40:11,273][1653645] Updated weights for policy 0, policy_version 166840 (0.0013) [2024-06-15 13:40:12,976][1653645] Updated weights for policy 0, policy_version 166880 (0.0036) [2024-06-15 13:40:15,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 43690.4, 300 sec: 44653.3). Total num frames: 341835776. Throughput: 0: 11446.0. Samples: 85541888. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:40:16,693][1653645] Updated weights for policy 0, policy_version 166931 (0.0013) [2024-06-15 13:40:18,524][1653645] Updated weights for policy 0, policy_version 166994 (0.0142) [2024-06-15 13:40:20,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 342097920. Throughput: 0: 11218.5. Samples: 85566976. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:40:23,573][1653645] Updated weights for policy 0, policy_version 167074 (0.0012) [2024-06-15 13:40:24,074][1653645] Updated weights for policy 0, policy_version 167102 (0.0011) [2024-06-15 13:40:25,959][1648982] Fps is (10 sec: 45876.1, 60 sec: 43690.6, 300 sec: 44653.4). Total num frames: 342294528. Throughput: 0: 11150.2. Samples: 85633536. 
Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:25,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:40:26,162][1653645] Updated weights for policy 0, policy_version 167152 (0.0025) [2024-06-15 13:40:29,146][1653645] Updated weights for policy 0, policy_version 167186 (0.0013) [2024-06-15 13:40:30,479][1653645] Updated weights for policy 0, policy_version 167253 (0.0038) [2024-06-15 13:40:30,962][1648982] Fps is (10 sec: 45855.1, 60 sec: 45325.6, 300 sec: 44763.7). Total num frames: 342556672. Throughput: 0: 11126.4. Samples: 85699584. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:30,963][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:40:34,368][1653645] Updated weights for policy 0, policy_version 167313 (0.0012) [2024-06-15 13:40:35,155][1653645] Updated weights for policy 0, policy_version 167358 (0.0011) [2024-06-15 13:40:35,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45875.1, 300 sec: 44542.3). Total num frames: 342753280. Throughput: 0: 11161.6. Samples: 85738496. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:40:37,398][1653645] Updated weights for policy 0, policy_version 167395 (0.0011) [2024-06-15 13:40:40,243][1653645] Updated weights for policy 0, policy_version 167441 (0.0013) [2024-06-15 13:40:40,550][1651596] Signal inference workers to stop experience collection... (8600 times) [2024-06-15 13:40:40,593][1653645] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-06-15 13:40:40,835][1651596] Signal inference workers to resume experience collection... (8600 times) [2024-06-15 13:40:40,836][1653645] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-06-15 13:40:40,958][1648982] Fps is (10 sec: 42617.3, 60 sec: 45329.0, 300 sec: 44764.5). Total num frames: 342982656. Throughput: 0: 11184.4. Samples: 85810688. 
Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:40:42,187][1653645] Updated weights for policy 0, policy_version 167537 (0.0234) [2024-06-15 13:40:45,674][1653645] Updated weights for policy 0, policy_version 167584 (0.0013) [2024-06-15 13:40:45,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 44783.2, 300 sec: 44875.5). Total num frames: 343212032. Throughput: 0: 11161.6. Samples: 85878784. Policy #0 lag: (min: 14.0, avg: 121.6, max: 270.0) [2024-06-15 13:40:45,958][1648982] Avg episode reward: [(0, '37.540')] [2024-06-15 13:40:48,071][1653645] Updated weights for policy 0, policy_version 167635 (0.0012) [2024-06-15 13:40:50,840][1653645] Updated weights for policy 0, policy_version 167682 (0.0011) [2024-06-15 13:40:50,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 343408640. Throughput: 0: 11195.7. Samples: 85913600. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:40:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:40:52,694][1653645] Updated weights for policy 0, policy_version 167762 (0.0012) [2024-06-15 13:40:55,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 343670784. Throughput: 0: 11264.0. Samples: 85978112. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:40:55,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 13:40:55,979][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000167808_343670784.pth... 
[2024-06-15 13:40:56,101][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000162560_332922880.pth [2024-06-15 13:40:56,625][1653645] Updated weights for policy 0, policy_version 167827 (0.0011) [2024-06-15 13:40:57,435][1653645] Updated weights for policy 0, policy_version 167865 (0.0011) [2024-06-15 13:40:59,829][1653645] Updated weights for policy 0, policy_version 167904 (0.0012) [2024-06-15 13:41:00,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 343932928. Throughput: 0: 11343.7. Samples: 86052352. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:00,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:41:01,944][1653645] Updated weights for policy 0, policy_version 167940 (0.0014) [2024-06-15 13:41:03,792][1653645] Updated weights for policy 0, policy_version 168002 (0.0013) [2024-06-15 13:41:04,917][1653645] Updated weights for policy 0, policy_version 168062 (0.0013) [2024-06-15 13:41:05,957][1648982] Fps is (10 sec: 52430.0, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 344195072. Throughput: 0: 11434.7. Samples: 86081536. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:41:10,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 344326144. Throughput: 0: 11468.8. Samples: 86149632. 
Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:10,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 13:41:11,554][1653645] Updated weights for policy 0, policy_version 168131 (0.0015) [2024-06-15 13:41:12,989][1653645] Updated weights for policy 0, policy_version 168190 (0.0012) [2024-06-15 13:41:14,364][1653645] Updated weights for policy 0, policy_version 168240 (0.0011) [2024-06-15 13:41:15,625][1653645] Updated weights for policy 0, policy_version 168288 (0.0029) [2024-06-15 13:41:15,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 46967.6, 300 sec: 44875.5). Total num frames: 344653824. Throughput: 0: 11526.8. Samples: 86218240. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:41:19,430][1653645] Updated weights for policy 0, policy_version 168368 (0.0013) [2024-06-15 13:41:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.3, 300 sec: 45097.6). Total num frames: 344850432. Throughput: 0: 11514.3. Samples: 86256640. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:41:24,104][1653645] Updated weights for policy 0, policy_version 168432 (0.0013) [2024-06-15 13:41:25,036][1651596] Signal inference workers to stop experience collection... (8650 times) [2024-06-15 13:41:25,072][1653645] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-06-15 13:41:25,321][1651596] Signal inference workers to resume experience collection... (8650 times) [2024-06-15 13:41:25,321][1653645] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-06-15 13:41:25,932][1653645] Updated weights for policy 0, policy_version 168498 (0.0012) [2024-06-15 13:41:25,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 345079808. Throughput: 0: 11593.9. Samples: 86332416. 
Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:41:30,107][1653645] Updated weights for policy 0, policy_version 168596 (0.0014) [2024-06-15 13:41:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 46971.0, 300 sec: 45319.8). Total num frames: 345374720. Throughput: 0: 11434.7. Samples: 86393344. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:41:34,704][1653645] Updated weights for policy 0, policy_version 168656 (0.0013) [2024-06-15 13:41:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 345505792. Throughput: 0: 11650.9. Samples: 86437888. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:41:37,075][1653645] Updated weights for policy 0, policy_version 168761 (0.0016) [2024-06-15 13:41:38,335][1653645] Updated weights for policy 0, policy_version 168816 (0.0012) [2024-06-15 13:41:40,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 46421.2, 300 sec: 44875.5). Total num frames: 345767936. Throughput: 0: 11514.3. Samples: 86496256. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:41:41,806][1653645] Updated weights for policy 0, policy_version 168867 (0.0013) [2024-06-15 13:41:45,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 345899008. Throughput: 0: 11685.0. Samples: 86578176. 
Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:41:46,397][1653645] Updated weights for policy 0, policy_version 168912 (0.0020) [2024-06-15 13:41:48,219][1653645] Updated weights for policy 0, policy_version 168977 (0.0032) [2024-06-15 13:41:50,417][1653645] Updated weights for policy 0, policy_version 169072 (0.0055) [2024-06-15 13:41:50,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 48059.9, 300 sec: 45319.9). Total num frames: 346292224. Throughput: 0: 11628.1. Samples: 86604800. Policy #0 lag: (min: 55.0, avg: 165.3, max: 311.0) [2024-06-15 13:41:50,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:41:53,450][1653645] Updated weights for policy 0, policy_version 169124 (0.0013) [2024-06-15 13:41:55,966][1648982] Fps is (10 sec: 52383.8, 60 sec: 45868.6, 300 sec: 45318.5). Total num frames: 346423296. Throughput: 0: 11443.8. Samples: 86664704. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:41:55,967][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:41:59,124][1653645] Updated weights for policy 0, policy_version 169184 (0.0013) [2024-06-15 13:42:00,793][1653645] Updated weights for policy 0, policy_version 169264 (0.0013) [2024-06-15 13:42:00,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 346652672. Throughput: 0: 11571.2. Samples: 86738944. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:42:04,222][1653645] Updated weights for policy 0, policy_version 169347 (0.0014) [2024-06-15 13:42:04,921][1651596] Signal inference workers to stop experience collection... (8700 times) [2024-06-15 13:42:04,952][1653645] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-06-15 13:42:05,175][1651596] Signal inference workers to resume experience collection... 
(8700 times) [2024-06-15 13:42:05,176][1653645] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-06-15 13:42:05,587][1653645] Updated weights for policy 0, policy_version 169408 (0.0014) [2024-06-15 13:42:05,958][1648982] Fps is (10 sec: 52475.0, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 346947584. Throughput: 0: 11298.2. Samples: 86765056. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:42:10,981][1648982] Fps is (10 sec: 35962.2, 60 sec: 44765.8, 300 sec: 44649.9). Total num frames: 347013120. Throughput: 0: 11394.8. Samples: 86845440. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:10,981][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:42:12,280][1653645] Updated weights for policy 0, policy_version 169505 (0.0085) [2024-06-15 13:42:13,332][1653645] Updated weights for policy 0, policy_version 169553 (0.0020) [2024-06-15 13:42:15,729][1653645] Updated weights for policy 0, policy_version 169602 (0.0014) [2024-06-15 13:42:15,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 45329.0, 300 sec: 45097.6). Total num frames: 347373568. Throughput: 0: 11298.1. Samples: 86901760. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:42:17,256][1653645] Updated weights for policy 0, policy_version 169662 (0.0012) [2024-06-15 13:42:20,958][1648982] Fps is (10 sec: 45980.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 347471872. Throughput: 0: 11059.2. Samples: 86935552. 
Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:42:23,950][1653645] Updated weights for policy 0, policy_version 169746 (0.0013) [2024-06-15 13:42:25,240][1653645] Updated weights for policy 0, policy_version 169810 (0.0105) [2024-06-15 13:42:25,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 347832320. Throughput: 0: 11264.0. Samples: 87003136. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:42:28,291][1653645] Updated weights for policy 0, policy_version 169875 (0.0014) [2024-06-15 13:42:29,203][1653645] Updated weights for policy 0, policy_version 169915 (0.0023) [2024-06-15 13:42:30,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 347996160. Throughput: 0: 10911.3. Samples: 87069184. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:42:34,249][1653645] Updated weights for policy 0, policy_version 169968 (0.0013) [2024-06-15 13:42:35,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45328.9, 300 sec: 44875.5). Total num frames: 348225536. Throughput: 0: 11229.8. Samples: 87110144. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:42:35,965][1653645] Updated weights for policy 0, policy_version 170033 (0.0015) [2024-06-15 13:42:36,996][1653645] Updated weights for policy 0, policy_version 170083 (0.0012) [2024-06-15 13:42:38,454][1653645] Updated weights for policy 0, policy_version 170113 (0.0013) [2024-06-15 13:42:39,731][1653645] Updated weights for policy 0, policy_version 170160 (0.0032) [2024-06-15 13:42:40,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 348520448. 
Throughput: 0: 11266.2. Samples: 87171584. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:40,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:42:45,260][1653645] Updated weights for policy 0, policy_version 170213 (0.0013) [2024-06-15 13:42:45,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 348651520. Throughput: 0: 11434.6. Samples: 87253504. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:42:46,877][1653645] Updated weights for policy 0, policy_version 170288 (0.0014) [2024-06-15 13:42:47,375][1651596] Signal inference workers to stop experience collection... (8750 times) [2024-06-15 13:42:47,406][1653645] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-06-15 13:42:47,668][1651596] Signal inference workers to resume experience collection... (8750 times) [2024-06-15 13:42:47,670][1653645] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-06-15 13:42:48,340][1653645] Updated weights for policy 0, policy_version 170352 (0.0032) [2024-06-15 13:42:50,489][1653645] Updated weights for policy 0, policy_version 170426 (0.0015) [2024-06-15 13:42:50,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 45542.1). Total num frames: 349044736. Throughput: 0: 11468.8. Samples: 87281152. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:42:55,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43696.9, 300 sec: 44986.6). Total num frames: 349044736. Throughput: 0: 11372.2. Samples: 87356928. Policy #0 lag: (min: 15.0, avg: 158.1, max: 271.0) [2024-06-15 13:42:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:42:56,512][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000170464_349110272.pth... 
[2024-06-15 13:42:56,695][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000165184_338296832.pth [2024-06-15 13:42:57,238][1653645] Updated weights for policy 0, policy_version 170492 (0.0014) [2024-06-15 13:42:59,160][1653645] Updated weights for policy 0, policy_version 170569 (0.0012) [2024-06-15 13:43:00,311][1653645] Updated weights for policy 0, policy_version 170619 (0.0013) [2024-06-15 13:43:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 349437952. Throughput: 0: 11355.0. Samples: 87412736. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:43:02,068][1653645] Updated weights for policy 0, policy_version 170672 (0.0012) [2024-06-15 13:43:05,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.4, 300 sec: 45319.8). Total num frames: 349569024. Throughput: 0: 11343.6. Samples: 87446016. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:05,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 13:43:08,581][1653645] Updated weights for policy 0, policy_version 170736 (0.0022) [2024-06-15 13:43:10,559][1653645] Updated weights for policy 0, policy_version 170808 (0.0014) [2024-06-15 13:43:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 47531.8, 300 sec: 45430.9). Total num frames: 349863936. Throughput: 0: 11537.1. Samples: 87522304. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:10,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 13:43:11,830][1653645] Updated weights for policy 0, policy_version 170872 (0.0015) [2024-06-15 13:43:13,559][1653645] Updated weights for policy 0, policy_version 170928 (0.0011) [2024-06-15 13:43:15,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 350093312. Throughput: 0: 11502.9. Samples: 87586816. 
Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:15,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:43:19,215][1653645] Updated weights for policy 0, policy_version 170960 (0.0012) [2024-06-15 13:43:20,460][1653645] Updated weights for policy 0, policy_version 171014 (0.0011) [2024-06-15 13:43:20,970][1648982] Fps is (10 sec: 42544.6, 60 sec: 46957.6, 300 sec: 45540.0). Total num frames: 350289920. Throughput: 0: 11568.0. Samples: 87630848. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:20,971][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:43:21,692][1653645] Updated weights for policy 0, policy_version 171072 (0.0035) [2024-06-15 13:43:23,141][1653645] Updated weights for policy 0, policy_version 171130 (0.0014) [2024-06-15 13:43:24,670][1653645] Updated weights for policy 0, policy_version 171173 (0.0012) [2024-06-15 13:43:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 46421.3, 300 sec: 45764.1). Total num frames: 350617600. Throughput: 0: 11582.5. Samples: 87692800. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:25,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 13:43:30,958][1648982] Fps is (10 sec: 36090.3, 60 sec: 44236.8, 300 sec: 45430.9). Total num frames: 350650368. Throughput: 0: 11411.9. Samples: 87767040. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:43:31,159][1653645] Updated weights for policy 0, policy_version 171232 (0.0013) [2024-06-15 13:43:31,660][1651596] Signal inference workers to stop experience collection... (8800 times) [2024-06-15 13:43:31,738][1653645] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-06-15 13:43:31,844][1651596] Signal inference workers to resume experience collection... 
(8800 times) [2024-06-15 13:43:31,844][1653645] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-06-15 13:43:32,963][1653645] Updated weights for policy 0, policy_version 171312 (0.0015) [2024-06-15 13:43:34,717][1653645] Updated weights for policy 0, policy_version 171381 (0.0014) [2024-06-15 13:43:35,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 45430.9). Total num frames: 351043584. Throughput: 0: 11377.7. Samples: 87793152. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:35,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:43:35,959][1653645] Updated weights for policy 0, policy_version 171409 (0.0022) [2024-06-15 13:43:36,955][1653645] Updated weights for policy 0, policy_version 171456 (0.0012) [2024-06-15 13:43:40,958][1648982] Fps is (10 sec: 49150.8, 60 sec: 43690.5, 300 sec: 45319.8). Total num frames: 351141888. Throughput: 0: 11184.3. Samples: 87860224. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:43:43,220][1653645] Updated weights for policy 0, policy_version 171515 (0.0013) [2024-06-15 13:43:45,415][1653645] Updated weights for policy 0, policy_version 171584 (0.0061) [2024-06-15 13:43:45,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 46421.4, 300 sec: 45430.9). Total num frames: 351436800. Throughput: 0: 11411.9. Samples: 87926272. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:43:46,765][1653645] Updated weights for policy 0, policy_version 171637 (0.0014) [2024-06-15 13:43:48,369][1653645] Updated weights for policy 0, policy_version 171696 (0.0015) [2024-06-15 13:43:50,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 351666176. Throughput: 0: 11184.4. Samples: 87949312. 
Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:43:55,260][1653645] Updated weights for policy 0, policy_version 171744 (0.0092) [2024-06-15 13:43:55,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 351764480. Throughput: 0: 11218.4. Samples: 88027136. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:43:55,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:43:57,486][1653645] Updated weights for policy 0, policy_version 171824 (0.0013) [2024-06-15 13:43:59,525][1653645] Updated weights for policy 0, policy_version 171892 (0.0012) [2024-06-15 13:44:00,319][1653645] Updated weights for policy 0, policy_version 171920 (0.0011) [2024-06-15 13:44:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45329.0, 300 sec: 45542.0). Total num frames: 352157696. Throughput: 0: 10854.4. Samples: 88075264. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:44:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:44:05,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 352190464. Throughput: 0: 10698.1. Samples: 88112128. Policy #0 lag: (min: 47.0, avg: 125.0, max: 271.0) [2024-06-15 13:44:05,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:44:07,658][1653645] Updated weights for policy 0, policy_version 171985 (0.0039) [2024-06-15 13:44:09,345][1653645] Updated weights for policy 0, policy_version 172048 (0.0130) [2024-06-15 13:44:10,908][1653645] Updated weights for policy 0, policy_version 172112 (0.0013) [2024-06-15 13:44:10,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 352485376. Throughput: 0: 10831.7. Samples: 88180224. 
Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:44:11,952][1653645] Updated weights for policy 0, policy_version 172157 (0.0025) [2024-06-15 13:44:13,309][1651596] Signal inference workers to stop experience collection... (8850 times) [2024-06-15 13:44:13,411][1653645] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-06-15 13:44:13,413][1653645] Updated weights for policy 0, policy_version 172199 (0.0012) [2024-06-15 13:44:13,599][1651596] Signal inference workers to resume experience collection... (8850 times) [2024-06-15 13:44:13,600][1653645] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-06-15 13:44:15,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.8, 300 sec: 45097.7). Total num frames: 352714752. Throughput: 0: 10570.0. Samples: 88242688. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:15,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:44:19,614][1653645] Updated weights for policy 0, policy_version 172241 (0.0014) [2024-06-15 13:44:20,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42607.3, 300 sec: 44653.3). Total num frames: 352845824. Throughput: 0: 10831.7. Samples: 88280576. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:44:20,975][1653645] Updated weights for policy 0, policy_version 172290 (0.0014) [2024-06-15 13:44:23,294][1653645] Updated weights for policy 0, policy_version 172387 (0.0014) [2024-06-15 13:44:24,304][1653645] Updated weights for policy 0, policy_version 172421 (0.0036) [2024-06-15 13:44:25,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 353239040. Throughput: 0: 10672.4. Samples: 88340480. 
Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:44:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 44986.6). Total num frames: 353271808. Throughput: 0: 10717.9. Samples: 88408576. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:44:31,580][1653645] Updated weights for policy 0, policy_version 172519 (0.0013) [2024-06-15 13:44:34,222][1653645] Updated weights for policy 0, policy_version 172576 (0.0094) [2024-06-15 13:44:35,691][1653645] Updated weights for policy 0, policy_version 172628 (0.0013) [2024-06-15 13:44:35,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 42052.3, 300 sec: 45097.6). Total num frames: 353566720. Throughput: 0: 10922.6. Samples: 88440832. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:44:37,398][1653645] Updated weights for policy 0, policy_version 172704 (0.0013) [2024-06-15 13:44:40,958][1648982] Fps is (10 sec: 49150.1, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 353763328. Throughput: 0: 10467.5. Samples: 88498176. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:40,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:44:42,474][1653645] Updated weights for policy 0, policy_version 172738 (0.0020) [2024-06-15 13:44:45,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 40960.0, 300 sec: 44431.2). Total num frames: 353894400. Throughput: 0: 11013.7. Samples: 88570880. 
Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:44:46,120][1653645] Updated weights for policy 0, policy_version 172816 (0.0014) [2024-06-15 13:44:47,848][1653645] Updated weights for policy 0, policy_version 172881 (0.0012) [2024-06-15 13:44:50,026][1653645] Updated weights for policy 0, policy_version 172987 (0.0012) [2024-06-15 13:44:50,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.4, 300 sec: 44875.5). Total num frames: 354287616. Throughput: 0: 10763.3. Samples: 88596480. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:50,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:44:55,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 354353152. Throughput: 0: 10763.3. Samples: 88664576. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:44:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:44:56,229][1653645] Updated weights for policy 0, policy_version 173046 (0.0014) [2024-06-15 13:44:56,327][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000173056_354418688.pth... [2024-06-15 13:44:56,429][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000167808_343670784.pth [2024-06-15 13:44:59,232][1651596] Signal inference workers to stop experience collection... (8900 times) [2024-06-15 13:44:59,251][1653645] Updated weights for policy 0, policy_version 173122 (0.0096) [2024-06-15 13:44:59,308][1653645] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-06-15 13:44:59,423][1651596] Signal inference workers to resume experience collection... 
(8900 times) [2024-06-15 13:44:59,425][1653645] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-06-15 13:45:00,525][1653645] Updated weights for policy 0, policy_version 173186 (0.0013) [2024-06-15 13:45:00,958][1648982] Fps is (10 sec: 42599.7, 60 sec: 42598.4, 300 sec: 44986.6). Total num frames: 354713600. Throughput: 0: 10797.5. Samples: 88728576. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:45:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:45:01,860][1653645] Updated weights for policy 0, policy_version 173248 (0.0011) [2024-06-15 13:45:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 354811904. Throughput: 0: 10649.6. Samples: 88759808. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:45:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:45:08,082][1653645] Updated weights for policy 0, policy_version 173306 (0.0027) [2024-06-15 13:45:10,556][1653645] Updated weights for policy 0, policy_version 173360 (0.0114) [2024-06-15 13:45:10,958][1648982] Fps is (10 sec: 36042.3, 60 sec: 43144.0, 300 sec: 44875.4). Total num frames: 355074048. Throughput: 0: 11081.8. Samples: 88839168. Policy #0 lag: (min: 15.0, avg: 68.9, max: 271.0) [2024-06-15 13:45:10,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:45:12,504][1653645] Updated weights for policy 0, policy_version 173446 (0.0133) [2024-06-15 13:45:13,836][1653645] Updated weights for policy 0, policy_version 173504 (0.0020) [2024-06-15 13:45:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 355336192. Throughput: 0: 10820.3. Samples: 88895488. 
Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:45:19,896][1653645] Updated weights for policy 0, policy_version 173563 (0.0018) [2024-06-15 13:45:20,958][1648982] Fps is (10 sec: 39324.2, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 355467264. Throughput: 0: 11013.7. Samples: 88936448. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:45:22,325][1653645] Updated weights for policy 0, policy_version 173618 (0.0015) [2024-06-15 13:45:23,997][1653645] Updated weights for policy 0, policy_version 173696 (0.0014) [2024-06-15 13:45:25,298][1653645] Updated weights for policy 0, policy_version 173756 (0.0013) [2024-06-15 13:45:25,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 45098.3). Total num frames: 355860480. Throughput: 0: 11047.9. Samples: 88995328. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:45:30,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 355926016. Throughput: 0: 11127.4. Samples: 89071616. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:30,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:45:31,239][1653645] Updated weights for policy 0, policy_version 173824 (0.0015) [2024-06-15 13:45:34,819][1653645] Updated weights for policy 0, policy_version 173905 (0.0021) [2024-06-15 13:45:35,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 356253696. Throughput: 0: 11298.2. Samples: 89104896. 
Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:45:36,822][1653645] Updated weights for policy 0, policy_version 173986 (0.0018) [2024-06-15 13:45:40,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 356384768. Throughput: 0: 11081.9. Samples: 89163264. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:45:43,027][1653645] Updated weights for policy 0, policy_version 174032 (0.0017) [2024-06-15 13:45:43,234][1651596] Signal inference workers to stop experience collection... (8950 times) [2024-06-15 13:45:43,277][1653645] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-06-15 13:45:43,438][1651596] Signal inference workers to resume experience collection... (8950 times) [2024-06-15 13:45:43,439][1653645] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-06-15 13:45:45,792][1653645] Updated weights for policy 0, policy_version 174096 (0.0013) [2024-06-15 13:45:45,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 356548608. Throughput: 0: 11184.4. Samples: 89231872. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:45:48,066][1653645] Updated weights for policy 0, policy_version 174177 (0.0077) [2024-06-15 13:45:50,270][1653645] Updated weights for policy 0, policy_version 174262 (0.0013) [2024-06-15 13:45:50,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 356909056. Throughput: 0: 10956.8. Samples: 89252864. 
Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:45:55,798][1653645] Updated weights for policy 0, policy_version 174305 (0.0020) [2024-06-15 13:45:55,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 356974592. Throughput: 0: 10604.2. Samples: 89316352. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:45:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:45:58,417][1653645] Updated weights for policy 0, policy_version 174340 (0.0032) [2024-06-15 13:46:00,175][1653645] Updated weights for policy 0, policy_version 174402 (0.0074) [2024-06-15 13:46:00,958][1648982] Fps is (10 sec: 32768.6, 60 sec: 42052.4, 300 sec: 44209.0). Total num frames: 357236736. Throughput: 0: 10797.6. Samples: 89381376. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:46:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:46:01,612][1653645] Updated weights for policy 0, policy_version 174468 (0.0012) [2024-06-15 13:46:05,961][1648982] Fps is (10 sec: 45861.4, 60 sec: 43688.5, 300 sec: 44430.7). Total num frames: 357433344. Throughput: 0: 10569.2. Samples: 89412096. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:46:05,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:46:07,378][1653645] Updated weights for policy 0, policy_version 174560 (0.0015) [2024-06-15 13:46:10,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 42052.7, 300 sec: 43875.8). Total num frames: 357597184. Throughput: 0: 10831.7. Samples: 89482752. 
Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:46:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:46:11,167][1653645] Updated weights for policy 0, policy_version 174624 (0.0012) [2024-06-15 13:46:12,824][1653645] Updated weights for policy 0, policy_version 174691 (0.0012) [2024-06-15 13:46:14,489][1653645] Updated weights for policy 0, policy_version 174780 (0.0123) [2024-06-15 13:46:15,958][1648982] Fps is (10 sec: 52443.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 357957632. Throughput: 0: 10535.8. Samples: 89545728. Policy #0 lag: (min: 127.0, avg: 222.8, max: 383.0) [2024-06-15 13:46:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:46:20,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 358088704. Throughput: 0: 10581.3. Samples: 89581056. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:46:20,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:46:22,470][1653645] Updated weights for policy 0, policy_version 174864 (0.0014) [2024-06-15 13:46:24,422][1651596] Signal inference workers to stop experience collection... (9000 times) [2024-06-15 13:46:24,442][1653645] Updated weights for policy 0, policy_version 174945 (0.0085) [2024-06-15 13:46:24,475][1653645] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-06-15 13:46:24,619][1651596] Signal inference workers to resume experience collection... (9000 times) [2024-06-15 13:46:24,620][1653645] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-06-15 13:46:25,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 358416384. Throughput: 0: 10786.2. Samples: 89648640. 
Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:46:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:46:26,259][1653645] Updated weights for policy 0, policy_version 175037 (0.0015) [2024-06-15 13:46:30,902][1653645] Updated weights for policy 0, policy_version 175096 (0.0012) [2024-06-15 13:46:30,957][1648982] Fps is (10 sec: 49153.6, 60 sec: 44237.0, 300 sec: 44320.1). Total num frames: 358580224. Throughput: 0: 10729.3. Samples: 89714688. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:46:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:46:35,071][1653645] Updated weights for policy 0, policy_version 175153 (0.0011) [2024-06-15 13:46:35,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 42052.1, 300 sec: 44097.9). Total num frames: 358776832. Throughput: 0: 11138.8. Samples: 89754112. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:46:35,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 13:46:36,431][1653645] Updated weights for policy 0, policy_version 175216 (0.0115) [2024-06-15 13:46:38,499][1653645] Updated weights for policy 0, policy_version 175292 (0.0035) [2024-06-15 13:46:40,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 359006208. Throughput: 0: 10991.0. Samples: 89810944. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:46:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:46:42,524][1653645] Updated weights for policy 0, policy_version 175349 (0.0112) [2024-06-15 13:46:45,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.3, 300 sec: 43653.6). Total num frames: 359170048. Throughput: 0: 11218.3. Samples: 89886208. 
Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:46:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:46:46,400][1653645] Updated weights for policy 0, policy_version 175395 (0.0013) [2024-06-15 13:46:48,629][1653645] Updated weights for policy 0, policy_version 175474 (0.0035) [2024-06-15 13:46:50,069][1653645] Updated weights for policy 0, policy_version 175536 (0.0013) [2024-06-15 13:46:50,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 43690.5, 300 sec: 44432.5). Total num frames: 359530496. Throughput: 0: 11116.8. Samples: 89912320. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:46:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:46:54,124][1653645] Updated weights for policy 0, policy_version 175607 (0.0041) [2024-06-15 13:46:55,958][1648982] Fps is (10 sec: 49154.0, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 359661568. Throughput: 0: 10979.5. Samples: 89976832. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:46:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:46:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000175616_359661568.pth... [2024-06-15 13:46:56,019][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000170464_349110272.pth [2024-06-15 13:46:57,976][1653645] Updated weights for policy 0, policy_version 175664 (0.0015) [2024-06-15 13:47:00,988][1648982] Fps is (10 sec: 35938.3, 60 sec: 44214.6, 300 sec: 43871.3). Total num frames: 359890944. Throughput: 0: 11177.0. Samples: 90049024. 
Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:47:00,988][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 13:47:01,282][1653645] Updated weights for policy 0, policy_version 175747 (0.0077) [2024-06-15 13:47:02,589][1653645] Updated weights for policy 0, policy_version 175803 (0.0012) [2024-06-15 13:47:05,721][1653645] Updated weights for policy 0, policy_version 175868 (0.0013) [2024-06-15 13:47:05,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 45877.3, 300 sec: 44656.8). Total num frames: 360185856. Throughput: 0: 11070.5. Samples: 90079232. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:47:05,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:47:09,170][1651596] Signal inference workers to stop experience collection... (9050 times) [2024-06-15 13:47:09,240][1653645] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-06-15 13:47:09,401][1651596] Signal inference workers to resume experience collection... (9050 times) [2024-06-15 13:47:09,402][1653645] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-06-15 13:47:09,551][1653645] Updated weights for policy 0, policy_version 175927 (0.0013) [2024-06-15 13:47:10,958][1648982] Fps is (10 sec: 42726.0, 60 sec: 45329.0, 300 sec: 43875.8). Total num frames: 360316928. Throughput: 0: 11070.6. Samples: 90146816. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:47:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:47:12,540][1653645] Updated weights for policy 0, policy_version 175984 (0.0012) [2024-06-15 13:47:13,954][1653645] Updated weights for policy 0, policy_version 176032 (0.0154) [2024-06-15 13:47:15,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 360579072. Throughput: 0: 10945.4. Samples: 90207232. 
Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:47:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:47:17,678][1653645] Updated weights for policy 0, policy_version 176121 (0.0018) [2024-06-15 13:47:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44237.0, 300 sec: 43764.7). Total num frames: 360742912. Throughput: 0: 10797.6. Samples: 90240000. Policy #0 lag: (min: 23.0, avg: 137.1, max: 279.0) [2024-06-15 13:47:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:47:21,513][1653645] Updated weights for policy 0, policy_version 176176 (0.0012) [2024-06-15 13:47:24,030][1653645] Updated weights for policy 0, policy_version 176210 (0.0012) [2024-06-15 13:47:25,959][1648982] Fps is (10 sec: 45870.5, 60 sec: 43690.0, 300 sec: 44208.9). Total num frames: 361037824. Throughput: 0: 11161.3. Samples: 90313216. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:47:25,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:47:26,526][1653645] Updated weights for policy 0, policy_version 176313 (0.0101) [2024-06-15 13:47:29,364][1653645] Updated weights for policy 0, policy_version 176355 (0.0016) [2024-06-15 13:47:30,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 361234432. Throughput: 0: 10786.2. Samples: 90371584. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:47:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:47:32,688][1653645] Updated weights for policy 0, policy_version 176404 (0.0013) [2024-06-15 13:47:33,650][1653645] Updated weights for policy 0, policy_version 176448 (0.0011) [2024-06-15 13:47:35,958][1648982] Fps is (10 sec: 36047.2, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 361398272. Throughput: 0: 11081.9. Samples: 90411008. 
Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:47:35,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:47:37,139][1653645] Updated weights for policy 0, policy_version 176514 (0.0014) [2024-06-15 13:47:38,398][1653645] Updated weights for policy 0, policy_version 176576 (0.0012) [2024-06-15 13:47:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 361758720. Throughput: 0: 11116.1. Samples: 90477056. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:47:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:47:43,927][1653645] Updated weights for policy 0, policy_version 176644 (0.0013) [2024-06-15 13:47:45,125][1653645] Updated weights for policy 0, policy_version 176697 (0.0015) [2024-06-15 13:47:45,958][1648982] Fps is (10 sec: 49153.5, 60 sec: 45329.4, 300 sec: 43542.6). Total num frames: 361889792. Throughput: 0: 11066.5. Samples: 90546688. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:47:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 13:47:48,358][1653645] Updated weights for policy 0, policy_version 176739 (0.0013) [2024-06-15 13:47:49,505][1653645] Updated weights for policy 0, policy_version 176789 (0.0012) [2024-06-15 13:47:50,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 43691.0, 300 sec: 44431.2). Total num frames: 362151936. Throughput: 0: 11161.7. Samples: 90581504. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:47:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:47:52,214][1653645] Updated weights for policy 0, policy_version 176880 (0.0014) [2024-06-15 13:47:55,355][1651596] Signal inference workers to stop experience collection... (9100 times) [2024-06-15 13:47:55,404][1653645] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-06-15 13:47:55,597][1651596] Signal inference workers to resume experience collection... 
(9100 times) [2024-06-15 13:47:55,598][1653645] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-06-15 13:47:55,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 362381312. Throughput: 0: 11127.5. Samples: 90647552. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:47:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:47:56,028][1653645] Updated weights for policy 0, policy_version 176949 (0.0013) [2024-06-15 13:48:00,132][1653645] Updated weights for policy 0, policy_version 176997 (0.0011) [2024-06-15 13:48:00,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44258.9, 300 sec: 43986.9). Total num frames: 362545152. Throughput: 0: 11286.8. Samples: 90715136. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:48:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:48:01,751][1653645] Updated weights for policy 0, policy_version 177063 (0.0044) [2024-06-15 13:48:03,379][1653645] Updated weights for policy 0, policy_version 177120 (0.0013) [2024-06-15 13:48:05,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 362807296. Throughput: 0: 11229.8. Samples: 90745344. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:48:05,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:48:06,792][1653645] Updated weights for policy 0, policy_version 177184 (0.0014) [2024-06-15 13:48:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 362938368. Throughput: 0: 11275.6. Samples: 90820608. 
Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:48:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:48:11,535][1653645] Updated weights for policy 0, policy_version 177248 (0.0013) [2024-06-15 13:48:13,575][1653645] Updated weights for policy 0, policy_version 177333 (0.0014) [2024-06-15 13:48:15,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 44783.0, 300 sec: 43988.8). Total num frames: 363266048. Throughput: 0: 11309.5. Samples: 90880512. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:48:15,963][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:48:16,142][1653645] Updated weights for policy 0, policy_version 177381 (0.0012) [2024-06-15 13:48:19,467][1653645] Updated weights for policy 0, policy_version 177471 (0.0107) [2024-06-15 13:48:20,957][1648982] Fps is (10 sec: 52430.0, 60 sec: 45329.2, 300 sec: 43542.6). Total num frames: 363462656. Throughput: 0: 11252.8. Samples: 90917376. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:48:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:48:24,084][1653645] Updated weights for policy 0, policy_version 177544 (0.0012) [2024-06-15 13:48:25,392][1653645] Updated weights for policy 0, policy_version 177600 (0.0012) [2024-06-15 13:48:25,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44783.7, 300 sec: 44320.1). Total num frames: 363724800. Throughput: 0: 11229.9. Samples: 90982400. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 13:48:25,961][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:48:27,997][1653645] Updated weights for policy 0, policy_version 177661 (0.0013) [2024-06-15 13:48:30,675][1653645] Updated weights for policy 0, policy_version 177719 (0.0012) [2024-06-15 13:48:30,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45875.3, 300 sec: 43875.8). Total num frames: 363986944. Throughput: 0: 11161.6. Samples: 91048960. 
Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:48:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:48:35,402][1653645] Updated weights for policy 0, policy_version 177766 (0.0012) [2024-06-15 13:48:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.3, 300 sec: 43986.9). Total num frames: 364118016. Throughput: 0: 11264.0. Samples: 91088384. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:48:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:48:37,389][1653645] Updated weights for policy 0, policy_version 177850 (0.0115) [2024-06-15 13:48:39,365][1651596] Signal inference workers to stop experience collection... (9150 times) [2024-06-15 13:48:39,412][1653645] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-06-15 13:48:39,554][1651596] Signal inference workers to resume experience collection... (9150 times) [2024-06-15 13:48:39,555][1653645] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-06-15 13:48:39,804][1653645] Updated weights for policy 0, policy_version 177919 (0.0015) [2024-06-15 13:48:40,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 364380160. Throughput: 0: 11047.8. Samples: 91144704. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:48:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 13:48:42,402][1653645] Updated weights for policy 0, policy_version 177968 (0.0011) [2024-06-15 13:48:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 364511232. Throughput: 0: 11298.1. Samples: 91223552. 
Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:48:45,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 13:48:46,304][1653645] Updated weights for policy 0, policy_version 178003 (0.0010) [2024-06-15 13:48:47,744][1653645] Updated weights for policy 0, policy_version 178056 (0.0013) [2024-06-15 13:48:50,177][1653645] Updated weights for policy 0, policy_version 178128 (0.0013) [2024-06-15 13:48:50,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 45328.8, 300 sec: 44431.2). Total num frames: 364871680. Throughput: 0: 11286.7. Samples: 91253248. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:48:50,959][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 13:48:51,112][1653645] Updated weights for policy 0, policy_version 178172 (0.0013) [2024-06-15 13:48:53,484][1653645] Updated weights for policy 0, policy_version 178224 (0.0013) [2024-06-15 13:48:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 365035520. Throughput: 0: 11150.2. Samples: 91322368. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:48:55,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 13:48:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000178240_365035520.pth... [2024-06-15 13:48:56,060][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000173056_354418688.pth [2024-06-15 13:48:58,710][1653645] Updated weights for policy 0, policy_version 178272 (0.0013) [2024-06-15 13:49:00,768][1653645] Updated weights for policy 0, policy_version 178352 (0.0012) [2024-06-15 13:49:00,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45328.8, 300 sec: 44320.1). Total num frames: 365264896. Throughput: 0: 11195.6. Samples: 91384320. 
Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:49:00,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 13:49:02,177][1653645] Updated weights for policy 0, policy_version 178403 (0.0017) [2024-06-15 13:49:02,802][1653645] Updated weights for policy 0, policy_version 178431 (0.0014) [2024-06-15 13:49:05,846][1653645] Updated weights for policy 0, policy_version 178480 (0.0012) [2024-06-15 13:49:05,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45329.3, 300 sec: 44209.0). Total num frames: 365527040. Throughput: 0: 11059.2. Samples: 91415040. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:49:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:49:10,958][1648982] Fps is (10 sec: 32769.1, 60 sec: 44236.9, 300 sec: 43653.6). Total num frames: 365592576. Throughput: 0: 11173.0. Samples: 91485184. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:49:10,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:49:12,615][1653645] Updated weights for policy 0, policy_version 178592 (0.0014) [2024-06-15 13:49:13,540][1653645] Updated weights for policy 0, policy_version 178624 (0.0011) [2024-06-15 13:49:15,184][1653645] Updated weights for policy 0, policy_version 178683 (0.0141) [2024-06-15 13:49:15,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 365953024. Throughput: 0: 10877.2. Samples: 91538432. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:49:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:49:20,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.5, 300 sec: 43542.6). Total num frames: 366084096. Throughput: 0: 10717.9. Samples: 91570688. 
Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:49:20,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 13:49:23,216][1653645] Updated weights for policy 0, policy_version 178753 (0.0019) [2024-06-15 13:49:24,920][1653645] Updated weights for policy 0, policy_version 178832 (0.0024) [2024-06-15 13:49:25,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 366313472. Throughput: 0: 11059.2. Samples: 91642368. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:49:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:49:25,968][1651596] Signal inference workers to stop experience collection... (9200 times) [2024-06-15 13:49:25,990][1653645] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-06-15 13:49:26,154][1651596] Signal inference workers to resume experience collection... (9200 times) [2024-06-15 13:49:26,154][1653645] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-06-15 13:49:26,303][1653645] Updated weights for policy 0, policy_version 178882 (0.0012) [2024-06-15 13:49:27,603][1653645] Updated weights for policy 0, policy_version 178935 (0.0011) [2024-06-15 13:49:30,900][1653645] Updated weights for policy 0, policy_version 178992 (0.0133) [2024-06-15 13:49:30,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 366575616. Throughput: 0: 10615.5. Samples: 91701248. Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:49:30,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 13:49:35,960][1648982] Fps is (10 sec: 29483.5, 60 sec: 41504.3, 300 sec: 43542.2). Total num frames: 366608384. Throughput: 0: 10728.7. Samples: 91736064. 
Policy #0 lag: (min: 15.0, avg: 132.3, max: 271.0) [2024-06-15 13:49:35,961][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:49:37,279][1653645] Updated weights for policy 0, policy_version 179072 (0.0012) [2024-06-15 13:49:39,948][1653645] Updated weights for policy 0, policy_version 179169 (0.0013) [2024-06-15 13:49:40,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 367001600. Throughput: 0: 10308.2. Samples: 91786240. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:49:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 13:49:43,856][1653645] Updated weights for policy 0, policy_version 179220 (0.0012) [2024-06-15 13:49:45,958][1648982] Fps is (10 sec: 52442.4, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 367132672. Throughput: 0: 10365.2. Samples: 91850752. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:49:45,958][1648982] Avg episode reward: [(0, '36.420')] [2024-06-15 13:49:49,809][1653645] Updated weights for policy 0, policy_version 179328 (0.0028) [2024-06-15 13:49:50,958][1648982] Fps is (10 sec: 36046.0, 60 sec: 41506.3, 300 sec: 44098.0). Total num frames: 367362048. Throughput: 0: 10661.0. Samples: 91894784. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:49:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:49:51,202][1653645] Updated weights for policy 0, policy_version 179383 (0.0044) [2024-06-15 13:49:52,750][1653645] Updated weights for policy 0, policy_version 179449 (0.0013) [2024-06-15 13:49:55,859][1653645] Updated weights for policy 0, policy_version 179520 (0.0012) [2024-06-15 13:49:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 367656960. Throughput: 0: 10342.4. Samples: 91950592. 
Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:49:55,961][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:50:00,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 40960.2, 300 sec: 43764.7). Total num frames: 367722496. Throughput: 0: 10854.4. Samples: 92026880. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 13:50:01,334][1653645] Updated weights for policy 0, policy_version 179568 (0.0159) [2024-06-15 13:50:02,690][1653645] Updated weights for policy 0, policy_version 179632 (0.0030) [2024-06-15 13:50:04,060][1653645] Updated weights for policy 0, policy_version 179696 (0.0027) [2024-06-15 13:50:05,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 43987.0). Total num frames: 368050176. Throughput: 0: 10729.2. Samples: 92053504. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:05,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:50:06,925][1653645] Updated weights for policy 0, policy_version 179744 (0.0013) [2024-06-15 13:50:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 368181248. Throughput: 0: 10695.1. Samples: 92123648. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:50:11,669][1651596] Signal inference workers to stop experience collection... (9250 times) [2024-06-15 13:50:11,743][1653645] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-06-15 13:50:11,745][1653645] Updated weights for policy 0, policy_version 179795 (0.0012) [2024-06-15 13:50:11,990][1651596] Signal inference workers to resume experience collection... 
(9250 times) [2024-06-15 13:50:12,005][1653645] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-06-15 13:50:13,737][1653645] Updated weights for policy 0, policy_version 179888 (0.0128) [2024-06-15 13:50:15,692][1653645] Updated weights for policy 0, policy_version 179959 (0.0014) [2024-06-15 13:50:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 368574464. Throughput: 0: 10786.1. Samples: 92186624. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:15,960][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 13:50:19,184][1653645] Updated weights for policy 0, policy_version 180022 (0.0014) [2024-06-15 13:50:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 368705536. Throughput: 0: 10764.0. Samples: 92220416. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:50:24,005][1653645] Updated weights for policy 0, policy_version 180068 (0.0015) [2024-06-15 13:50:25,083][1653645] Updated weights for policy 0, policy_version 180114 (0.0014) [2024-06-15 13:50:25,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 368934912. Throughput: 0: 11366.4. Samples: 92297728. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:25,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:50:26,651][1653645] Updated weights for policy 0, policy_version 180181 (0.0127) [2024-06-15 13:50:29,454][1653645] Updated weights for policy 0, policy_version 180256 (0.0013) [2024-06-15 13:50:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 369229824. Throughput: 0: 11309.5. Samples: 92359680. 
Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:50:34,634][1653645] Updated weights for policy 0, policy_version 180306 (0.0014) [2024-06-15 13:50:35,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 46423.1, 300 sec: 44098.0). Total num frames: 369393664. Throughput: 0: 11377.7. Samples: 92406784. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:35,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:50:36,248][1653645] Updated weights for policy 0, policy_version 180384 (0.0012) [2024-06-15 13:50:38,050][1653645] Updated weights for policy 0, policy_version 180448 (0.0014) [2024-06-15 13:50:40,631][1653645] Updated weights for policy 0, policy_version 180513 (0.0016) [2024-06-15 13:50:40,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45329.3, 300 sec: 44653.3). Total num frames: 369721344. Throughput: 0: 11457.5. Samples: 92466176. Policy #0 lag: (min: 15.0, avg: 85.6, max: 271.0) [2024-06-15 13:50:40,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 13:50:45,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 369786880. Throughput: 0: 11559.8. Samples: 92547072. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:50:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:50:46,138][1653645] Updated weights for policy 0, policy_version 180582 (0.0013) [2024-06-15 13:50:47,804][1653645] Updated weights for policy 0, policy_version 180656 (0.0011) [2024-06-15 13:50:49,483][1653645] Updated weights for policy 0, policy_version 180694 (0.0036) [2024-06-15 13:50:50,400][1653645] Updated weights for policy 0, policy_version 180736 (0.0012) [2024-06-15 13:50:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 44653.3). Total num frames: 370147328. Throughput: 0: 11605.3. Samples: 92575744. 
Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:50:50,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:50:51,574][1651596] Signal inference workers to stop experience collection... (9300 times) [2024-06-15 13:50:51,642][1653645] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-06-15 13:50:51,799][1651596] Signal inference workers to resume experience collection... (9300 times) [2024-06-15 13:50:51,808][1653645] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-06-15 13:50:52,271][1653645] Updated weights for policy 0, policy_version 180784 (0.0016) [2024-06-15 13:50:55,958][1648982] Fps is (10 sec: 49150.0, 60 sec: 43690.5, 300 sec: 44208.9). Total num frames: 370278400. Throughput: 0: 11559.7. Samples: 92643840. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:50:55,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:50:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000180800_370278400.pth... [2024-06-15 13:50:56,028][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000175616_359661568.pth [2024-06-15 13:50:56,033][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000180800_370278400.pth [2024-06-15 13:50:57,093][1653645] Updated weights for policy 0, policy_version 180820 (0.0011) [2024-06-15 13:50:58,269][1653645] Updated weights for policy 0, policy_version 180869 (0.0012) [2024-06-15 13:50:59,808][1653645] Updated weights for policy 0, policy_version 180932 (0.0013) [2024-06-15 13:51:00,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 48605.9, 300 sec: 44764.9). Total num frames: 370638848. Throughput: 0: 11741.9. Samples: 92715008. 
Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:51:02,382][1653645] Updated weights for policy 0, policy_version 180993 (0.0133) [2024-06-15 13:51:03,902][1653645] Updated weights for policy 0, policy_version 181056 (0.0033) [2024-06-15 13:51:05,960][1648982] Fps is (10 sec: 52430.7, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 370802688. Throughput: 0: 11582.6. Samples: 92741632. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:05,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:51:09,844][1653645] Updated weights for policy 0, policy_version 181114 (0.0015) [2024-06-15 13:51:10,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 46421.3, 300 sec: 44098.0). Total num frames: 370966528. Throughput: 0: 11491.6. Samples: 92814848. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:51:11,153][1653645] Updated weights for policy 0, policy_version 181156 (0.0013) [2024-06-15 13:51:13,110][1653645] Updated weights for policy 0, policy_version 181216 (0.0105) [2024-06-15 13:51:15,256][1653645] Updated weights for policy 0, policy_version 181296 (0.0019) [2024-06-15 13:51:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 371326976. Throughput: 0: 11332.3. Samples: 92869632. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:15,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 13:51:20,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 371326976. Throughput: 0: 11138.9. Samples: 92908032. 
Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:51:21,833][1653645] Updated weights for policy 0, policy_version 181344 (0.0012) [2024-06-15 13:51:24,688][1653645] Updated weights for policy 0, policy_version 181441 (0.0141) [2024-06-15 13:51:25,907][1653645] Updated weights for policy 0, policy_version 181501 (0.0014) [2024-06-15 13:51:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 46421.4, 300 sec: 44542.2). Total num frames: 371720192. Throughput: 0: 11116.1. Samples: 92966400. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:51:29,017][1653645] Updated weights for policy 0, policy_version 181568 (0.0012) [2024-06-15 13:51:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44320.2). Total num frames: 371851264. Throughput: 0: 10604.1. Samples: 93024256. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:51:35,958][1648982] Fps is (10 sec: 26214.4, 60 sec: 43144.7, 300 sec: 43986.9). Total num frames: 371982336. Throughput: 0: 10922.7. Samples: 93067264. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:35,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:51:35,963][1653645] Updated weights for policy 0, policy_version 181648 (0.0012) [2024-06-15 13:51:38,241][1653645] Updated weights for policy 0, policy_version 181728 (0.0014) [2024-06-15 13:51:39,133][1653645] Updated weights for policy 0, policy_version 181759 (0.0011) [2024-06-15 13:51:40,152][1651596] Signal inference workers to stop experience collection... (9350 times) [2024-06-15 13:51:40,250][1653645] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-06-15 13:51:40,408][1651596] Signal inference workers to resume experience collection... 
(9350 times) [2024-06-15 13:51:40,409][1653645] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-06-15 13:51:40,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 44542.3). Total num frames: 372310016. Throughput: 0: 10490.4. Samples: 93115904. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 13:51:41,379][1653645] Updated weights for policy 0, policy_version 181818 (0.0013) [2024-06-15 13:51:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 372375552. Throughput: 0: 10615.4. Samples: 93192704. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:45,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:51:47,708][1653645] Updated weights for policy 0, policy_version 181872 (0.0013) [2024-06-15 13:51:49,533][1653645] Updated weights for policy 0, policy_version 181936 (0.0013) [2024-06-15 13:51:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 372703232. Throughput: 0: 10706.5. Samples: 93223424. Policy #0 lag: (min: 15.0, avg: 79.7, max: 266.0) [2024-06-15 13:51:50,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 13:51:51,014][1653645] Updated weights for policy 0, policy_version 181985 (0.0084) [2024-06-15 13:51:52,492][1653645] Updated weights for policy 0, policy_version 182034 (0.0039) [2024-06-15 13:51:55,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.7, 300 sec: 44102.4). Total num frames: 372899840. Throughput: 0: 10217.2. Samples: 93274624. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:51:55,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:52:00,345][1653645] Updated weights for policy 0, policy_version 182128 (0.0014) [2024-06-15 13:52:00,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 39867.7, 300 sec: 43542.6). Total num frames: 373030912. Throughput: 0: 10581.3. 
Samples: 93345792. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:00,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 13:52:02,181][1653645] Updated weights for policy 0, policy_version 182204 (0.0016) [2024-06-15 13:52:03,808][1653645] Updated weights for policy 0, policy_version 182256 (0.0013) [2024-06-15 13:52:05,704][1653645] Updated weights for policy 0, policy_version 182335 (0.0013) [2024-06-15 13:52:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 373424128. Throughput: 0: 10319.6. Samples: 93372416. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:52:10,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 40959.9, 300 sec: 43542.5). Total num frames: 373424128. Throughput: 0: 10535.8. Samples: 93440512. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:52:13,756][1653645] Updated weights for policy 0, policy_version 182427 (0.0191) [2024-06-15 13:52:15,083][1653645] Updated weights for policy 0, policy_version 182486 (0.0014) [2024-06-15 13:52:15,987][1648982] Fps is (10 sec: 39210.6, 60 sec: 41486.4, 300 sec: 44315.8). Total num frames: 373817344. Throughput: 0: 10608.7. Samples: 93501952. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:15,989][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:52:16,515][1653645] Updated weights for policy 0, policy_version 182529 (0.0017) [2024-06-15 13:52:20,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.4, 300 sec: 43764.8). Total num frames: 373948416. Throughput: 0: 10274.1. Samples: 93529600. 
Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:52:24,006][1653645] Updated weights for policy 0, policy_version 182593 (0.0015) [2024-06-15 13:52:25,958][1648982] Fps is (10 sec: 29575.1, 60 sec: 39867.6, 300 sec: 43653.6). Total num frames: 374112256. Throughput: 0: 10945.4. Samples: 93608448. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:52:26,180][1653645] Updated weights for policy 0, policy_version 182688 (0.0013) [2024-06-15 13:52:26,312][1651596] Signal inference workers to stop experience collection... (9400 times) [2024-06-15 13:52:26,361][1653645] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-06-15 13:52:26,572][1651596] Signal inference workers to resume experience collection... (9400 times) [2024-06-15 13:52:26,573][1653645] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-06-15 13:52:28,194][1653645] Updated weights for policy 0, policy_version 182783 (0.0012) [2024-06-15 13:52:30,007][1653645] Updated weights for policy 0, policy_version 182840 (0.0027) [2024-06-15 13:52:30,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.4, 300 sec: 44320.1). Total num frames: 374472704. Throughput: 0: 10410.6. Samples: 93661184. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:52:35,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 41506.1, 300 sec: 43098.3). Total num frames: 374472704. Throughput: 0: 10638.2. Samples: 93702144. 
Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:52:37,615][1653645] Updated weights for policy 0, policy_version 182896 (0.0029) [2024-06-15 13:52:39,683][1653645] Updated weights for policy 0, policy_version 182978 (0.0107) [2024-06-15 13:52:40,958][1648982] Fps is (10 sec: 36045.7, 60 sec: 42052.2, 300 sec: 43875.8). Total num frames: 374833152. Throughput: 0: 10888.6. Samples: 93764608. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:52:42,078][1653645] Updated weights for policy 0, policy_version 183076 (0.0104) [2024-06-15 13:52:45,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 374996992. Throughput: 0: 10638.2. Samples: 93824512. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:52:49,943][1653645] Updated weights for policy 0, policy_version 183159 (0.0014) [2024-06-15 13:52:50,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 40960.0, 300 sec: 43320.4). Total num frames: 375160832. Throughput: 0: 10911.4. Samples: 93863424. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:52:51,256][1653645] Updated weights for policy 0, policy_version 183206 (0.0025) [2024-06-15 13:52:52,809][1653645] Updated weights for policy 0, policy_version 183280 (0.0017) [2024-06-15 13:52:54,675][1653645] Updated weights for policy 0, policy_version 183351 (0.0013) [2024-06-15 13:52:55,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.8, 300 sec: 43986.8). Total num frames: 375521280. Throughput: 0: 10501.7. Samples: 93913088. 
Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:52:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:52:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000183360_375521280.pth... [2024-06-15 13:52:56,048][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000178240_365035520.pth [2024-06-15 13:53:00,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 41506.2, 300 sec: 43098.3). Total num frames: 375521280. Throughput: 0: 10872.7. Samples: 93990912. Policy #0 lag: (min: 79.0, avg: 204.4, max: 335.0) [2024-06-15 13:53:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 13:53:01,616][1653645] Updated weights for policy 0, policy_version 183392 (0.0013) [2024-06-15 13:53:03,235][1653645] Updated weights for policy 0, policy_version 183459 (0.0013) [2024-06-15 13:53:05,691][1651596] Signal inference workers to stop experience collection... (9450 times) [2024-06-15 13:53:05,727][1653645] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-06-15 13:53:05,947][1651596] Signal inference workers to resume experience collection... (9450 times) [2024-06-15 13:53:05,950][1653645] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-06-15 13:53:05,952][1653645] Updated weights for policy 0, policy_version 183568 (0.0142) [2024-06-15 13:53:05,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 44097.9). Total num frames: 375947264. Throughput: 0: 10911.3. Samples: 94020608. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:05,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:53:10,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 376045568. Throughput: 0: 10331.0. Samples: 94073344. 
Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:10,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:53:14,098][1653645] Updated weights for policy 0, policy_version 183648 (0.0015) [2024-06-15 13:53:15,958][1648982] Fps is (10 sec: 29491.7, 60 sec: 40433.2, 300 sec: 43320.4). Total num frames: 376242176. Throughput: 0: 10809.0. Samples: 94147584. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:53:15,981][1653645] Updated weights for policy 0, policy_version 183716 (0.0012) [2024-06-15 13:53:17,360][1653645] Updated weights for policy 0, policy_version 183776 (0.0015) [2024-06-15 13:53:18,545][1653645] Updated weights for policy 0, policy_version 183829 (0.0013) [2024-06-15 13:53:20,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 376569856. Throughput: 0: 10444.7. Samples: 94172160. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:53:25,720][1653645] Updated weights for policy 0, policy_version 183888 (0.0011) [2024-06-15 13:53:25,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 41506.3, 300 sec: 42765.0). Total num frames: 376602624. Throughput: 0: 10683.7. Samples: 94245376. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:53:27,244][1653645] Updated weights for policy 0, policy_version 183952 (0.0013) [2024-06-15 13:53:28,442][1653645] Updated weights for policy 0, policy_version 184002 (0.0013) [2024-06-15 13:53:29,850][1653645] Updated weights for policy 0, policy_version 184064 (0.0017) [2024-06-15 13:53:30,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 43764.7). Total num frames: 377028608. Throughput: 0: 10751.9. Samples: 94308352. 
Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:30,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:53:31,221][1653645] Updated weights for policy 0, policy_version 184119 (0.0012) [2024-06-15 13:53:35,974][1648982] Fps is (10 sec: 49070.0, 60 sec: 43678.5, 300 sec: 43095.8). Total num frames: 377094144. Throughput: 0: 10713.9. Samples: 94345728. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:35,975][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:53:38,658][1653645] Updated weights for policy 0, policy_version 184177 (0.0013) [2024-06-15 13:53:40,312][1653645] Updated weights for policy 0, policy_version 184240 (0.0012) [2024-06-15 13:53:40,958][1648982] Fps is (10 sec: 32769.0, 60 sec: 42052.4, 300 sec: 43542.6). Total num frames: 377356288. Throughput: 0: 11150.3. Samples: 94414848. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:53:42,706][1653645] Updated weights for policy 0, policy_version 184337 (0.0013) [2024-06-15 13:53:43,684][1653645] Updated weights for policy 0, policy_version 184382 (0.0015) [2024-06-15 13:53:45,958][1648982] Fps is (10 sec: 52517.0, 60 sec: 43690.6, 300 sec: 43209.4). Total num frames: 377618432. Throughput: 0: 10740.6. Samples: 94474240. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:53:50,744][1651596] Signal inference workers to stop experience collection... (9500 times) [2024-06-15 13:53:50,781][1653645] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-06-15 13:53:50,957][1648982] Fps is (10 sec: 32768.2, 60 sec: 42052.4, 300 sec: 42876.1). Total num frames: 377683968. Throughput: 0: 10945.5. Samples: 94513152. 
Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:53:50,983][1651596] Signal inference workers to resume experience collection... (9500 times) [2024-06-15 13:53:50,984][1653645] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-06-15 13:53:51,815][1653645] Updated weights for policy 0, policy_version 184464 (0.0095) [2024-06-15 13:53:53,814][1653645] Updated weights for policy 0, policy_version 184560 (0.0012) [2024-06-15 13:53:55,473][1653645] Updated weights for policy 0, policy_version 184609 (0.0012) [2024-06-15 13:53:55,958][1648982] Fps is (10 sec: 49150.5, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 378109952. Throughput: 0: 10922.7. Samples: 94564864. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:53:55,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 13:54:00,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 378142720. Throughput: 0: 10763.4. Samples: 94631936. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:54:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:54:03,496][1653645] Updated weights for policy 0, policy_version 184674 (0.0012) [2024-06-15 13:54:05,630][1653645] Updated weights for policy 0, policy_version 184753 (0.0150) [2024-06-15 13:54:05,958][1648982] Fps is (10 sec: 29491.8, 60 sec: 40960.0, 300 sec: 43431.5). Total num frames: 378404864. Throughput: 0: 11082.0. Samples: 94670848. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 13:54:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:54:07,732][1653645] Updated weights for policy 0, policy_version 184836 (0.0014) [2024-06-15 13:54:09,082][1653645] Updated weights for policy 0, policy_version 184896 (0.0012) [2024-06-15 13:54:10,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 378667008. 
Throughput: 0: 10433.4. Samples: 94714880. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:54:15,959][1648982] Fps is (10 sec: 29491.2, 60 sec: 40959.9, 300 sec: 42765.0). Total num frames: 378699776. Throughput: 0: 10854.4. Samples: 94796800. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:15,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:54:17,539][1653645] Updated weights for policy 0, policy_version 184992 (0.0013) [2024-06-15 13:54:19,374][1653645] Updated weights for policy 0, policy_version 185072 (0.0103) [2024-06-15 13:54:20,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43144.7, 300 sec: 43542.6). Total num frames: 379158528. Throughput: 0: 10517.0. Samples: 94818816. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:54:21,124][1653645] Updated weights for policy 0, policy_version 185143 (0.0012) [2024-06-15 13:54:25,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 379191296. Throughput: 0: 10308.2. Samples: 94878720. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:54:28,948][1653645] Updated weights for policy 0, policy_version 185187 (0.0014) [2024-06-15 13:54:30,282][1653645] Updated weights for policy 0, policy_version 185248 (0.0013) [2024-06-15 13:54:30,863][1651596] Signal inference workers to stop experience collection... (9550 times) [2024-06-15 13:54:30,930][1653645] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-06-15 13:54:30,958][1648982] Fps is (10 sec: 26213.5, 60 sec: 39867.6, 300 sec: 43431.8). Total num frames: 379420672. Throughput: 0: 10524.4. Samples: 94947840. 
Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:30,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 13:54:31,181][1651596] Signal inference workers to resume experience collection... (9550 times) [2024-06-15 13:54:31,182][1653645] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-06-15 13:54:32,056][1653645] Updated weights for policy 0, policy_version 185316 (0.0012) [2024-06-15 13:54:33,801][1653645] Updated weights for policy 0, policy_version 185392 (0.0012) [2024-06-15 13:54:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43702.9, 300 sec: 43098.3). Total num frames: 379715584. Throughput: 0: 10240.0. Samples: 94973952. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:54:40,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 40959.8, 300 sec: 42987.2). Total num frames: 379813888. Throughput: 0: 10820.3. Samples: 95051776. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:40,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:54:40,999][1653645] Updated weights for policy 0, policy_version 185461 (0.0013) [2024-06-15 13:54:43,347][1653645] Updated weights for policy 0, policy_version 185555 (0.0022) [2024-06-15 13:54:44,691][1653645] Updated weights for policy 0, policy_version 185617 (0.0027) [2024-06-15 13:54:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 380239872. Throughput: 0: 10456.2. Samples: 95102464. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:54:50,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 380239872. Throughput: 0: 10456.2. Samples: 95141376. 
Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:54:52,194][1653645] Updated weights for policy 0, policy_version 185696 (0.0092) [2024-06-15 13:54:53,989][1653645] Updated weights for policy 0, policy_version 185765 (0.0013) [2024-06-15 13:54:55,481][1653645] Updated weights for policy 0, policy_version 185830 (0.0012) [2024-06-15 13:54:55,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 41506.3, 300 sec: 43653.6). Total num frames: 380600320. Throughput: 0: 10945.4. Samples: 95207424. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:54:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:54:56,535][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000185872_380665856.pth... [2024-06-15 13:54:56,694][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000180800_370278400.pth [2024-06-15 13:54:57,677][1653645] Updated weights for policy 0, policy_version 185916 (0.0012) [2024-06-15 13:55:00,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 380764160. Throughput: 0: 10308.3. Samples: 95260672. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:55:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:55:05,954][1653645] Updated weights for policy 0, policy_version 186000 (0.0015) [2024-06-15 13:55:05,958][1648982] Fps is (10 sec: 32767.2, 60 sec: 42052.1, 300 sec: 43209.3). Total num frames: 380928000. Throughput: 0: 10672.3. Samples: 95299072. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:55:05,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:55:08,594][1653645] Updated weights for policy 0, policy_version 186096 (0.0229) [2024-06-15 13:55:09,392][1651596] Signal inference workers to stop experience collection... 
(9600 times) [2024-06-15 13:55:09,435][1653645] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-06-15 13:55:09,682][1651596] Signal inference workers to resume experience collection... (9600 times) [2024-06-15 13:55:09,695][1653645] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-06-15 13:55:10,289][1653645] Updated weights for policy 0, policy_version 186167 (0.0012) [2024-06-15 13:55:10,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 381288448. Throughput: 0: 10467.6. Samples: 95349760. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:55:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:55:15,958][1648982] Fps is (10 sec: 36045.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 381288448. Throughput: 0: 10547.3. Samples: 95422464. Policy #0 lag: (min: 162.0, avg: 212.2, max: 407.0) [2024-06-15 13:55:15,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:55:17,641][1653645] Updated weights for policy 0, policy_version 186208 (0.0013) [2024-06-15 13:55:19,139][1653645] Updated weights for policy 0, policy_version 186272 (0.0012) [2024-06-15 13:55:20,457][1653645] Updated weights for policy 0, policy_version 186323 (0.0031) [2024-06-15 13:55:20,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 40960.1, 300 sec: 42987.2). Total num frames: 381616128. Throughput: 0: 10740.6. Samples: 95457280. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:55:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:55:22,449][1653645] Updated weights for policy 0, policy_version 186401 (0.0028) [2024-06-15 13:55:25,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 42653.9). Total num frames: 381812736. Throughput: 0: 10274.1. Samples: 95514112. 
Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:55:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:55:29,703][1653645] Updated weights for policy 0, policy_version 186464 (0.0016) [2024-06-15 13:55:30,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 42052.5, 300 sec: 42542.9). Total num frames: 381943808. Throughput: 0: 10786.2. Samples: 95587840. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:55:30,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 13:55:31,406][1653645] Updated weights for policy 0, policy_version 186528 (0.0135) [2024-06-15 13:55:32,875][1653645] Updated weights for policy 0, policy_version 186579 (0.0024) [2024-06-15 13:55:34,984][1653645] Updated weights for policy 0, policy_version 186679 (0.0013) [2024-06-15 13:55:35,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 382337024. Throughput: 0: 10490.3. Samples: 95613440. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:55:35,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:55:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 382337024. Throughput: 0: 10467.5. Samples: 95678464. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:55:40,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:55:42,666][1653645] Updated weights for policy 0, policy_version 186720 (0.0020) [2024-06-15 13:55:44,129][1653645] Updated weights for policy 0, policy_version 186784 (0.0022) [2024-06-15 13:55:45,774][1653645] Updated weights for policy 0, policy_version 186848 (0.0013) [2024-06-15 13:55:45,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 40413.9, 300 sec: 42431.8). Total num frames: 382664704. Throughput: 0: 10706.5. Samples: 95742464. 
Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:55:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:55:47,357][1653645] Updated weights for policy 0, policy_version 186912 (0.0014) [2024-06-15 13:55:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 382861312. Throughput: 0: 10444.9. Samples: 95769088. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:55:50,959][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 13:55:54,809][1653645] Updated weights for policy 0, policy_version 186979 (0.0044) [2024-06-15 13:55:55,149][1651596] Signal inference workers to stop experience collection... (9650 times) [2024-06-15 13:55:55,208][1653645] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-06-15 13:55:55,416][1651596] Signal inference workers to resume experience collection... (9650 times) [2024-06-15 13:55:55,418][1653645] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-06-15 13:55:55,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 40413.9, 300 sec: 41987.5). Total num frames: 383025152. Throughput: 0: 11025.1. Samples: 95845888. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:55:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:55:56,920][1653645] Updated weights for policy 0, policy_version 187061 (0.0014) [2024-06-15 13:55:58,278][1653645] Updated weights for policy 0, policy_version 187120 (0.0012) [2024-06-15 13:55:59,953][1653645] Updated weights for policy 0, policy_version 187194 (0.0015) [2024-06-15 13:56:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 383385600. Throughput: 0: 10456.2. Samples: 95892992. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:56:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:56:05,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 40960.1, 300 sec: 42098.5). 
Total num frames: 383385600. Throughput: 0: 10513.0. Samples: 95930368. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:56:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:56:07,901][1653645] Updated weights for policy 0, policy_version 187252 (0.0013) [2024-06-15 13:56:09,846][1653645] Updated weights for policy 0, policy_version 187333 (0.0012) [2024-06-15 13:56:10,960][1648982] Fps is (10 sec: 36038.0, 60 sec: 40958.7, 300 sec: 42098.3). Total num frames: 383746048. Throughput: 0: 10615.1. Samples: 95991808. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:56:10,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 13:56:11,099][1653645] Updated weights for policy 0, policy_version 187392 (0.0096) [2024-06-15 13:56:15,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 383909888. Throughput: 0: 10387.9. Samples: 96055296. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:56:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 13:56:18,847][1653645] Updated weights for policy 0, policy_version 187459 (0.0040) [2024-06-15 13:56:20,185][1653645] Updated weights for policy 0, policy_version 187520 (0.0016) [2024-06-15 13:56:20,958][1648982] Fps is (10 sec: 36051.4, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 384106496. Throughput: 0: 10672.3. Samples: 96093696. Policy #0 lag: (min: 15.0, avg: 66.3, max: 271.0) [2024-06-15 13:56:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:56:21,990][1653645] Updated weights for policy 0, policy_version 187602 (0.0013) [2024-06-15 13:56:24,341][1653645] Updated weights for policy 0, policy_version 187705 (0.0016) [2024-06-15 13:56:25,958][1648982] Fps is (10 sec: 52426.2, 60 sec: 43690.4, 300 sec: 42653.9). Total num frames: 384434176. Throughput: 0: 10456.1. Samples: 96148992. 
Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:56:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:56:30,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 384434176. Throughput: 0: 10786.1. Samples: 96227840. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:56:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:56:31,431][1653645] Updated weights for policy 0, policy_version 187744 (0.0012) [2024-06-15 13:56:32,998][1653645] Updated weights for policy 0, policy_version 187808 (0.0155) [2024-06-15 13:56:33,867][1651596] Signal inference workers to stop experience collection... (9700 times) [2024-06-15 13:56:33,950][1653645] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-06-15 13:56:34,196][1651596] Signal inference workers to resume experience collection... (9700 times) [2024-06-15 13:56:34,197][1653645] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-06-15 13:56:34,795][1653645] Updated weights for policy 0, policy_version 187876 (0.0014) [2024-06-15 13:56:35,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 384892928. Throughput: 0: 10774.7. Samples: 96253952. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:56:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:56:36,104][1653645] Updated weights for policy 0, policy_version 187938 (0.0015) [2024-06-15 13:56:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 384958464. Throughput: 0: 10410.7. Samples: 96314368. 
Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:56:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:56:43,570][1653645] Updated weights for policy 0, policy_version 187984 (0.0010) [2024-06-15 13:56:44,951][1653645] Updated weights for policy 0, policy_version 188036 (0.0012) [2024-06-15 13:56:45,958][1648982] Fps is (10 sec: 26215.0, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 385155072. Throughput: 0: 10899.9. Samples: 96383488. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:56:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:56:46,983][1653645] Updated weights for policy 0, policy_version 188115 (0.0142) [2024-06-15 13:56:48,786][1653645] Updated weights for policy 0, policy_version 188177 (0.0019) [2024-06-15 13:56:50,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 42654.0). Total num frames: 385482752. Throughput: 0: 10547.2. Samples: 96404992. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:56:50,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:56:55,958][1648982] Fps is (10 sec: 32766.9, 60 sec: 40959.8, 300 sec: 42209.6). Total num frames: 385482752. Throughput: 0: 10718.2. Samples: 96474112. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:56:55,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:56:55,965][1653645] Updated weights for policy 0, policy_version 188228 (0.0013) [2024-06-15 13:56:56,579][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000188256_385548288.pth... 
[2024-06-15 13:56:56,719][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000183360_375521280.pth [2024-06-15 13:56:57,768][1653645] Updated weights for policy 0, policy_version 188291 (0.0010) [2024-06-15 13:56:59,401][1653645] Updated weights for policy 0, policy_version 188357 (0.0012) [2024-06-15 13:57:00,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 40960.0, 300 sec: 42098.6). Total num frames: 385843200. Throughput: 0: 10558.6. Samples: 96530432. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:57:00,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 13:57:01,495][1653645] Updated weights for policy 0, policy_version 188439 (0.0017) [2024-06-15 13:57:05,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.8, 300 sec: 42654.0). Total num frames: 386007040. Throughput: 0: 10342.4. Samples: 96559104. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:57:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:57:08,019][1653645] Updated weights for policy 0, policy_version 188485 (0.0013) [2024-06-15 13:57:09,284][1653645] Updated weights for policy 0, policy_version 188544 (0.0012) [2024-06-15 13:57:10,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 40961.3, 300 sec: 41991.5). Total num frames: 386203648. Throughput: 0: 10706.6. Samples: 96630784. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:57:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 13:57:11,626][1653645] Updated weights for policy 0, policy_version 188611 (0.0011) [2024-06-15 13:57:13,600][1653645] Updated weights for policy 0, policy_version 188690 (0.0095) [2024-06-15 13:57:14,002][1651596] Signal inference workers to stop experience collection... (9750 times) [2024-06-15 13:57:14,044][1653645] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-06-15 13:57:14,240][1651596] Signal inference workers to resume experience collection... 
(9750 times) [2024-06-15 13:57:14,240][1653645] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-06-15 13:57:15,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.6, 300 sec: 42654.0). Total num frames: 386531328. Throughput: 0: 10183.1. Samples: 96686080. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:57:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:57:20,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 40413.9, 300 sec: 42098.6). Total num frames: 386531328. Throughput: 0: 10399.3. Samples: 96721920. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:57:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 13:57:21,409][1653645] Updated weights for policy 0, policy_version 188754 (0.0013) [2024-06-15 13:57:23,725][1653645] Updated weights for policy 0, policy_version 188851 (0.0118) [2024-06-15 13:57:25,598][1653645] Updated weights for policy 0, policy_version 188928 (0.0013) [2024-06-15 13:57:25,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 42052.6, 300 sec: 42320.7). Total num frames: 386957312. Throughput: 0: 10353.8. Samples: 96780288. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:57:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:57:30,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 387055616. Throughput: 0: 10240.0. Samples: 96844288. Policy #0 lag: (min: 127.0, avg: 229.5, max: 413.0) [2024-06-15 13:57:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:57:34,245][1653645] Updated weights for policy 0, policy_version 189027 (0.0014) [2024-06-15 13:57:35,958][1648982] Fps is (10 sec: 29491.6, 60 sec: 39321.8, 300 sec: 42098.6). Total num frames: 387252224. Throughput: 0: 10752.0. Samples: 96888832. 
Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:57:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:57:36,214][1653645] Updated weights for policy 0, policy_version 189110 (0.0019) [2024-06-15 13:57:37,983][1653645] Updated weights for policy 0, policy_version 189188 (0.0125) [2024-06-15 13:57:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 387579904. Throughput: 0: 10274.2. Samples: 96936448. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:57:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:57:45,528][1653645] Updated weights for policy 0, policy_version 189267 (0.0015) [2024-06-15 13:57:45,959][1648982] Fps is (10 sec: 39317.4, 60 sec: 41505.4, 300 sec: 42320.6). Total num frames: 387645440. Throughput: 0: 10808.6. Samples: 97016832. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:57:45,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:57:46,641][1653645] Updated weights for policy 0, policy_version 189310 (0.0011) [2024-06-15 13:57:48,739][1653645] Updated weights for policy 0, policy_version 189392 (0.0146) [2024-06-15 13:57:50,156][1653645] Updated weights for policy 0, policy_version 189456 (0.0014) [2024-06-15 13:57:50,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43144.7, 300 sec: 42542.9). Total num frames: 388071424. Throughput: 0: 10740.6. Samples: 97042432. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:57:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:57:55,958][1648982] Fps is (10 sec: 45879.6, 60 sec: 43690.8, 300 sec: 42653.9). Total num frames: 388104192. Throughput: 0: 10592.7. Samples: 97107456. 
Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:57:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 13:57:57,848][1653645] Updated weights for policy 0, policy_version 189536 (0.0015) [2024-06-15 13:57:59,133][1653645] Updated weights for policy 0, policy_version 189571 (0.0026) [2024-06-15 13:57:59,371][1651596] Signal inference workers to stop experience collection... (9800 times) [2024-06-15 13:57:59,416][1653645] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-06-15 13:57:59,566][1651596] Signal inference workers to resume experience collection... (9800 times) [2024-06-15 13:57:59,567][1653645] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-06-15 13:58:00,822][1653645] Updated weights for policy 0, policy_version 189651 (0.0015) [2024-06-15 13:58:00,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 388399104. Throughput: 0: 10797.5. Samples: 97171968. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:58:00,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:58:02,172][1653645] Updated weights for policy 0, policy_version 189715 (0.0134) [2024-06-15 13:58:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 388628480. Throughput: 0: 10615.5. Samples: 97199616. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:58:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 13:58:09,828][1653645] Updated weights for policy 0, policy_version 189795 (0.0014) [2024-06-15 13:58:10,958][1648982] Fps is (10 sec: 36042.7, 60 sec: 42597.9, 300 sec: 42431.7). Total num frames: 388759552. Throughput: 0: 11127.3. Samples: 97281024. 
Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:58:10,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:58:12,170][1653645] Updated weights for policy 0, policy_version 189888 (0.0171) [2024-06-15 13:58:14,709][1653645] Updated weights for policy 0, policy_version 189989 (0.0119) [2024-06-15 13:58:15,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.8, 300 sec: 42654.0). Total num frames: 389152768. Throughput: 0: 10752.0. Samples: 97328128. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:58:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 13:58:20,958][1648982] Fps is (10 sec: 39324.4, 60 sec: 43690.7, 300 sec: 42542.9). Total num frames: 389152768. Throughput: 0: 10672.4. Samples: 97369088. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:58:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:58:21,388][1653645] Updated weights for policy 0, policy_version 190017 (0.0011) [2024-06-15 13:58:22,601][1653645] Updated weights for policy 0, policy_version 190080 (0.0015) [2024-06-15 13:58:24,675][1653645] Updated weights for policy 0, policy_version 190160 (0.0104) [2024-06-15 13:58:25,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 389513216. Throughput: 0: 11104.7. Samples: 97436160. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:58:25,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 13:58:26,737][1653645] Updated weights for policy 0, policy_version 190225 (0.0037) [2024-06-15 13:58:30,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 42656.3). Total num frames: 389677056. Throughput: 0: 10672.6. Samples: 97497088. 
Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:58:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 13:58:34,416][1653645] Updated weights for policy 0, policy_version 190304 (0.0027) [2024-06-15 13:58:35,653][1653645] Updated weights for policy 0, policy_version 190342 (0.0013) [2024-06-15 13:58:35,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43144.4, 300 sec: 42320.7). Total num frames: 389840896. Throughput: 0: 10934.0. Samples: 97534464. Policy #0 lag: (min: 15.0, avg: 62.6, max: 271.0) [2024-06-15 13:58:35,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:58:37,371][1653645] Updated weights for policy 0, policy_version 190401 (0.0013) [2024-06-15 13:58:38,997][1651596] Signal inference workers to stop experience collection... (9850 times) [2024-06-15 13:58:39,006][1653645] Updated weights for policy 0, policy_version 190465 (0.0013) [2024-06-15 13:58:39,039][1653645] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-06-15 13:58:39,202][1651596] Signal inference workers to resume experience collection... (9850 times) [2024-06-15 13:58:39,203][1653645] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-06-15 13:58:40,148][1653645] Updated weights for policy 0, policy_version 190528 (0.0012) [2024-06-15 13:58:40,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.8, 300 sec: 42653.9). Total num frames: 390201344. Throughput: 0: 10695.1. Samples: 97588736. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:58:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:58:45,971][1648982] Fps is (10 sec: 35995.8, 60 sec: 42589.4, 300 sec: 42429.8). Total num frames: 390201344. Throughput: 0: 10862.5. Samples: 97660928. 
Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:58:45,972][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 13:58:48,158][1653645] Updated weights for policy 0, policy_version 190593 (0.0016) [2024-06-15 13:58:50,957][1648982] Fps is (10 sec: 36045.2, 60 sec: 41506.2, 300 sec: 42209.7). Total num frames: 390561792. Throughput: 0: 10900.0. Samples: 97690112. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:58:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:58:50,963][1653645] Updated weights for policy 0, policy_version 190706 (0.0172) [2024-06-15 13:58:55,958][1648982] Fps is (10 sec: 52500.9, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 390725632. Throughput: 0: 10183.3. Samples: 97739264. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:58:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 13:58:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000190784_390725632.pth... [2024-06-15 13:58:56,028][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000185872_380665856.pth [2024-06-15 13:58:59,558][1653645] Updated weights for policy 0, policy_version 190791 (0.0014) [2024-06-15 13:59:00,958][1648982] Fps is (10 sec: 29490.5, 60 sec: 40960.0, 300 sec: 42209.6). Total num frames: 390856704. Throughput: 0: 10774.7. Samples: 97812992. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 13:59:01,048][1653645] Updated weights for policy 0, policy_version 190849 (0.0029) [2024-06-15 13:59:03,106][1653645] Updated weights for policy 0, policy_version 190944 (0.0011) [2024-06-15 13:59:04,790][1653645] Updated weights for policy 0, policy_version 191012 (0.0011) [2024-06-15 13:59:05,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.5, 300 sec: 42653.9). Total num frames: 391249920. Throughput: 0: 10467.5. 
Samples: 97840128. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:05,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:59:10,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 41506.6, 300 sec: 42542.9). Total num frames: 391249920. Throughput: 0: 10604.1. Samples: 97913344. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:10,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 13:59:11,442][1653645] Updated weights for policy 0, policy_version 191044 (0.0017) [2024-06-15 13:59:13,526][1653645] Updated weights for policy 0, policy_version 191122 (0.0013) [2024-06-15 13:59:15,360][1653645] Updated weights for policy 0, policy_version 191186 (0.0011) [2024-06-15 13:59:15,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 40960.0, 300 sec: 42209.6). Total num frames: 391610368. Throughput: 0: 10433.5. Samples: 97966592. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:15,963][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 13:59:16,583][1653645] Updated weights for policy 0, policy_version 191248 (0.0012) [2024-06-15 13:59:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 391774208. Throughput: 0: 10331.1. Samples: 97999360. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:59:24,084][1653645] Updated weights for policy 0, policy_version 191328 (0.0016) [2024-06-15 13:59:24,239][1651596] Signal inference workers to stop experience collection... (9900 times) [2024-06-15 13:59:24,289][1653645] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-06-15 13:59:24,591][1651596] Signal inference workers to resume experience collection... 
(9900 times) [2024-06-15 13:59:24,592][1653645] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-06-15 13:59:25,267][1653645] Updated weights for policy 0, policy_version 191376 (0.0010) [2024-06-15 13:59:25,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 40960.0, 300 sec: 42542.9). Total num frames: 391970816. Throughput: 0: 10729.2. Samples: 98071552. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:25,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 13:59:27,445][1653645] Updated weights for policy 0, policy_version 191458 (0.0018) [2024-06-15 13:59:28,569][1653645] Updated weights for policy 0, policy_version 191508 (0.0014) [2024-06-15 13:59:29,384][1653645] Updated weights for policy 0, policy_version 191552 (0.0012) [2024-06-15 13:59:30,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 42653.9). Total num frames: 392298496. Throughput: 0: 10402.5. Samples: 98128896. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 13:59:35,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 41506.2, 300 sec: 42431.8). Total num frames: 392331264. Throughput: 0: 10729.2. Samples: 98172928. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 13:59:37,058][1653645] Updated weights for policy 0, policy_version 191623 (0.0012) [2024-06-15 13:59:38,448][1653645] Updated weights for policy 0, policy_version 191682 (0.0013) [2024-06-15 13:59:39,502][1653645] Updated weights for policy 0, policy_version 191731 (0.0012) [2024-06-15 13:59:40,853][1653645] Updated weights for policy 0, policy_version 191795 (0.0015) [2024-06-15 13:59:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 392790016. Throughput: 0: 10865.8. Samples: 98228224. 
Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 13:59:45,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43700.6, 300 sec: 42653.9). Total num frames: 392822784. Throughput: 0: 10922.6. Samples: 98304512. Policy #0 lag: (min: 187.0, avg: 264.2, max: 411.0) [2024-06-15 13:59:45,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 13:59:47,433][1653645] Updated weights for policy 0, policy_version 191824 (0.0016) [2024-06-15 13:59:48,676][1653645] Updated weights for policy 0, policy_version 191874 (0.0013) [2024-06-15 13:59:50,323][1653645] Updated weights for policy 0, policy_version 191939 (0.0012) [2024-06-15 13:59:50,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 43144.2, 300 sec: 42542.8). Total num frames: 393150464. Throughput: 0: 11036.4. Samples: 98336768. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 13:59:50,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 13:59:52,791][1653645] Updated weights for policy 0, policy_version 192038 (0.0012) [2024-06-15 13:59:55,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 393347072. Throughput: 0: 10604.1. Samples: 98390528. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 13:59:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:00:00,291][1653645] Updated weights for policy 0, policy_version 192112 (0.0018) [2024-06-15 14:00:00,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.6, 300 sec: 42542.9). Total num frames: 393478144. Throughput: 0: 10956.7. Samples: 98459648. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:00:01,340][1653645] Updated weights for policy 0, policy_version 192146 (0.0013) [2024-06-15 14:00:02,160][1651596] Signal inference workers to stop experience collection... 
(9950 times) [2024-06-15 14:00:02,199][1653645] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-06-15 14:00:02,454][1651596] Signal inference workers to resume experience collection... (9950 times) [2024-06-15 14:00:02,455][1653645] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-06-15 14:00:03,260][1653645] Updated weights for policy 0, policy_version 192224 (0.0013) [2024-06-15 14:00:04,557][1653645] Updated weights for policy 0, policy_version 192288 (0.0013) [2024-06-15 14:00:05,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 393871360. Throughput: 0: 10854.4. Samples: 98487808. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:00:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.5, 300 sec: 42653.9). Total num frames: 393871360. Throughput: 0: 10695.1. Samples: 98552832. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:00:12,161][1653645] Updated weights for policy 0, policy_version 192342 (0.0021) [2024-06-15 14:00:14,154][1653645] Updated weights for policy 0, policy_version 192432 (0.0042) [2024-06-15 14:00:15,803][1653645] Updated weights for policy 0, policy_version 192501 (0.0013) [2024-06-15 14:00:15,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44236.8, 300 sec: 42876.1). Total num frames: 394264576. Throughput: 0: 10808.9. Samples: 98615296. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:00:20,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.6, 300 sec: 42654.0). Total num frames: 394395648. Throughput: 0: 10456.2. Samples: 98643456. 
Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:00:23,882][1653645] Updated weights for policy 0, policy_version 192580 (0.0129) [2024-06-15 14:00:25,289][1653645] Updated weights for policy 0, policy_version 192643 (0.0011) [2024-06-15 14:00:25,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 394592256. Throughput: 0: 11036.4. Samples: 98724864. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:25,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:00:27,101][1653645] Updated weights for policy 0, policy_version 192721 (0.0011) [2024-06-15 14:00:29,100][1653645] Updated weights for policy 0, policy_version 192802 (0.0072) [2024-06-15 14:00:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 42653.9). Total num frames: 394919936. Throughput: 0: 10422.1. Samples: 98773504. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:00:35,832][1653645] Updated weights for policy 0, policy_version 192854 (0.0013) [2024-06-15 14:00:35,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 394952704. Throughput: 0: 10570.0. Samples: 98812416. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:00:37,407][1653645] Updated weights for policy 0, policy_version 192915 (0.0014) [2024-06-15 14:00:39,028][1653645] Updated weights for policy 0, policy_version 192992 (0.0106) [2024-06-15 14:00:40,617][1651596] Signal inference workers to stop experience collection... (10000 times) [2024-06-15 14:00:40,752][1653645] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-06-15 14:00:40,937][1651596] Signal inference workers to resume experience collection... 
(10000 times) [2024-06-15 14:00:40,937][1653645] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-06-15 14:00:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 395345920. Throughput: 0: 10945.4. Samples: 98883072. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:00:41,471][1653645] Updated weights for policy 0, policy_version 193083 (0.0014) [2024-06-15 14:00:45,958][1648982] Fps is (10 sec: 49150.8, 60 sec: 43690.6, 300 sec: 42653.9). Total num frames: 395444224. Throughput: 0: 10865.8. Samples: 98948608. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:00:48,442][1653645] Updated weights for policy 0, policy_version 193148 (0.0029) [2024-06-15 14:00:50,750][1653645] Updated weights for policy 0, policy_version 193233 (0.0210) [2024-06-15 14:00:50,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 395739136. Throughput: 0: 11025.1. Samples: 98983936. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:00:51,772][1653645] Updated weights for policy 0, policy_version 193280 (0.0015) [2024-06-15 14:00:53,428][1653645] Updated weights for policy 0, policy_version 193329 (0.0013) [2024-06-15 14:00:55,960][1648982] Fps is (10 sec: 52419.2, 60 sec: 43689.2, 300 sec: 42653.6). Total num frames: 395968512. Throughput: 0: 10888.1. Samples: 99042816. Policy #0 lag: (min: 5.0, avg: 51.8, max: 229.0) [2024-06-15 14:00:55,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:00:55,968][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000193344_395968512.pth... 
[2024-06-15 14:00:56,031][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000188256_385548288.pth [2024-06-15 14:00:59,145][1653645] Updated weights for policy 0, policy_version 193366 (0.0026) [2024-06-15 14:01:00,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44783.1, 300 sec: 43320.4). Total num frames: 396165120. Throughput: 0: 11241.2. Samples: 99121152. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:01:01,864][1653645] Updated weights for policy 0, policy_version 193472 (0.0016) [2024-06-15 14:01:04,054][1653645] Updated weights for policy 0, policy_version 193538 (0.0013) [2024-06-15 14:01:05,228][1653645] Updated weights for policy 0, policy_version 193600 (0.0013) [2024-06-15 14:01:05,958][1648982] Fps is (10 sec: 52438.7, 60 sec: 43690.6, 300 sec: 43209.6). Total num frames: 396492800. Throughput: 0: 11138.8. Samples: 99144704. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:01:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44783.0, 300 sec: 42876.1). Total num frames: 396558336. Throughput: 0: 11116.1. Samples: 99225088. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:10,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:01:11,445][1653645] Updated weights for policy 0, policy_version 193664 (0.0014) [2024-06-15 14:01:13,390][1653645] Updated weights for policy 0, policy_version 193734 (0.0012) [2024-06-15 14:01:14,684][1653645] Updated weights for policy 0, policy_version 193791 (0.0012) [2024-06-15 14:01:15,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 396951552. Throughput: 0: 11320.9. Samples: 99282944. 
Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:15,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:01:16,417][1653645] Updated weights for policy 0, policy_version 193847 (0.0012) [2024-06-15 14:01:20,960][1648982] Fps is (10 sec: 45868.0, 60 sec: 43689.5, 300 sec: 42653.8). Total num frames: 397017088. Throughput: 0: 11286.3. Samples: 99320320. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:20,965][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:01:22,191][1653645] Updated weights for policy 0, policy_version 193888 (0.0015) [2024-06-15 14:01:23,223][1653645] Updated weights for policy 0, policy_version 193924 (0.0013) [2024-06-15 14:01:24,459][1653645] Updated weights for policy 0, policy_version 193982 (0.0012) [2024-06-15 14:01:25,854][1651596] Signal inference workers to stop experience collection... (10050 times) [2024-06-15 14:01:25,898][1653645] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-06-15 14:01:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.3, 300 sec: 43764.7). Total num frames: 397344768. Throughput: 0: 11241.3. Samples: 99388928. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:25,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 14:01:26,074][1651596] Signal inference workers to resume experience collection... (10050 times) [2024-06-15 14:01:26,075][1653645] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-06-15 14:01:26,309][1653645] Updated weights for policy 0, policy_version 194045 (0.0012) [2024-06-15 14:01:28,096][1653645] Updated weights for policy 0, policy_version 194086 (0.0015) [2024-06-15 14:01:30,958][1648982] Fps is (10 sec: 52437.7, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 397541376. Throughput: 0: 11252.7. Samples: 99454976. 
Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:01:32,901][1653645] Updated weights for policy 0, policy_version 194128 (0.0038) [2024-06-15 14:01:34,298][1653645] Updated weights for policy 0, policy_version 194177 (0.0014) [2024-06-15 14:01:35,487][1653645] Updated weights for policy 0, policy_version 194239 (0.0013) [2024-06-15 14:01:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 47513.6, 300 sec: 43542.6). Total num frames: 397803520. Throughput: 0: 11309.5. Samples: 99492864. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:01:38,011][1653645] Updated weights for policy 0, policy_version 194296 (0.0013) [2024-06-15 14:01:40,152][1653645] Updated weights for policy 0, policy_version 194352 (0.0017) [2024-06-15 14:01:40,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 45328.9, 300 sec: 43764.7). Total num frames: 398065664. Throughput: 0: 11469.3. Samples: 99558912. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:40,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:01:45,516][1653645] Updated weights for policy 0, policy_version 194404 (0.0012) [2024-06-15 14:01:45,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 45329.3, 300 sec: 42987.2). Total num frames: 398163968. Throughput: 0: 11184.4. Samples: 99624448. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:01:46,671][1653645] Updated weights for policy 0, policy_version 194435 (0.0039) [2024-06-15 14:01:48,504][1653645] Updated weights for policy 0, policy_version 194502 (0.0096) [2024-06-15 14:01:50,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 45328.8, 300 sec: 43986.9). Total num frames: 398458880. Throughput: 0: 11286.7. Samples: 99652608. 
Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:50,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:01:52,405][1653645] Updated weights for policy 0, policy_version 194592 (0.0013) [2024-06-15 14:01:55,958][1648982] Fps is (10 sec: 42596.3, 60 sec: 43691.9, 300 sec: 43209.3). Total num frames: 398589952. Throughput: 0: 10956.7. Samples: 99718144. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:01:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:01:57,205][1653645] Updated weights for policy 0, policy_version 194646 (0.0012) [2024-06-15 14:01:59,513][1653645] Updated weights for policy 0, policy_version 194736 (0.0143) [2024-06-15 14:02:00,957][1648982] Fps is (10 sec: 45877.6, 60 sec: 45875.3, 300 sec: 43764.7). Total num frames: 398917632. Throughput: 0: 11195.8. Samples: 99786752. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:02:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:02:01,205][1653645] Updated weights for policy 0, policy_version 194789 (0.0021) [2024-06-15 14:02:04,653][1653645] Updated weights for policy 0, policy_version 194848 (0.0032) [2024-06-15 14:02:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 399114240. Throughput: 0: 11139.2. Samples: 99821568. Policy #0 lag: (min: 11.0, avg: 73.0, max: 267.0) [2024-06-15 14:02:05,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:02:08,264][1653645] Updated weights for policy 0, policy_version 194882 (0.0013) [2024-06-15 14:02:09,416][1653645] Updated weights for policy 0, policy_version 194935 (0.0015) [2024-06-15 14:02:10,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 45875.2, 300 sec: 43320.4). Total num frames: 399310848. Throughput: 0: 11127.4. Samples: 99889664. 
Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:10,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:02:11,086][1653645] Updated weights for policy 0, policy_version 194992 (0.0013) [2024-06-15 14:02:11,957][1651596] Signal inference workers to stop experience collection... (10100 times) [2024-06-15 14:02:12,011][1653645] Updated weights for policy 0, policy_version 195027 (0.0015) [2024-06-15 14:02:12,111][1653645] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-06-15 14:02:12,310][1651596] Signal inference workers to resume experience collection... (10100 times) [2024-06-15 14:02:12,311][1653645] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-06-15 14:02:15,391][1653645] Updated weights for policy 0, policy_version 195088 (0.0077) [2024-06-15 14:02:15,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 399572992. Throughput: 0: 11286.7. Samples: 99962880. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:15,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:02:19,483][1653645] Updated weights for policy 0, policy_version 195157 (0.0013) [2024-06-15 14:02:20,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 45876.5, 300 sec: 43431.5). Total num frames: 399769600. Throughput: 0: 11207.1. Samples: 99997184. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:20,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:02:21,427][1653645] Updated weights for policy 0, policy_version 195224 (0.0059) [2024-06-15 14:02:23,739][1653645] Updated weights for policy 0, policy_version 195296 (0.0013) [2024-06-15 14:02:24,435][1653645] Updated weights for policy 0, policy_version 195325 (0.0012) [2024-06-15 14:02:25,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 400031744. Throughput: 0: 11104.7. Samples: 100058624. 
Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:25,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:02:27,365][1653645] Updated weights for policy 0, policy_version 195377 (0.0012) [2024-06-15 14:02:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 400195584. Throughput: 0: 11389.1. Samples: 100136960. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:30,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 14:02:31,772][1653645] Updated weights for policy 0, policy_version 195440 (0.0013) [2024-06-15 14:02:33,311][1653645] Updated weights for policy 0, policy_version 195475 (0.0013) [2024-06-15 14:02:34,775][1653645] Updated weights for policy 0, policy_version 195522 (0.0012) [2024-06-15 14:02:35,915][1653645] Updated weights for policy 0, policy_version 195574 (0.0012) [2024-06-15 14:02:35,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 400523264. Throughput: 0: 11377.9. Samples: 100164608. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:35,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 14:02:39,431][1653645] Updated weights for policy 0, policy_version 195641 (0.0022) [2024-06-15 14:02:40,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.8, 300 sec: 44209.2). Total num frames: 400687104. Throughput: 0: 11412.0. Samples: 100231680. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:02:44,019][1653645] Updated weights for policy 0, policy_version 195704 (0.0013) [2024-06-15 14:02:45,833][1653645] Updated weights for policy 0, policy_version 195760 (0.0012) [2024-06-15 14:02:45,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 45874.9, 300 sec: 43542.5). Total num frames: 400916480. Throughput: 0: 11207.0. Samples: 100291072. 
Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:02:47,594][1653645] Updated weights for policy 0, policy_version 195796 (0.0012) [2024-06-15 14:02:50,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44237.1, 300 sec: 44098.0). Total num frames: 401113088. Throughput: 0: 11070.6. Samples: 100319744. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:50,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 14:02:51,741][1653645] Updated weights for policy 0, policy_version 195888 (0.0012) [2024-06-15 14:02:55,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 44236.8, 300 sec: 43542.5). Total num frames: 401244160. Throughput: 0: 11127.4. Samples: 100390400. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:02:55,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 14:02:56,253][1653645] Updated weights for policy 0, policy_version 195936 (0.0013) [2024-06-15 14:02:56,263][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000195936_401276928.pth... [2024-06-15 14:02:56,378][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000190784_390725632.pth [2024-06-15 14:02:58,107][1653645] Updated weights for policy 0, policy_version 196004 (0.0106) [2024-06-15 14:03:00,011][1653645] Updated weights for policy 0, policy_version 196050 (0.0057) [2024-06-15 14:03:00,311][1651596] Signal inference workers to stop experience collection... (10150 times) [2024-06-15 14:03:00,364][1653645] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-06-15 14:03:00,505][1651596] Signal inference workers to resume experience collection... (10150 times) [2024-06-15 14:03:00,506][1653645] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-06-15 14:03:00,958][1648982] Fps is (10 sec: 49150.2, 60 sec: 44782.6, 300 sec: 43986.8). Total num frames: 401604608. 
Throughput: 0: 10820.2. Samples: 100449792. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:03:00,959][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 14:03:03,280][1653645] Updated weights for policy 0, policy_version 196114 (0.0013) [2024-06-15 14:03:05,958][1648982] Fps is (10 sec: 49154.0, 60 sec: 43690.8, 300 sec: 43987.0). Total num frames: 401735680. Throughput: 0: 10979.6. Samples: 100491264. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:03:05,958][1648982] Avg episode reward: [(0, '36.790')] [2024-06-15 14:03:07,075][1653645] Updated weights for policy 0, policy_version 196179 (0.0017) [2024-06-15 14:03:09,013][1653645] Updated weights for policy 0, policy_version 196227 (0.0012) [2024-06-15 14:03:10,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 44782.9, 300 sec: 43542.5). Total num frames: 401997824. Throughput: 0: 10979.5. Samples: 100552704. Policy #0 lag: (min: 5.0, avg: 87.4, max: 261.0) [2024-06-15 14:03:10,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:03:11,639][1653645] Updated weights for policy 0, policy_version 196304 (0.0013) [2024-06-15 14:03:12,632][1653645] Updated weights for policy 0, policy_version 196352 (0.0011) [2024-06-15 14:03:15,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 402194432. Throughput: 0: 10763.4. Samples: 100621312. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:03:16,525][1653645] Updated weights for policy 0, policy_version 196416 (0.0014) [2024-06-15 14:03:20,054][1653645] Updated weights for policy 0, policy_version 196480 (0.0056) [2024-06-15 14:03:20,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.6, 300 sec: 43764.7). Total num frames: 402423808. Throughput: 0: 10911.2. Samples: 100655616. 
Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:03:23,334][1653645] Updated weights for policy 0, policy_version 196560 (0.0012) [2024-06-15 14:03:25,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 402653184. Throughput: 0: 10706.5. Samples: 100713472. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:03:27,899][1653645] Updated weights for policy 0, policy_version 196627 (0.0013) [2024-06-15 14:03:28,786][1653645] Updated weights for policy 0, policy_version 196667 (0.0014) [2024-06-15 14:03:30,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 402849792. Throughput: 0: 11070.6. Samples: 100789248. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:03:31,094][1653645] Updated weights for policy 0, policy_version 196706 (0.0013) [2024-06-15 14:03:32,792][1653645] Updated weights for policy 0, policy_version 196752 (0.0013) [2024-06-15 14:03:35,070][1653645] Updated weights for policy 0, policy_version 196816 (0.0013) [2024-06-15 14:03:35,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 403177472. Throughput: 0: 11104.7. Samples: 100819456. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:03:39,335][1653645] Updated weights for policy 0, policy_version 196869 (0.0012) [2024-06-15 14:03:40,785][1653645] Updated weights for policy 0, policy_version 196928 (0.0011) [2024-06-15 14:03:40,962][1648982] Fps is (10 sec: 45856.9, 60 sec: 43687.8, 300 sec: 44432.7). Total num frames: 403308544. Throughput: 0: 11103.8. Samples: 100890112. 
Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:40,964][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:03:43,672][1653645] Updated weights for policy 0, policy_version 196988 (0.0021) [2024-06-15 14:03:45,591][1653645] Updated weights for policy 0, policy_version 197052 (0.0014) [2024-06-15 14:03:45,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44236.9, 300 sec: 44097.9). Total num frames: 403570688. Throughput: 0: 11047.9. Samples: 100946944. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:03:47,754][1653645] Updated weights for policy 0, policy_version 197104 (0.0013) [2024-06-15 14:03:50,958][1648982] Fps is (10 sec: 39337.1, 60 sec: 43144.4, 300 sec: 43986.9). Total num frames: 403701760. Throughput: 0: 10877.1. Samples: 100980736. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:50,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 14:03:51,392][1651596] Signal inference workers to stop experience collection... (10200 times) [2024-06-15 14:03:51,478][1653645] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-06-15 14:03:51,480][1653645] Updated weights for policy 0, policy_version 197141 (0.0013) [2024-06-15 14:03:51,708][1651596] Signal inference workers to resume experience collection... (10200 times) [2024-06-15 14:03:51,709][1653645] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-06-15 14:03:54,261][1653645] Updated weights for policy 0, policy_version 197204 (0.0014) [2024-06-15 14:03:55,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45329.2, 300 sec: 44431.2). Total num frames: 403963904. Throughput: 0: 11116.1. Samples: 101052928. 
Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:03:55,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:03:56,729][1653645] Updated weights for policy 0, policy_version 197280 (0.0017) [2024-06-15 14:03:59,842][1653645] Updated weights for policy 0, policy_version 197373 (0.0101) [2024-06-15 14:04:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 404226048. Throughput: 0: 11013.7. Samples: 101116928. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:04:00,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:04:03,445][1653645] Updated weights for policy 0, policy_version 197432 (0.0111) [2024-06-15 14:04:05,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 404357120. Throughput: 0: 11127.5. Samples: 101156352. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:04:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:04:06,972][1653645] Updated weights for policy 0, policy_version 197488 (0.0075) [2024-06-15 14:04:08,307][1653645] Updated weights for policy 0, policy_version 197537 (0.0102) [2024-06-15 14:04:10,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 404652032. Throughput: 0: 11172.9. Samples: 101216256. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:04:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:04:11,726][1653645] Updated weights for policy 0, policy_version 197616 (0.0122) [2024-06-15 14:04:15,202][1653645] Updated weights for policy 0, policy_version 197680 (0.0026) [2024-06-15 14:04:15,958][1648982] Fps is (10 sec: 52424.9, 60 sec: 44782.4, 300 sec: 44431.1). Total num frames: 404881408. Throughput: 0: 11070.4. Samples: 101287424. 
Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:04:15,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:04:17,772][1653645] Updated weights for policy 0, policy_version 197728 (0.0012) [2024-06-15 14:04:19,445][1653645] Updated weights for policy 0, policy_version 197781 (0.0014) [2024-06-15 14:04:20,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.3, 300 sec: 44653.4). Total num frames: 405143552. Throughput: 0: 11161.6. Samples: 101321728. Policy #0 lag: (min: 12.0, avg: 120.9, max: 268.0) [2024-06-15 14:04:20,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:04:22,340][1653645] Updated weights for policy 0, policy_version 197842 (0.0013) [2024-06-15 14:04:25,958][1648982] Fps is (10 sec: 42601.4, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 405307392. Throughput: 0: 11196.7. Samples: 101393920. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:04:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:04:26,328][1653645] Updated weights for policy 0, policy_version 197920 (0.0013) [2024-06-15 14:04:28,175][1653645] Updated weights for policy 0, policy_version 197968 (0.0013) [2024-06-15 14:04:29,105][1653645] Updated weights for policy 0, policy_version 198016 (0.0017) [2024-06-15 14:04:30,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.3, 300 sec: 44986.6). Total num frames: 405602304. Throughput: 0: 11480.2. Samples: 101463552. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:04:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:04:31,628][1653645] Updated weights for policy 0, policy_version 198077 (0.0013) [2024-06-15 14:04:35,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 405798912. Throughput: 0: 11366.4. Samples: 101492224. 
Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:04:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:04:37,259][1653645] Updated weights for policy 0, policy_version 198147 (0.0013) [2024-06-15 14:04:38,670][1653645] Updated weights for policy 0, policy_version 198208 (0.0013) [2024-06-15 14:04:40,095][1651596] Signal inference workers to stop experience collection... (10250 times) [2024-06-15 14:04:40,185][1653645] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-06-15 14:04:40,357][1651596] Signal inference workers to resume experience collection... (10250 times) [2024-06-15 14:04:40,358][1653645] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-06-15 14:04:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44786.0, 300 sec: 44653.4). Total num frames: 405995520. Throughput: 0: 11343.7. Samples: 101563392. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:04:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:04:41,547][1653645] Updated weights for policy 0, policy_version 198267 (0.0014) [2024-06-15 14:04:43,675][1653645] Updated weights for policy 0, policy_version 198336 (0.0151) [2024-06-15 14:04:45,959][1648982] Fps is (10 sec: 39317.2, 60 sec: 43689.9, 300 sec: 44208.9). Total num frames: 406192128. Throughput: 0: 11252.3. Samples: 101623296. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:04:45,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:04:47,033][1653645] Updated weights for policy 0, policy_version 198394 (0.0078) [2024-06-15 14:04:50,522][1653645] Updated weights for policy 0, policy_version 198464 (0.0014) [2024-06-15 14:04:50,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 406454272. Throughput: 0: 11207.1. Samples: 101660672. 
Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:04:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:04:53,141][1653645] Updated weights for policy 0, policy_version 198518 (0.0013) [2024-06-15 14:04:54,831][1653645] Updated weights for policy 0, policy_version 198564 (0.0140) [2024-06-15 14:04:55,958][1648982] Fps is (10 sec: 52432.7, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 406716416. Throughput: 0: 11400.4. Samples: 101729280. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:04:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:04:55,970][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000198592_406716416.pth... [2024-06-15 14:04:56,025][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000193344_395968512.pth [2024-06-15 14:04:57,660][1653645] Updated weights for policy 0, policy_version 198625 (0.0015) [2024-06-15 14:05:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 406847488. Throughput: 0: 11321.1. Samples: 101796864. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:05:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:05:01,883][1653645] Updated weights for policy 0, policy_version 198688 (0.0014) [2024-06-15 14:05:03,444][1653645] Updated weights for policy 0, policy_version 198736 (0.0012) [2024-06-15 14:05:05,936][1653645] Updated weights for policy 0, policy_version 198816 (0.0013) [2024-06-15 14:05:05,966][1648982] Fps is (10 sec: 45845.9, 60 sec: 46962.1, 300 sec: 45096.7). Total num frames: 407175168. Throughput: 0: 11307.8. Samples: 101830656. 
Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:05:05,971][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:05:06,656][1653645] Updated weights for policy 0, policy_version 198848 (0.0011) [2024-06-15 14:05:09,776][1653645] Updated weights for policy 0, policy_version 198903 (0.0012) [2024-06-15 14:05:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 407371776. Throughput: 0: 11150.2. Samples: 101895680. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:05:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:05:14,171][1653645] Updated weights for policy 0, policy_version 198969 (0.0019) [2024-06-15 14:05:15,958][1648982] Fps is (10 sec: 42627.1, 60 sec: 45329.6, 300 sec: 44764.4). Total num frames: 407601152. Throughput: 0: 11059.2. Samples: 101961216. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:05:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:05:15,959][1653645] Updated weights for policy 0, policy_version 199038 (0.0012) [2024-06-15 14:05:18,702][1653645] Updated weights for policy 0, policy_version 199088 (0.0013) [2024-06-15 14:05:20,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.9, 300 sec: 44764.5). Total num frames: 407797760. Throughput: 0: 11093.4. Samples: 101991424. Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:05:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:05:21,582][1653645] Updated weights for policy 0, policy_version 199140 (0.0011) [2024-06-15 14:05:25,531][1653645] Updated weights for policy 0, policy_version 199202 (0.0011) [2024-06-15 14:05:25,960][1648982] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 407994368. Throughput: 0: 11195.7. Samples: 102067200. 
Policy #0 lag: (min: 15.0, avg: 139.2, max: 271.0) [2024-06-15 14:05:25,968][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:05:27,227][1653645] Updated weights for policy 0, policy_version 199264 (0.0012) [2024-06-15 14:05:27,343][1651596] Signal inference workers to stop experience collection... (10300 times) [2024-06-15 14:05:27,381][1653645] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-06-15 14:05:27,504][1651596] Signal inference workers to resume experience collection... (10300 times) [2024-06-15 14:05:27,504][1653645] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-06-15 14:05:28,660][1653645] Updated weights for policy 0, policy_version 199302 (0.0015) [2024-06-15 14:05:30,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 44782.9, 300 sec: 45208.7). Total num frames: 408289280. Throughput: 0: 11241.5. Samples: 102129152. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:05:30,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:05:32,592][1653645] Updated weights for policy 0, policy_version 199376 (0.0013) [2024-06-15 14:05:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 408420352. Throughput: 0: 11161.6. Samples: 102162944. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:05:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:05:36,870][1653645] Updated weights for policy 0, policy_version 199425 (0.0107) [2024-06-15 14:05:39,664][1653645] Updated weights for policy 0, policy_version 199522 (0.0013) [2024-06-15 14:05:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 408682496. Throughput: 0: 11082.1. Samples: 102227968. 
Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:05:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:05:42,135][1653645] Updated weights for policy 0, policy_version 199606 (0.0014) [2024-06-15 14:05:45,037][1653645] Updated weights for policy 0, policy_version 199633 (0.0022) [2024-06-15 14:05:45,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 45329.8, 300 sec: 44653.3). Total num frames: 408911872. Throughput: 0: 11036.4. Samples: 102293504. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:05:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:05:46,027][1653645] Updated weights for policy 0, policy_version 199677 (0.0061) [2024-06-15 14:05:49,470][1653645] Updated weights for policy 0, policy_version 199744 (0.0015) [2024-06-15 14:05:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44542.6). Total num frames: 409108480. Throughput: 0: 11129.2. Samples: 102331392. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:05:50,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 14:05:53,370][1653645] Updated weights for policy 0, policy_version 199825 (0.0014) [2024-06-15 14:05:54,479][1653645] Updated weights for policy 0, policy_version 199872 (0.0012) [2024-06-15 14:05:55,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 43691.0, 300 sec: 44653.3). Total num frames: 409337856. Throughput: 0: 10922.7. Samples: 102387200. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:05:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:05:58,415][1653645] Updated weights for policy 0, policy_version 199933 (0.0018) [2024-06-15 14:06:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 409501696. Throughput: 0: 11093.4. Samples: 102460416. 
Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:06:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:06:01,464][1653645] Updated weights for policy 0, policy_version 199994 (0.0079) [2024-06-15 14:06:03,736][1653645] Updated weights for policy 0, policy_version 200048 (0.0017) [2024-06-15 14:06:05,303][1653645] Updated weights for policy 0, policy_version 200101 (0.0012) [2024-06-15 14:06:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44788.0, 300 sec: 45097.7). Total num frames: 409862144. Throughput: 0: 11081.9. Samples: 102490112. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:06:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:06:09,762][1653645] Updated weights for policy 0, policy_version 200144 (0.0012) [2024-06-15 14:06:10,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 409993216. Throughput: 0: 10911.3. Samples: 102558208. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:06:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:06:12,823][1653645] Updated weights for policy 0, policy_version 200208 (0.0015) [2024-06-15 14:06:14,908][1653645] Updated weights for policy 0, policy_version 200292 (0.0109) [2024-06-15 14:06:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 44875.8). Total num frames: 410255360. Throughput: 0: 10922.7. Samples: 102620672. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:06:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:06:16,032][1651596] Signal inference workers to stop experience collection... (10350 times) [2024-06-15 14:06:16,056][1653645] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-06-15 14:06:16,322][1651596] Signal inference workers to resume experience collection... 
(10350 times) [2024-06-15 14:06:16,323][1653645] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-06-15 14:06:16,485][1653645] Updated weights for policy 0, policy_version 200337 (0.0010) [2024-06-15 14:06:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43144.4, 300 sec: 44209.0). Total num frames: 410386432. Throughput: 0: 10888.5. Samples: 102652928. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:06:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:06:21,819][1653645] Updated weights for policy 0, policy_version 200416 (0.0092) [2024-06-15 14:06:24,534][1653645] Updated weights for policy 0, policy_version 200450 (0.0024) [2024-06-15 14:06:25,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 410648576. Throughput: 0: 11025.1. Samples: 102724096. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:06:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:06:26,416][1653645] Updated weights for policy 0, policy_version 200514 (0.0109) [2024-06-15 14:06:27,796][1653645] Updated weights for policy 0, policy_version 200566 (0.0012) [2024-06-15 14:06:29,464][1653645] Updated weights for policy 0, policy_version 200631 (0.0012) [2024-06-15 14:06:30,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 410910720. Throughput: 0: 10843.0. Samples: 102781440. Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:06:30,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 14:06:34,874][1653645] Updated weights for policy 0, policy_version 200688 (0.0109) [2024-06-15 14:06:35,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 411041792. Throughput: 0: 10831.6. Samples: 102818816. 
Policy #0 lag: (min: 31.0, avg: 167.0, max: 287.0) [2024-06-15 14:06:35,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:06:38,773][1653645] Updated weights for policy 0, policy_version 200752 (0.0116) [2024-06-15 14:06:40,597][1653645] Updated weights for policy 0, policy_version 200803 (0.0011) [2024-06-15 14:06:40,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 411271168. Throughput: 0: 10843.0. Samples: 102875136. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:06:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:06:42,450][1653645] Updated weights for policy 0, policy_version 200889 (0.0012) [2024-06-15 14:06:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 43986.9). Total num frames: 411435008. Throughput: 0: 10592.7. Samples: 102937088. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:06:45,958][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 14:06:48,016][1653645] Updated weights for policy 0, policy_version 200948 (0.0189) [2024-06-15 14:06:50,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42052.2, 300 sec: 44209.1). Total num frames: 411631616. Throughput: 0: 10649.6. Samples: 102969344. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:06:50,975][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 14:06:51,844][1653645] Updated weights for policy 0, policy_version 201026 (0.0024) [2024-06-15 14:06:53,364][1653645] Updated weights for policy 0, policy_version 201097 (0.0099) [2024-06-15 14:06:55,958][1648982] Fps is (10 sec: 52426.9, 60 sec: 43690.4, 300 sec: 44208.9). Total num frames: 411959296. Throughput: 0: 10478.9. Samples: 103029760. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:06:55,959][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 14:06:55,969][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000201152_411959296.pth... 
[2024-06-15 14:06:56,070][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000195936_401276928.pth [2024-06-15 14:06:59,311][1653645] Updated weights for policy 0, policy_version 201168 (0.0015) [2024-06-15 14:07:00,971][1648982] Fps is (10 sec: 45816.7, 60 sec: 43135.3, 300 sec: 43985.0). Total num frames: 412090368. Throughput: 0: 10646.6. Samples: 103099904. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:00,971][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 14:07:03,047][1653645] Updated weights for policy 0, policy_version 201264 (0.0032) [2024-06-15 14:07:03,152][1651596] Signal inference workers to stop experience collection... (10400 times) [2024-06-15 14:07:03,202][1653645] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-06-15 14:07:03,354][1651596] Signal inference workers to resume experience collection... (10400 times) [2024-06-15 14:07:03,354][1653645] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-06-15 14:07:04,483][1653645] Updated weights for policy 0, policy_version 201331 (0.0013) [2024-06-15 14:07:05,887][1653645] Updated weights for policy 0, policy_version 201402 (0.0021) [2024-06-15 14:07:05,958][1648982] Fps is (10 sec: 49154.1, 60 sec: 43144.5, 300 sec: 44542.3). Total num frames: 412450816. Throughput: 0: 10774.8. Samples: 103137792. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:05,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:07:10,958][1648982] Fps is (10 sec: 39371.7, 60 sec: 41506.1, 300 sec: 43764.7). Total num frames: 412483584. Throughput: 0: 10649.6. Samples: 103203328. 
Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:07:12,199][1653645] Updated weights for policy 0, policy_version 201465 (0.0013) [2024-06-15 14:07:14,861][1653645] Updated weights for policy 0, policy_version 201523 (0.0012) [2024-06-15 14:07:15,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 412811264. Throughput: 0: 10820.2. Samples: 103268352. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:15,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:07:16,016][1653645] Updated weights for policy 0, policy_version 201584 (0.0087) [2024-06-15 14:07:17,805][1653645] Updated weights for policy 0, policy_version 201656 (0.0142) [2024-06-15 14:07:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 413007872. Throughput: 0: 10581.3. Samples: 103294976. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:07:24,637][1653645] Updated weights for policy 0, policy_version 201714 (0.0062) [2024-06-15 14:07:25,864][1653645] Updated weights for policy 0, policy_version 201761 (0.0101) [2024-06-15 14:07:25,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 44098.0). Total num frames: 413204480. Throughput: 0: 11025.1. Samples: 103371264. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:07:27,753][1653645] Updated weights for policy 0, policy_version 201825 (0.0019) [2024-06-15 14:07:29,200][1653645] Updated weights for policy 0, policy_version 201874 (0.0022) [2024-06-15 14:07:30,059][1653645] Updated weights for policy 0, policy_version 201919 (0.0032) [2024-06-15 14:07:30,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 413532160. 
Throughput: 0: 10956.8. Samples: 103430144. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:07:35,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42052.3, 300 sec: 43653.7). Total num frames: 413564928. Throughput: 0: 11150.2. Samples: 103471104. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:07:37,059][1653645] Updated weights for policy 0, policy_version 201987 (0.0013) [2024-06-15 14:07:38,375][1653645] Updated weights for policy 0, policy_version 202048 (0.0015) [2024-06-15 14:07:40,764][1653645] Updated weights for policy 0, policy_version 202144 (0.0013) [2024-06-15 14:07:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 413990912. Throughput: 0: 11207.2. Samples: 103534080. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:07:45,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 414056448. Throughput: 0: 11369.6. Samples: 103611392. Policy #0 lag: (min: 15.0, avg: 105.0, max: 271.0) [2024-06-15 14:07:45,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:07:47,123][1651596] Signal inference workers to stop experience collection... (10450 times) [2024-06-15 14:07:47,210][1653645] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-06-15 14:07:47,345][1651596] Signal inference workers to resume experience collection... 
(10450 times) [2024-06-15 14:07:47,345][1653645] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-06-15 14:07:47,347][1653645] Updated weights for policy 0, policy_version 202192 (0.0014) [2024-06-15 14:07:49,895][1653645] Updated weights for policy 0, policy_version 202300 (0.0014) [2024-06-15 14:07:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 414384128. Throughput: 0: 11252.6. Samples: 103644160. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:07:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:07:51,086][1653645] Updated weights for policy 0, policy_version 202352 (0.0013) [2024-06-15 14:07:52,850][1653645] Updated weights for policy 0, policy_version 202428 (0.0013) [2024-06-15 14:07:55,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 414580736. Throughput: 0: 11104.7. Samples: 103703040. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:07:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:07:59,709][1653645] Updated weights for policy 0, policy_version 202467 (0.0013) [2024-06-15 14:08:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44246.3, 300 sec: 44098.0). Total num frames: 414744576. Throughput: 0: 11355.1. Samples: 103779328. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:08:01,582][1653645] Updated weights for policy 0, policy_version 202560 (0.0014) [2024-06-15 14:08:04,588][1653645] Updated weights for policy 0, policy_version 202672 (0.0114) [2024-06-15 14:08:05,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 415105024. Throughput: 0: 11218.5. Samples: 103799808. 
Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:05,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 14:08:10,960][1653645] Updated weights for policy 0, policy_version 202692 (0.0012) [2024-06-15 14:08:10,977][1648982] Fps is (10 sec: 35973.9, 60 sec: 43676.4, 300 sec: 43761.8). Total num frames: 415105024. Throughput: 0: 11202.2. Samples: 103875584. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:10,978][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:08:12,686][1653645] Updated weights for policy 0, policy_version 202770 (0.0013) [2024-06-15 14:08:15,281][1653645] Updated weights for policy 0, policy_version 202871 (0.0164) [2024-06-15 14:08:15,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44783.1, 300 sec: 44320.1). Total num frames: 415498240. Throughput: 0: 11059.2. Samples: 103927808. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:08:17,236][1653645] Updated weights for policy 0, policy_version 202943 (0.0123) [2024-06-15 14:08:20,958][1648982] Fps is (10 sec: 52530.2, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 415629312. Throughput: 0: 10786.0. Samples: 103956480. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:08:24,937][1653645] Updated weights for policy 0, policy_version 203008 (0.0016) [2024-06-15 14:08:25,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 415825920. Throughput: 0: 11116.1. Samples: 104034304. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:08:26,505][1651596] Signal inference workers to stop experience collection... 
(10500 times) [2024-06-15 14:08:26,573][1653645] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-06-15 14:08:26,842][1651596] Signal inference workers to resume experience collection... (10500 times) [2024-06-15 14:08:26,843][1653645] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-06-15 14:08:27,722][1653645] Updated weights for policy 0, policy_version 203104 (0.0127) [2024-06-15 14:08:29,280][1653645] Updated weights for policy 0, policy_version 203152 (0.0012) [2024-06-15 14:08:30,209][1653645] Updated weights for policy 0, policy_version 203199 (0.0011) [2024-06-15 14:08:30,960][1648982] Fps is (10 sec: 52430.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 416153600. Throughput: 0: 10513.1. Samples: 104084480. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:30,961][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:08:35,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43144.5, 300 sec: 43543.2). Total num frames: 416153600. Throughput: 0: 10626.8. Samples: 104122368. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:08:38,073][1653645] Updated weights for policy 0, policy_version 203281 (0.0012) [2024-06-15 14:08:39,908][1653645] Updated weights for policy 0, policy_version 203360 (0.0013) [2024-06-15 14:08:40,711][1653645] Updated weights for policy 0, policy_version 203392 (0.0014) [2024-06-15 14:08:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 416546816. Throughput: 0: 10592.7. Samples: 104179712. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:08:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 416677888. Throughput: 0: 10399.3. Samples: 104247296. 
Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:08:47,940][1653645] Updated weights for policy 0, policy_version 203461 (0.0027) [2024-06-15 14:08:49,222][1653645] Updated weights for policy 0, policy_version 203508 (0.0012) [2024-06-15 14:08:50,269][1653645] Updated weights for policy 0, policy_version 203555 (0.0014) [2024-06-15 14:08:50,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 43986.9). Total num frames: 416940032. Throughput: 0: 10865.8. Samples: 104288768. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:08:52,099][1653645] Updated weights for policy 0, policy_version 203645 (0.0014) [2024-06-15 14:08:54,107][1653645] Updated weights for policy 0, policy_version 203704 (0.0012) [2024-06-15 14:08:55,969][1648982] Fps is (10 sec: 52371.1, 60 sec: 43682.6, 300 sec: 43985.2). Total num frames: 417202176. Throughput: 0: 10480.9. Samples: 104347136. Policy #0 lag: (min: 10.0, avg: 62.5, max: 266.0) [2024-06-15 14:08:55,970][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:08:55,974][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000203712_417202176.pth... [2024-06-15 14:08:56,022][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000198592_406716416.pth [2024-06-15 14:09:00,348][1653645] Updated weights for policy 0, policy_version 203748 (0.0016) [2024-06-15 14:09:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43144.4, 300 sec: 43986.9). Total num frames: 417333248. Throughput: 0: 10899.9. Samples: 104418304. 
Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:09:01,772][1653645] Updated weights for policy 0, policy_version 203809 (0.0015) [2024-06-15 14:09:03,370][1653645] Updated weights for policy 0, policy_version 203875 (0.0149) [2024-06-15 14:09:05,958][1648982] Fps is (10 sec: 45924.6, 60 sec: 42598.2, 300 sec: 44097.9). Total num frames: 417660928. Throughput: 0: 10877.2. Samples: 104445952. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:09:06,001][1653645] Updated weights for policy 0, policy_version 203952 (0.0015) [2024-06-15 14:09:10,958][1648982] Fps is (10 sec: 39320.0, 60 sec: 43704.6, 300 sec: 43542.6). Total num frames: 417726464. Throughput: 0: 10808.8. Samples: 104520704. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:10,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:09:11,259][1651596] Signal inference workers to stop experience collection... (10550 times) [2024-06-15 14:09:11,301][1653645] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-06-15 14:09:11,518][1651596] Signal inference workers to resume experience collection... (10550 times) [2024-06-15 14:09:11,534][1653645] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-06-15 14:09:12,848][1653645] Updated weights for policy 0, policy_version 204033 (0.0015) [2024-06-15 14:09:14,772][1653645] Updated weights for policy 0, policy_version 204128 (0.0014) [2024-06-15 14:09:15,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 418119680. Throughput: 0: 10968.2. Samples: 104578048. 
Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:09:17,353][1653645] Updated weights for policy 0, policy_version 204195 (0.0041) [2024-06-15 14:09:20,958][1648982] Fps is (10 sec: 52431.1, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 418250752. Throughput: 0: 10945.4. Samples: 104614912. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:09:23,828][1653645] Updated weights for policy 0, policy_version 204261 (0.0013) [2024-06-15 14:09:25,325][1653645] Updated weights for policy 0, policy_version 204341 (0.0027) [2024-06-15 14:09:25,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 45328.8, 300 sec: 43875.7). Total num frames: 418545664. Throughput: 0: 11343.6. Samples: 104690176. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:09:26,699][1653645] Updated weights for policy 0, policy_version 204412 (0.0022) [2024-06-15 14:09:29,114][1653645] Updated weights for policy 0, policy_version 204472 (0.0022) [2024-06-15 14:09:30,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 418775040. Throughput: 0: 11298.1. Samples: 104755712. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:09:34,919][1653645] Updated weights for policy 0, policy_version 204534 (0.0094) [2024-06-15 14:09:35,659][1653645] Updated weights for policy 0, policy_version 204563 (0.0012) [2024-06-15 14:09:35,958][1648982] Fps is (10 sec: 42599.9, 60 sec: 46967.5, 300 sec: 43986.9). Total num frames: 418971648. Throughput: 0: 11264.0. Samples: 104795648. 
Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:09:37,213][1653645] Updated weights for policy 0, policy_version 204628 (0.0014) [2024-06-15 14:09:39,862][1653645] Updated weights for policy 0, policy_version 204675 (0.0011) [2024-06-15 14:09:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44209.2). Total num frames: 419233792. Throughput: 0: 11369.2. Samples: 104858624. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:09:41,290][1653645] Updated weights for policy 0, policy_version 204736 (0.0014) [2024-06-15 14:09:45,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 419332096. Throughput: 0: 11434.7. Samples: 104932864. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:45,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:09:47,413][1653645] Updated weights for policy 0, policy_version 204806 (0.0015) [2024-06-15 14:09:49,927][1653645] Updated weights for policy 0, policy_version 204899 (0.0038) [2024-06-15 14:09:50,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45875.3, 300 sec: 43986.9). Total num frames: 419692544. Throughput: 0: 11423.4. Samples: 104960000. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:50,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 14:09:51,836][1651596] Signal inference workers to stop experience collection... (10600 times) [2024-06-15 14:09:51,896][1653645] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-06-15 14:09:52,245][1651596] Signal inference workers to resume experience collection... 
(10600 times) [2024-06-15 14:09:52,246][1653645] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-06-15 14:09:52,402][1653645] Updated weights for policy 0, policy_version 204947 (0.0029) [2024-06-15 14:09:53,268][1653645] Updated weights for policy 0, policy_version 204992 (0.0013) [2024-06-15 14:09:55,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43698.7, 300 sec: 43986.9). Total num frames: 419823616. Throughput: 0: 11173.1. Samples: 105023488. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:09:55,960][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:09:58,620][1653645] Updated weights for policy 0, policy_version 205051 (0.0013) [2024-06-15 14:10:00,307][1653645] Updated weights for policy 0, policy_version 205111 (0.0014) [2024-06-15 14:10:00,957][1648982] Fps is (10 sec: 42599.0, 60 sec: 46421.5, 300 sec: 43876.8). Total num frames: 420118528. Throughput: 0: 11389.2. Samples: 105090560. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:10:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:10:01,773][1653645] Updated weights for policy 0, policy_version 205178 (0.0013) [2024-06-15 14:10:05,075][1653645] Updated weights for policy 0, policy_version 205232 (0.0015) [2024-06-15 14:10:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44783.2, 300 sec: 43986.9). Total num frames: 420347904. Throughput: 0: 11252.6. Samples: 105121280. Policy #0 lag: (min: 3.0, avg: 69.5, max: 259.0) [2024-06-15 14:10:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:10:09,920][1653645] Updated weights for policy 0, policy_version 205265 (0.0014) [2024-06-15 14:10:10,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 45875.5, 300 sec: 43653.6). Total num frames: 420478976. Throughput: 0: 11309.6. Samples: 105199104. 
Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:10:11,773][1653645] Updated weights for policy 0, policy_version 205348 (0.0102) [2024-06-15 14:10:13,463][1653645] Updated weights for policy 0, policy_version 205411 (0.0012) [2024-06-15 14:10:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 420741120. Throughput: 0: 11013.7. Samples: 105251328. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:15,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:10:16,623][1653645] Updated weights for policy 0, policy_version 205472 (0.0018) [2024-06-15 14:10:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 420872192. Throughput: 0: 10808.9. Samples: 105282048. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:20,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:10:22,258][1653645] Updated weights for policy 0, policy_version 205522 (0.0032) [2024-06-15 14:10:24,358][1653645] Updated weights for policy 0, policy_version 205602 (0.0016) [2024-06-15 14:10:25,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 44237.1, 300 sec: 43764.7). Total num frames: 421199872. Throughput: 0: 10854.4. Samples: 105347072. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:10:26,416][1653645] Updated weights for policy 0, policy_version 205696 (0.0016) [2024-06-15 14:10:30,104][1653645] Updated weights for policy 0, policy_version 205758 (0.0015) [2024-06-15 14:10:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 421396480. Throughput: 0: 10547.2. Samples: 105407488. 
Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:30,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:10:35,267][1653645] Updated weights for policy 0, policy_version 205824 (0.0015) [2024-06-15 14:10:35,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 42598.5, 300 sec: 43542.6). Total num frames: 421527552. Throughput: 0: 10831.7. Samples: 105447424. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:10:36,899][1653645] Updated weights for policy 0, policy_version 205886 (0.0011) [2024-06-15 14:10:37,272][1651596] Signal inference workers to stop experience collection... (10650 times) [2024-06-15 14:10:37,368][1653645] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-06-15 14:10:37,483][1651596] Signal inference workers to resume experience collection... (10650 times) [2024-06-15 14:10:37,485][1653645] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-06-15 14:10:38,056][1653645] Updated weights for policy 0, policy_version 205926 (0.0117) [2024-06-15 14:10:40,911][1653645] Updated weights for policy 0, policy_version 205968 (0.0014) [2024-06-15 14:10:40,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 43764.7). Total num frames: 421822464. Throughput: 0: 10752.0. Samples: 105507328. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:10:44,915][1653645] Updated weights for policy 0, policy_version 206019 (0.0012) [2024-06-15 14:10:45,974][1648982] Fps is (10 sec: 49070.9, 60 sec: 44770.8, 300 sec: 43762.3). Total num frames: 422019072. Throughput: 0: 10975.5. Samples: 105584640. 
Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:45,975][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:10:47,061][1653645] Updated weights for policy 0, policy_version 206096 (0.0015) [2024-06-15 14:10:49,258][1653645] Updated weights for policy 0, policy_version 206160 (0.0012) [2024-06-15 14:10:50,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 422313984. Throughput: 0: 10911.3. Samples: 105612288. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:10:52,727][1653645] Updated weights for policy 0, policy_version 206209 (0.0014) [2024-06-15 14:10:54,150][1653645] Updated weights for policy 0, policy_version 206272 (0.0046) [2024-06-15 14:10:55,958][1648982] Fps is (10 sec: 42668.4, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 422445056. Throughput: 0: 10581.3. Samples: 105675264. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:10:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:10:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000206272_422445056.pth... [2024-06-15 14:10:56,024][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000201152_411959296.pth [2024-06-15 14:10:56,027][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000206272_422445056.pth [2024-06-15 14:10:57,640][1653645] Updated weights for policy 0, policy_version 206328 (0.0019) [2024-06-15 14:10:59,295][1653645] Updated weights for policy 0, policy_version 206369 (0.0011) [2024-06-15 14:11:00,522][1653645] Updated weights for policy 0, policy_version 206416 (0.0017) [2024-06-15 14:11:00,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 44236.4, 300 sec: 43764.7). Total num frames: 422772736. Throughput: 0: 11127.4. Samples: 105752064. 
Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:11:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:11:01,439][1653645] Updated weights for policy 0, policy_version 206463 (0.0011) [2024-06-15 14:11:04,632][1653645] Updated weights for policy 0, policy_version 206523 (0.0012) [2024-06-15 14:11:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 422969344. Throughput: 0: 11241.2. Samples: 105787904. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:11:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:11:09,013][1653645] Updated weights for policy 0, policy_version 206583 (0.0014) [2024-06-15 14:11:10,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44782.8, 300 sec: 43764.7). Total num frames: 423165952. Throughput: 0: 11229.8. Samples: 105852416. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:11:10,959][1648982] Avg episode reward: [(0, '37.040')] [2024-06-15 14:11:11,243][1653645] Updated weights for policy 0, policy_version 206640 (0.0135) [2024-06-15 14:11:12,219][1653645] Updated weights for policy 0, policy_version 206661 (0.0013) [2024-06-15 14:11:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 423362560. Throughput: 0: 11366.4. Samples: 105918976. Policy #0 lag: (min: 15.0, avg: 82.1, max: 271.0) [2024-06-15 14:11:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:11:16,564][1653645] Updated weights for policy 0, policy_version 206752 (0.0013) [2024-06-15 14:11:19,916][1653645] Updated weights for policy 0, policy_version 206816 (0.0012) [2024-06-15 14:11:20,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 423624704. Throughput: 0: 11127.4. Samples: 105948160. 
Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:11:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:11:22,099][1653645] Updated weights for policy 0, policy_version 206850 (0.0011) [2024-06-15 14:11:23,441][1653645] Updated weights for policy 0, policy_version 206912 (0.0012) [2024-06-15 14:11:24,977][1651596] Signal inference workers to stop experience collection... (10700 times) [2024-06-15 14:11:25,023][1653645] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-06-15 14:11:25,231][1651596] Signal inference workers to resume experience collection... (10700 times) [2024-06-15 14:11:25,237][1653645] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-06-15 14:11:25,437][1653645] Updated weights for policy 0, policy_version 206968 (0.0014) [2024-06-15 14:11:25,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 423886848. Throughput: 0: 11355.0. Samples: 106018304. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:11:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:11:28,802][1653645] Updated weights for policy 0, policy_version 207031 (0.0025) [2024-06-15 14:11:30,970][1648982] Fps is (10 sec: 39272.8, 60 sec: 43681.7, 300 sec: 43985.0). Total num frames: 424017920. Throughput: 0: 11105.7. Samples: 106084352. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:11:30,971][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:11:32,210][1653645] Updated weights for policy 0, policy_version 207100 (0.0013) [2024-06-15 14:11:35,106][1653645] Updated weights for policy 0, policy_version 207143 (0.0015) [2024-06-15 14:11:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45875.1, 300 sec: 44097.9). Total num frames: 424280064. Throughput: 0: 11218.5. Samples: 106117120. 
Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:11:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:11:36,122][1653645] Updated weights for policy 0, policy_version 207184 (0.0013) [2024-06-15 14:11:39,485][1653645] Updated weights for policy 0, policy_version 207235 (0.0112) [2024-06-15 14:11:40,597][1653645] Updated weights for policy 0, policy_version 207293 (0.0016) [2024-06-15 14:11:40,958][1648982] Fps is (10 sec: 52490.5, 60 sec: 45328.5, 300 sec: 44431.1). Total num frames: 424542208. Throughput: 0: 11298.0. Samples: 106183680. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:11:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:11:43,505][1653645] Updated weights for policy 0, policy_version 207352 (0.0014) [2024-06-15 14:11:45,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 45341.3, 300 sec: 44431.1). Total num frames: 424738816. Throughput: 0: 11207.1. Samples: 106256384. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:11:45,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:11:46,108][1653645] Updated weights for policy 0, policy_version 207399 (0.0139) [2024-06-15 14:11:48,333][1653645] Updated weights for policy 0, policy_version 207426 (0.0013) [2024-06-15 14:11:49,750][1653645] Updated weights for policy 0, policy_version 207483 (0.0012) [2024-06-15 14:11:50,958][1648982] Fps is (10 sec: 39324.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 424935424. Throughput: 0: 11116.1. Samples: 106288128. 
Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:11:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:11:51,900][1653645] Updated weights for policy 0, policy_version 207538 (0.0013) [2024-06-15 14:11:54,012][1653645] Updated weights for policy 0, policy_version 207584 (0.0012) [2024-06-15 14:11:55,601][1653645] Updated weights for policy 0, policy_version 207619 (0.0020) [2024-06-15 14:11:55,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 46421.2, 300 sec: 44544.2). Total num frames: 425230336. Throughput: 0: 11286.8. Samples: 106360320. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:11:55,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:11:56,906][1653645] Updated weights for policy 0, policy_version 207674 (0.0018) [2024-06-15 14:12:00,504][1653645] Updated weights for policy 0, policy_version 207728 (0.0033) [2024-06-15 14:12:00,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44783.2, 300 sec: 44098.0). Total num frames: 425459712. Throughput: 0: 11355.1. Samples: 106429952. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:12:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:12:02,530][1653645] Updated weights for policy 0, policy_version 207765 (0.0012) [2024-06-15 14:12:03,931][1653645] Updated weights for policy 0, policy_version 207809 (0.0016) [2024-06-15 14:12:04,976][1653645] Updated weights for policy 0, policy_version 207866 (0.0011) [2024-06-15 14:12:05,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 425721856. Throughput: 0: 11594.0. Samples: 106469888. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:12:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:12:08,250][1653645] Updated weights for policy 0, policy_version 207929 (0.0014) [2024-06-15 14:12:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.3, 300 sec: 44320.1). Total num frames: 425885696. 
Throughput: 0: 11480.2. Samples: 106534912. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:12:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:12:11,352][1653645] Updated weights for policy 0, policy_version 207974 (0.0012) [2024-06-15 14:12:13,796][1651596] Signal inference workers to stop experience collection... (10750 times) [2024-06-15 14:12:13,825][1653645] Updated weights for policy 0, policy_version 208019 (0.0029) [2024-06-15 14:12:13,853][1653645] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-06-15 14:12:13,990][1651596] Signal inference workers to resume experience collection... (10750 times) [2024-06-15 14:12:13,994][1653645] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-06-15 14:12:14,499][1653645] Updated weights for policy 0, policy_version 208057 (0.0012) [2024-06-15 14:12:15,854][1653645] Updated weights for policy 0, policy_version 208097 (0.0011) [2024-06-15 14:12:15,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 46967.5, 300 sec: 44653.4). Total num frames: 426180608. Throughput: 0: 11699.6. Samples: 106610688. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:12:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:12:18,930][1653645] Updated weights for policy 0, policy_version 208161 (0.0021) [2024-06-15 14:12:20,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45875.3, 300 sec: 44653.3). Total num frames: 426377216. Throughput: 0: 11673.6. Samples: 106642432. Policy #0 lag: (min: 61.0, avg: 183.8, max: 317.0) [2024-06-15 14:12:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:12:22,819][1653645] Updated weights for policy 0, policy_version 208227 (0.0035) [2024-06-15 14:12:25,959][1648982] Fps is (10 sec: 39315.5, 60 sec: 44781.8, 300 sec: 44208.8). Total num frames: 426573824. Throughput: 0: 11639.3. Samples: 106707456. 
Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:12:25,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:12:25,997][1653645] Updated weights for policy 0, policy_version 208292 (0.0011) [2024-06-15 14:12:27,640][1653645] Updated weights for policy 0, policy_version 208368 (0.0011) [2024-06-15 14:12:30,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 46977.2, 300 sec: 44986.6). Total num frames: 426835968. Throughput: 0: 11514.4. Samples: 106774528. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:12:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:12:31,420][1653645] Updated weights for policy 0, policy_version 208442 (0.0013) [2024-06-15 14:12:35,377][1653645] Updated weights for policy 0, policy_version 208508 (0.0028) [2024-06-15 14:12:35,958][1648982] Fps is (10 sec: 45881.6, 60 sec: 45875.1, 300 sec: 44209.0). Total num frames: 427032576. Throughput: 0: 11525.6. Samples: 106806784. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:12:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:12:38,311][1653645] Updated weights for policy 0, policy_version 208571 (0.0095) [2024-06-15 14:12:39,568][1653645] Updated weights for policy 0, policy_version 208631 (0.0028) [2024-06-15 14:12:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45875.7, 300 sec: 44875.5). Total num frames: 427294720. Throughput: 0: 11355.0. Samples: 106871296. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:12:40,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:12:42,127][1653645] Updated weights for policy 0, policy_version 208664 (0.0012) [2024-06-15 14:12:43,102][1653645] Updated weights for policy 0, policy_version 208704 (0.0011) [2024-06-15 14:12:45,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 44783.2, 300 sec: 44209.0). Total num frames: 427425792. Throughput: 0: 11389.2. Samples: 106942464. 
Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:12:45,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:12:49,039][1653645] Updated weights for policy 0, policy_version 208772 (0.0017) [2024-06-15 14:12:50,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 46421.3, 300 sec: 44542.3). Total num frames: 427720704. Throughput: 0: 11309.5. Samples: 106978816. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:12:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:12:51,158][1653645] Updated weights for policy 0, policy_version 208864 (0.0012) [2024-06-15 14:12:54,270][1653645] Updated weights for policy 0, policy_version 208912 (0.0048) [2024-06-15 14:12:55,291][1653645] Updated weights for policy 0, policy_version 208960 (0.0014) [2024-06-15 14:12:55,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 427950080. Throughput: 0: 11184.3. Samples: 107038208. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:12:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:12:55,978][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000208960_427950080.pth... [2024-06-15 14:12:56,019][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000203712_417202176.pth [2024-06-15 14:12:59,292][1653645] Updated weights for policy 0, policy_version 209021 (0.0012) [2024-06-15 14:13:00,453][1651596] Signal inference workers to stop experience collection... (10800 times) [2024-06-15 14:13:00,498][1653645] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-06-15 14:13:00,631][1651596] Signal inference workers to resume experience collection... (10800 times) [2024-06-15 14:13:00,632][1653645] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-06-15 14:13:00,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 428113920. 
Throughput: 0: 11161.6. Samples: 107112960. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:13:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:13:01,661][1653645] Updated weights for policy 0, policy_version 209076 (0.0011) [2024-06-15 14:13:03,202][1653645] Updated weights for policy 0, policy_version 209147 (0.0010) [2024-06-15 14:13:05,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 44782.9, 300 sec: 45100.7). Total num frames: 428408832. Throughput: 0: 11059.2. Samples: 107140096. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:13:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:13:06,074][1653645] Updated weights for policy 0, policy_version 209200 (0.0013) [2024-06-15 14:13:10,834][1653645] Updated weights for policy 0, policy_version 209248 (0.0013) [2024-06-15 14:13:10,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44236.5, 300 sec: 44209.0). Total num frames: 428539904. Throughput: 0: 11207.4. Samples: 107211776. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:13:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:13:12,527][1653645] Updated weights for policy 0, policy_version 209312 (0.0014) [2024-06-15 14:13:13,645][1653645] Updated weights for policy 0, policy_version 209345 (0.0011) [2024-06-15 14:13:14,881][1653645] Updated weights for policy 0, policy_version 209396 (0.0009) [2024-06-15 14:13:15,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.0, 300 sec: 44875.6). Total num frames: 428867584. Throughput: 0: 11150.2. Samples: 107276288. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:13:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:13:16,839][1653645] Updated weights for policy 0, policy_version 209430 (0.0048) [2024-06-15 14:13:20,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 43690.5, 300 sec: 44653.3). Total num frames: 428998656. Throughput: 0: 11093.3. Samples: 107305984. 
Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:13:20,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:13:22,254][1653645] Updated weights for policy 0, policy_version 209504 (0.0013) [2024-06-15 14:13:24,378][1653645] Updated weights for policy 0, policy_version 209587 (0.0039) [2024-06-15 14:13:25,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44784.0, 300 sec: 44431.2). Total num frames: 429260800. Throughput: 0: 11207.1. Samples: 107375616. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:13:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:13:26,695][1653645] Updated weights for policy 0, policy_version 209636 (0.0010) [2024-06-15 14:13:27,651][1653645] Updated weights for policy 0, policy_version 209665 (0.0012) [2024-06-15 14:13:28,919][1653645] Updated weights for policy 0, policy_version 209723 (0.0013) [2024-06-15 14:13:30,960][1648982] Fps is (10 sec: 52429.5, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 429522944. Throughput: 0: 11150.2. Samples: 107444224. Policy #0 lag: (min: 15.0, avg: 109.9, max: 271.0) [2024-06-15 14:13:30,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:13:34,887][1653645] Updated weights for policy 0, policy_version 209792 (0.0013) [2024-06-15 14:13:35,958][1648982] Fps is (10 sec: 49150.1, 60 sec: 45328.8, 300 sec: 44764.4). Total num frames: 429752320. Throughput: 0: 11389.0. Samples: 107491328. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:13:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:13:36,339][1653645] Updated weights for policy 0, policy_version 209856 (0.0017) [2024-06-15 14:13:39,970][1653645] Updated weights for policy 0, policy_version 209938 (0.0012) [2024-06-15 14:13:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 430014464. Throughput: 0: 11195.7. Samples: 107542016. 
Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:13:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:13:45,958][1648982] Fps is (10 sec: 29491.8, 60 sec: 43690.4, 300 sec: 44431.2). Total num frames: 430047232. Throughput: 0: 11161.6. Samples: 107615232. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:13:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:13:46,108][1651596] Signal inference workers to stop experience collection... (10850 times) [2024-06-15 14:13:46,138][1653645] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-06-15 14:13:46,338][1651596] Signal inference workers to resume experience collection... (10850 times) [2024-06-15 14:13:46,343][1653645] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-06-15 14:13:46,478][1653645] Updated weights for policy 0, policy_version 210019 (0.0051) [2024-06-15 14:13:48,639][1653645] Updated weights for policy 0, policy_version 210105 (0.0013) [2024-06-15 14:13:50,958][1648982] Fps is (10 sec: 29491.6, 60 sec: 43144.5, 300 sec: 44432.8). Total num frames: 430309376. Throughput: 0: 11013.7. Samples: 107635712. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:13:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:13:51,758][1653645] Updated weights for policy 0, policy_version 210145 (0.0013) [2024-06-15 14:13:52,679][1653645] Updated weights for policy 0, policy_version 210195 (0.0014) [2024-06-15 14:13:55,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 430571520. Throughput: 0: 11070.6. Samples: 107709952. 
Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:13:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:13:56,940][1653645] Updated weights for policy 0, policy_version 210241 (0.0014) [2024-06-15 14:13:58,711][1653645] Updated weights for policy 0, policy_version 210322 (0.0019) [2024-06-15 14:13:59,702][1653645] Updated weights for policy 0, policy_version 210368 (0.0017) [2024-06-15 14:14:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45329.2, 300 sec: 44653.4). Total num frames: 430833664. Throughput: 0: 11229.9. Samples: 107781632. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:14:03,669][1653645] Updated weights for policy 0, policy_version 210466 (0.0014) [2024-06-15 14:14:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44782.9, 300 sec: 45319.9). Total num frames: 431095808. Throughput: 0: 11150.3. Samples: 107807744. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:14:09,395][1653645] Updated weights for policy 0, policy_version 210516 (0.0013) [2024-06-15 14:14:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45329.3, 300 sec: 44542.3). Total num frames: 431259648. Throughput: 0: 11423.3. Samples: 107889664. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:10,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:14:11,724][1653645] Updated weights for policy 0, policy_version 210608 (0.0013) [2024-06-15 14:14:13,856][1653645] Updated weights for policy 0, policy_version 210640 (0.0060) [2024-06-15 14:14:15,762][1653645] Updated weights for policy 0, policy_version 210736 (0.0161) [2024-06-15 14:14:15,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 431587328. Throughput: 0: 11104.7. Samples: 107943936. 
Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:14:20,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.8, 300 sec: 44320.2). Total num frames: 431620096. Throughput: 0: 10888.6. Samples: 107981312. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:14:21,783][1653645] Updated weights for policy 0, policy_version 210800 (0.0013) [2024-06-15 14:14:22,871][1653645] Updated weights for policy 0, policy_version 210833 (0.0012) [2024-06-15 14:14:25,585][1653645] Updated weights for policy 0, policy_version 210883 (0.0028) [2024-06-15 14:14:25,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 431915008. Throughput: 0: 11195.8. Samples: 108045824. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:14:26,520][1651596] Signal inference workers to stop experience collection... (10900 times) [2024-06-15 14:14:26,588][1653645] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-06-15 14:14:26,730][1651596] Signal inference workers to resume experience collection... (10900 times) [2024-06-15 14:14:26,730][1653645] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-06-15 14:14:27,211][1653645] Updated weights for policy 0, policy_version 210961 (0.0012) [2024-06-15 14:14:28,092][1653645] Updated weights for policy 0, policy_version 211004 (0.0113) [2024-06-15 14:14:30,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 432144384. Throughput: 0: 11047.8. Samples: 108112384. 
Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:30,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:14:33,978][1653645] Updated weights for policy 0, policy_version 211066 (0.0012) [2024-06-15 14:14:35,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44237.1, 300 sec: 44653.3). Total num frames: 432406528. Throughput: 0: 11343.6. Samples: 108146176. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:14:37,093][1653645] Updated weights for policy 0, policy_version 211138 (0.0012) [2024-06-15 14:14:38,577][1653645] Updated weights for policy 0, policy_version 211200 (0.0012) [2024-06-15 14:14:40,177][1653645] Updated weights for policy 0, policy_version 211258 (0.0018) [2024-06-15 14:14:40,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 44237.0, 300 sec: 45208.8). Total num frames: 432668672. Throughput: 0: 11059.2. Samples: 108207616. Policy #0 lag: (min: 14.0, avg: 79.3, max: 270.0) [2024-06-15 14:14:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:14:45,036][1653645] Updated weights for policy 0, policy_version 211312 (0.0012) [2024-06-15 14:14:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45875.4, 300 sec: 44431.2). Total num frames: 432799744. Throughput: 0: 11025.1. Samples: 108277760. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:14:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:14:46,957][1653645] Updated weights for policy 0, policy_version 211362 (0.0012) [2024-06-15 14:14:50,035][1653645] Updated weights for policy 0, policy_version 211440 (0.0012) [2024-06-15 14:14:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 433061888. Throughput: 0: 11127.5. Samples: 108308480. 
Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:14:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:14:51,257][1653645] Updated weights for policy 0, policy_version 211477 (0.0011) [2024-06-15 14:14:55,488][1653645] Updated weights for policy 0, policy_version 211540 (0.0014) [2024-06-15 14:14:55,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45329.2, 300 sec: 44653.3). Total num frames: 433291264. Throughput: 0: 10934.1. Samples: 108381696. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:14:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:14:56,295][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000211584_433324032.pth... [2024-06-15 14:14:56,296][1653645] Updated weights for policy 0, policy_version 211584 (0.0010) [2024-06-15 14:14:56,348][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000206272_422445056.pth [2024-06-15 14:15:00,727][1653645] Updated weights for policy 0, policy_version 211650 (0.0013) [2024-06-15 14:15:00,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 433487872. Throughput: 0: 11252.6. Samples: 108450304. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:15:02,105][1653645] Updated weights for policy 0, policy_version 211714 (0.0013) [2024-06-15 14:15:03,435][1653645] Updated weights for policy 0, policy_version 211766 (0.0013) [2024-06-15 14:15:05,958][1648982] Fps is (10 sec: 42596.8, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 433717248. Throughput: 0: 11002.2. Samples: 108476416. 
Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:15:07,180][1653645] Updated weights for policy 0, policy_version 211824 (0.0137) [2024-06-15 14:15:09,832][1653645] Updated weights for policy 0, policy_version 211864 (0.0012) [2024-06-15 14:15:10,659][1653645] Updated weights for policy 0, policy_version 211904 (0.0016) [2024-06-15 14:15:10,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 433979392. Throughput: 0: 11298.2. Samples: 108554240. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:10,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:15:13,668][1651596] Signal inference workers to stop experience collection... (10950 times) [2024-06-15 14:15:13,704][1653645] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-06-15 14:15:13,896][1651596] Signal inference workers to resume experience collection... (10950 times) [2024-06-15 14:15:13,897][1653645] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-06-15 14:15:14,647][1653645] Updated weights for policy 0, policy_version 212000 (0.0122) [2024-06-15 14:15:15,389][1653645] Updated weights for policy 0, policy_version 212032 (0.0042) [2024-06-15 14:15:15,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 434241536. Throughput: 0: 11173.0. Samples: 108615168. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:15:20,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 434372608. Throughput: 0: 11161.6. Samples: 108648448. 
Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:15:21,683][1653645] Updated weights for policy 0, policy_version 212129 (0.0015) [2024-06-15 14:15:25,050][1653645] Updated weights for policy 0, policy_version 212208 (0.0012) [2024-06-15 14:15:25,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 45875.3, 300 sec: 44986.6). Total num frames: 434667520. Throughput: 0: 11491.6. Samples: 108724736. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:25,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:15:29,380][1653645] Updated weights for policy 0, policy_version 212289 (0.0013) [2024-06-15 14:15:30,348][1653645] Updated weights for policy 0, policy_version 212346 (0.0014) [2024-06-15 14:15:30,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.4, 300 sec: 45319.8). Total num frames: 434896896. Throughput: 0: 11320.9. Samples: 108787200. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:15:33,661][1653645] Updated weights for policy 0, policy_version 212386 (0.0012) [2024-06-15 14:15:35,834][1653645] Updated weights for policy 0, policy_version 212432 (0.0089) [2024-06-15 14:15:35,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 435060736. Throughput: 0: 11434.7. Samples: 108823040. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:15:37,550][1653645] Updated weights for policy 0, policy_version 212512 (0.0013) [2024-06-15 14:15:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.8, 300 sec: 45211.2). Total num frames: 435355648. Throughput: 0: 11332.2. Samples: 108891648. 
Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:15:40,972][1653645] Updated weights for policy 0, policy_version 212576 (0.0055) [2024-06-15 14:15:45,092][1653645] Updated weights for policy 0, policy_version 212640 (0.0013) [2024-06-15 14:15:45,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 435552256. Throughput: 0: 11252.6. Samples: 108956672. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:45,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 14:15:48,342][1653645] Updated weights for policy 0, policy_version 212736 (0.0013) [2024-06-15 14:15:49,649][1653645] Updated weights for policy 0, policy_version 212791 (0.0013) [2024-06-15 14:15:50,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 45875.0, 300 sec: 45319.8). Total num frames: 435814400. Throughput: 0: 11389.2. Samples: 108988928. Policy #0 lag: (min: 15.0, avg: 100.9, max: 271.0) [2024-06-15 14:15:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:15:52,752][1653645] Updated weights for policy 0, policy_version 212817 (0.0012) [2024-06-15 14:15:55,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 44236.6, 300 sec: 44653.4). Total num frames: 435945472. Throughput: 0: 11184.3. Samples: 109057536. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:15:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:15:56,850][1653645] Updated weights for policy 0, policy_version 212896 (0.0016) [2024-06-15 14:15:57,447][1653645] Updated weights for policy 0, policy_version 212927 (0.0056) [2024-06-15 14:15:58,845][1651596] Signal inference workers to stop experience collection... (11000 times) [2024-06-15 14:15:58,895][1653645] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-06-15 14:15:59,043][1651596] Signal inference workers to resume experience collection... 
(11000 times) [2024-06-15 14:15:59,044][1653645] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-06-15 14:15:59,527][1653645] Updated weights for policy 0, policy_version 212979 (0.0013) [2024-06-15 14:16:00,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 46421.2, 300 sec: 45097.6). Total num frames: 436273152. Throughput: 0: 11355.0. Samples: 109126144. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:00,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 14:16:01,342][1653645] Updated weights for policy 0, policy_version 213045 (0.0018) [2024-06-15 14:16:04,553][1653645] Updated weights for policy 0, policy_version 213094 (0.0139) [2024-06-15 14:16:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 45875.4, 300 sec: 45097.7). Total num frames: 436469760. Throughput: 0: 11332.3. Samples: 109158400. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:16:09,342][1653645] Updated weights for policy 0, policy_version 213183 (0.0013) [2024-06-15 14:16:10,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 436600832. Throughput: 0: 11138.8. Samples: 109225984. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:10,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:16:12,135][1653645] Updated weights for policy 0, policy_version 213233 (0.0012) [2024-06-15 14:16:15,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 436862976. Throughput: 0: 11047.8. Samples: 109284352. 
Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:15,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 14:16:16,153][1653645] Updated weights for policy 0, policy_version 213321 (0.0152) [2024-06-15 14:16:17,249][1653645] Updated weights for policy 0, policy_version 213374 (0.0020) [2024-06-15 14:16:20,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 437059584. Throughput: 0: 11081.9. Samples: 109321728. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:16:21,052][1653645] Updated weights for policy 0, policy_version 213424 (0.0013) [2024-06-15 14:16:22,782][1653645] Updated weights for policy 0, policy_version 213456 (0.0013) [2024-06-15 14:16:24,438][1653645] Updated weights for policy 0, policy_version 213525 (0.0016) [2024-06-15 14:16:25,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45329.0, 300 sec: 45321.7). Total num frames: 437387264. Throughput: 0: 11059.2. Samples: 109389312. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:16:28,008][1653645] Updated weights for policy 0, policy_version 213586 (0.0041) [2024-06-15 14:16:28,985][1653645] Updated weights for policy 0, policy_version 213632 (0.0013) [2024-06-15 14:16:30,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 437518336. Throughput: 0: 11081.9. Samples: 109455360. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:16:32,964][1653645] Updated weights for policy 0, policy_version 213693 (0.0025) [2024-06-15 14:16:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 44875.6). Total num frames: 437780480. Throughput: 0: 11207.2. Samples: 109493248. 
Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:16:35,986][1653645] Updated weights for policy 0, policy_version 213768 (0.0014) [2024-06-15 14:16:37,162][1653645] Updated weights for policy 0, policy_version 213824 (0.0012) [2024-06-15 14:16:40,730][1653645] Updated weights for policy 0, policy_version 213880 (0.0014) [2024-06-15 14:16:40,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 438042624. Throughput: 0: 11116.1. Samples: 109557760. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:16:43,995][1653645] Updated weights for policy 0, policy_version 213943 (0.0014) [2024-06-15 14:16:45,865][1651596] Signal inference workers to stop experience collection... (11050 times) [2024-06-15 14:16:45,935][1653645] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-06-15 14:16:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 438173696. Throughput: 0: 11298.2. Samples: 109634560. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:45,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:16:46,142][1651596] Signal inference workers to resume experience collection... (11050 times) [2024-06-15 14:16:46,143][1653645] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-06-15 14:16:46,567][1653645] Updated weights for policy 0, policy_version 213984 (0.0013) [2024-06-15 14:16:48,081][1653645] Updated weights for policy 0, policy_version 214048 (0.0014) [2024-06-15 14:16:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44237.1, 300 sec: 44875.5). Total num frames: 438468608. Throughput: 0: 11150.2. Samples: 109660160. 
Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:50,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 14:16:51,698][1653645] Updated weights for policy 0, policy_version 214137 (0.0016) [2024-06-15 14:16:55,175][1653645] Updated weights for policy 0, policy_version 214206 (0.0012) [2024-06-15 14:16:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 438697984. Throughput: 0: 11355.0. Samples: 109736960. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:16:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:16:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000214208_438697984.pth... [2024-06-15 14:16:56,000][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000208960_427950080.pth [2024-06-15 14:16:57,487][1653645] Updated weights for policy 0, policy_version 214242 (0.0011) [2024-06-15 14:16:59,587][1653645] Updated weights for policy 0, policy_version 214336 (0.0034) [2024-06-15 14:17:00,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 438960128. Throughput: 0: 11514.3. Samples: 109802496. Policy #0 lag: (min: 8.0, avg: 122.3, max: 264.0) [2024-06-15 14:17:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:17:02,779][1653645] Updated weights for policy 0, policy_version 214389 (0.0033) [2024-06-15 14:17:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 439123968. Throughput: 0: 11491.5. Samples: 109838848. 
Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:05,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 14:17:06,002][1653645] Updated weights for policy 0, policy_version 214432 (0.0011) [2024-06-15 14:17:08,731][1653645] Updated weights for policy 0, policy_version 214482 (0.0012) [2024-06-15 14:17:10,145][1653645] Updated weights for policy 0, policy_version 214547 (0.0013) [2024-06-15 14:17:10,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 48059.9, 300 sec: 45097.7). Total num frames: 439484416. Throughput: 0: 11571.2. Samples: 109910016. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:17:13,807][1653645] Updated weights for policy 0, policy_version 214611 (0.0013) [2024-06-15 14:17:14,834][1653645] Updated weights for policy 0, policy_version 214656 (0.0010) [2024-06-15 14:17:15,958][1648982] Fps is (10 sec: 49153.3, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 439615488. Throughput: 0: 11628.1. Samples: 109978624. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:15,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:17:18,424][1653645] Updated weights for policy 0, policy_version 214720 (0.0012) [2024-06-15 14:17:20,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 45875.2, 300 sec: 44875.7). Total num frames: 439812096. Throughput: 0: 11468.8. Samples: 110009344. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:17:22,153][1653645] Updated weights for policy 0, policy_version 214816 (0.0103) [2024-06-15 14:17:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 440041472. Throughput: 0: 11423.3. Samples: 110071808. 
Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:17:26,054][1653645] Updated weights for policy 0, policy_version 214880 (0.0025) [2024-06-15 14:17:29,052][1653645] Updated weights for policy 0, policy_version 214920 (0.0011) [2024-06-15 14:17:29,640][1651596] Signal inference workers to stop experience collection... (11100 times) [2024-06-15 14:17:29,694][1653645] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-06-15 14:17:29,918][1651596] Signal inference workers to resume experience collection... (11100 times) [2024-06-15 14:17:29,919][1653645] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-06-15 14:17:30,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 440270848. Throughput: 0: 11309.5. Samples: 110143488. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:17:31,643][1653645] Updated weights for policy 0, policy_version 214979 (0.0015) [2024-06-15 14:17:33,277][1653645] Updated weights for policy 0, policy_version 215041 (0.0012) [2024-06-15 14:17:34,385][1653645] Updated weights for policy 0, policy_version 215097 (0.0015) [2024-06-15 14:17:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 440532992. Throughput: 0: 11423.3. Samples: 110174208. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:17:37,627][1653645] Updated weights for policy 0, policy_version 215152 (0.0014) [2024-06-15 14:17:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 440664064. Throughput: 0: 11195.7. Samples: 110240768. 
Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:17:41,889][1653645] Updated weights for policy 0, policy_version 215200 (0.0012) [2024-06-15 14:17:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 440926208. Throughput: 0: 11138.8. Samples: 110303744. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:17:46,538][1653645] Updated weights for policy 0, policy_version 215328 (0.0029) [2024-06-15 14:17:49,935][1653645] Updated weights for policy 0, policy_version 215411 (0.0013) [2024-06-15 14:17:50,960][1648982] Fps is (10 sec: 52429.4, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 441188352. Throughput: 0: 10991.0. Samples: 110333440. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:50,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:17:53,496][1653645] Updated weights for policy 0, policy_version 215446 (0.0012) [2024-06-15 14:17:55,543][1653645] Updated weights for policy 0, policy_version 215498 (0.0013) [2024-06-15 14:17:55,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.1, 300 sec: 44986.6). Total num frames: 441384960. Throughput: 0: 11104.7. Samples: 110409728. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:17:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:17:56,915][1653645] Updated weights for policy 0, policy_version 215555 (0.0013) [2024-06-15 14:17:58,476][1653645] Updated weights for policy 0, policy_version 215616 (0.0013) [2024-06-15 14:18:00,957][1648982] Fps is (10 sec: 49152.8, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 441679872. Throughput: 0: 10990.9. Samples: 110473216. 
Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:18:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:18:01,154][1653645] Updated weights for policy 0, policy_version 215674 (0.0014) [2024-06-15 14:18:05,194][1653645] Updated weights for policy 0, policy_version 215744 (0.0090) [2024-06-15 14:18:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45329.3, 300 sec: 45097.7). Total num frames: 441843712. Throughput: 0: 11309.5. Samples: 110518272. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:18:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:18:07,776][1653645] Updated weights for policy 0, policy_version 215792 (0.0099) [2024-06-15 14:18:09,388][1653645] Updated weights for policy 0, policy_version 215856 (0.0018) [2024-06-15 14:18:10,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 442105856. Throughput: 0: 11252.6. Samples: 110578176. Policy #0 lag: (min: 15.0, avg: 137.4, max: 271.0) [2024-06-15 14:18:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:18:12,083][1653645] Updated weights for policy 0, policy_version 215920 (0.0051) [2024-06-15 14:18:15,356][1651596] Signal inference workers to stop experience collection... (11150 times) [2024-06-15 14:18:15,407][1653645] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-06-15 14:18:15,409][1653645] Updated weights for policy 0, policy_version 215958 (0.0020) [2024-06-15 14:18:15,512][1651596] Signal inference workers to resume experience collection... (11150 times) [2024-06-15 14:18:15,514][1653645] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-06-15 14:18:15,958][1648982] Fps is (10 sec: 49150.4, 60 sec: 45328.8, 300 sec: 45208.7). Total num frames: 442335232. Throughput: 0: 11582.5. Samples: 110664704. 
Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:15,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 14:18:17,782][1653645] Updated weights for policy 0, policy_version 216016 (0.0014) [2024-06-15 14:18:19,951][1653645] Updated weights for policy 0, policy_version 216112 (0.0110) [2024-06-15 14:18:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 46967.5, 300 sec: 45319.8). Total num frames: 442630144. Throughput: 0: 11571.2. Samples: 110694912. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:20,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:18:22,963][1653645] Updated weights for policy 0, policy_version 216164 (0.0013) [2024-06-15 14:18:25,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 442761216. Throughput: 0: 11537.1. Samples: 110759936. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:25,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:18:27,338][1653645] Updated weights for policy 0, policy_version 216208 (0.0128) [2024-06-15 14:18:29,434][1653645] Updated weights for policy 0, policy_version 216272 (0.0018) [2024-06-15 14:18:30,822][1653645] Updated weights for policy 0, policy_version 216340 (0.0208) [2024-06-15 14:18:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 46421.4, 300 sec: 45097.7). Total num frames: 443056128. Throughput: 0: 11662.2. Samples: 110828544. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:18:33,088][1653645] Updated weights for policy 0, policy_version 216387 (0.0014) [2024-06-15 14:18:35,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 443285504. Throughput: 0: 11764.6. Samples: 110862848. 
Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:18:38,700][1653645] Updated weights for policy 0, policy_version 216457 (0.0072) [2024-06-15 14:18:39,648][1653645] Updated weights for policy 0, policy_version 216509 (0.0014) [2024-06-15 14:18:40,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 46967.7, 300 sec: 45542.0). Total num frames: 443482112. Throughput: 0: 11696.4. Samples: 110936064. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:40,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 14:18:41,562][1653645] Updated weights for policy 0, policy_version 216579 (0.0015) [2024-06-15 14:18:44,858][1653645] Updated weights for policy 0, policy_version 216656 (0.0014) [2024-06-15 14:18:45,957][1648982] Fps is (10 sec: 49152.4, 60 sec: 47513.8, 300 sec: 45653.1). Total num frames: 443777024. Throughput: 0: 11707.7. Samples: 111000064. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:18:50,003][1653645] Updated weights for policy 0, policy_version 216705 (0.0012) [2024-06-15 14:18:50,958][1648982] Fps is (10 sec: 39320.0, 60 sec: 44782.7, 300 sec: 45097.6). Total num frames: 443875328. Throughput: 0: 11650.7. Samples: 111042560. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:50,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:18:52,312][1653645] Updated weights for policy 0, policy_version 216816 (0.0018) [2024-06-15 14:18:55,958][1648982] Fps is (10 sec: 42596.4, 60 sec: 46967.2, 300 sec: 45319.8). Total num frames: 444203008. Throughput: 0: 11616.6. Samples: 111100928. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:18:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:18:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000216896_444203008.pth... 
[2024-06-15 14:18:56,208][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000211584_433324032.pth [2024-06-15 14:18:56,661][1651596] Signal inference workers to stop experience collection... (11200 times) [2024-06-15 14:18:56,709][1653645] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-06-15 14:18:57,023][1651596] Signal inference workers to resume experience collection... (11200 times) [2024-06-15 14:18:57,038][1653645] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-06-15 14:18:57,040][1653645] Updated weights for policy 0, policy_version 216928 (0.0030) [2024-06-15 14:19:00,958][1648982] Fps is (10 sec: 45877.1, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 444334080. Throughput: 0: 11400.6. Samples: 111177728. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:19:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:19:01,722][1653645] Updated weights for policy 0, policy_version 216976 (0.0015) [2024-06-15 14:19:02,568][1653645] Updated weights for policy 0, policy_version 217023 (0.0013) [2024-06-15 14:19:03,783][1653645] Updated weights for policy 0, policy_version 217088 (0.0015) [2024-06-15 14:19:04,703][1653645] Updated weights for policy 0, policy_version 217140 (0.0080) [2024-06-15 14:19:05,958][1648982] Fps is (10 sec: 52430.7, 60 sec: 48059.7, 300 sec: 45653.1). Total num frames: 444727296. Throughput: 0: 11411.9. Samples: 111208448. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:19:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:19:08,227][1653645] Updated weights for policy 0, policy_version 217184 (0.0040) [2024-06-15 14:19:10,958][1648982] Fps is (10 sec: 52426.3, 60 sec: 45875.0, 300 sec: 44986.5). Total num frames: 444858368. Throughput: 0: 11559.7. Samples: 111280128. 
Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:19:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:19:12,765][1653645] Updated weights for policy 0, policy_version 217218 (0.0014) [2024-06-15 14:19:14,094][1653645] Updated weights for policy 0, policy_version 217280 (0.0022) [2024-06-15 14:19:15,716][1653645] Updated weights for policy 0, policy_version 217363 (0.0093) [2024-06-15 14:19:15,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 47513.6, 300 sec: 45986.2). Total num frames: 445186048. Throughput: 0: 11502.9. Samples: 111346176. Policy #0 lag: (min: 15.0, avg: 149.8, max: 271.0) [2024-06-15 14:19:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:19:18,844][1653645] Updated weights for policy 0, policy_version 217410 (0.0012) [2024-06-15 14:19:20,025][1653645] Updated weights for policy 0, policy_version 217463 (0.0012) [2024-06-15 14:19:20,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45875.0, 300 sec: 45653.0). Total num frames: 445382656. Throughput: 0: 11628.0. Samples: 111386112. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:19:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:19:24,251][1653645] Updated weights for policy 0, policy_version 217504 (0.0013) [2024-06-15 14:19:25,433][1653645] Updated weights for policy 0, policy_version 217568 (0.0015) [2024-06-15 14:19:25,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 47513.7, 300 sec: 45653.1). Total num frames: 445612032. Throughput: 0: 11730.5. Samples: 111463936. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:19:25,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 14:19:30,025][1653645] Updated weights for policy 0, policy_version 217667 (0.0018) [2024-06-15 14:19:30,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 445841408. Throughput: 0: 11741.8. Samples: 111528448. 
Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:19:30,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:19:35,168][1653645] Updated weights for policy 0, policy_version 217731 (0.0013) [2024-06-15 14:19:35,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 445972480. Throughput: 0: 11628.2. Samples: 111565824. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:19:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:19:36,600][1653645] Updated weights for policy 0, policy_version 217799 (0.0013) [2024-06-15 14:19:37,153][1651596] Signal inference workers to stop experience collection... (11250 times) [2024-06-15 14:19:37,231][1653645] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-06-15 14:19:37,347][1651596] Signal inference workers to resume experience collection... (11250 times) [2024-06-15 14:19:37,348][1653645] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-06-15 14:19:38,041][1653645] Updated weights for policy 0, policy_version 217872 (0.0100) [2024-06-15 14:19:40,974][1648982] Fps is (10 sec: 45799.4, 60 sec: 46954.4, 300 sec: 45761.6). Total num frames: 446300160. Throughput: 0: 11760.4. Samples: 111630336. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:19:40,975][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:19:41,351][1653645] Updated weights for policy 0, policy_version 217936 (0.0012) [2024-06-15 14:19:42,509][1653645] Updated weights for policy 0, policy_version 217983 (0.0014) [2024-06-15 14:19:45,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 45319.8). Total num frames: 446431232. Throughput: 0: 11776.0. Samples: 111707648. 
Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:19:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:19:47,151][1653645] Updated weights for policy 0, policy_version 218032 (0.0025) [2024-06-15 14:19:48,970][1653645] Updated weights for policy 0, policy_version 218114 (0.0097) [2024-06-15 14:19:50,022][1653645] Updated weights for policy 0, policy_version 218170 (0.0015) [2024-06-15 14:19:50,958][1648982] Fps is (10 sec: 52516.1, 60 sec: 49152.3, 300 sec: 45875.2). Total num frames: 446824448. Throughput: 0: 11787.4. Samples: 111738880. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:19:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:19:53,327][1653645] Updated weights for policy 0, policy_version 218224 (0.0012) [2024-06-15 14:19:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.4, 300 sec: 45653.0). Total num frames: 446955520. Throughput: 0: 11662.3. Samples: 111804928. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:19:55,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:19:58,253][1653645] Updated weights for policy 0, policy_version 218274 (0.0012) [2024-06-15 14:19:59,804][1653645] Updated weights for policy 0, policy_version 218358 (0.0014) [2024-06-15 14:20:00,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 48605.7, 300 sec: 45875.2). Total num frames: 447250432. Throughput: 0: 11753.3. Samples: 111875072. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:20:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:20:01,142][1653645] Updated weights for policy 0, policy_version 218401 (0.0139) [2024-06-15 14:20:04,350][1653645] Updated weights for policy 0, policy_version 218464 (0.0011) [2024-06-15 14:20:05,153][1653645] Updated weights for policy 0, policy_version 218496 (0.0024) [2024-06-15 14:20:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 447479808. 
Throughput: 0: 11673.7. Samples: 111911424. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:20:05,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:20:10,871][1653645] Updated weights for policy 0, policy_version 218566 (0.0037) [2024-06-15 14:20:10,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 45875.5, 300 sec: 45319.8). Total num frames: 447610880. Throughput: 0: 11582.6. Samples: 111985152. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:20:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:20:12,257][1653645] Updated weights for policy 0, policy_version 218624 (0.0012) [2024-06-15 14:20:13,985][1653645] Updated weights for policy 0, policy_version 218688 (0.0013) [2024-06-15 14:20:15,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.2, 300 sec: 45875.2). Total num frames: 447905792. Throughput: 0: 11411.9. Samples: 112041984. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:20:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:20:16,796][1653645] Updated weights for policy 0, policy_version 218744 (0.0115) [2024-06-15 14:20:20,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 448004096. Throughput: 0: 11332.2. Samples: 112075776. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:20:20,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 14:20:21,372][1651596] Signal inference workers to stop experience collection... (11300 times) [2024-06-15 14:20:21,404][1653645] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-06-15 14:20:21,648][1651596] Signal inference workers to resume experience collection... 
(11300 times) [2024-06-15 14:20:21,649][1653645] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-06-15 14:20:22,539][1653645] Updated weights for policy 0, policy_version 218802 (0.0023) [2024-06-15 14:20:24,205][1653645] Updated weights for policy 0, policy_version 218868 (0.0013) [2024-06-15 14:20:25,799][1653645] Updated weights for policy 0, policy_version 218928 (0.0013) [2024-06-15 14:20:25,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 45875.0, 300 sec: 45653.0). Total num frames: 448364544. Throughput: 0: 11245.3. Samples: 112136192. Policy #0 lag: (min: 31.0, avg: 162.5, max: 287.0) [2024-06-15 14:20:25,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:20:27,740][1653645] Updated weights for policy 0, policy_version 218960 (0.0038) [2024-06-15 14:20:28,894][1653645] Updated weights for policy 0, policy_version 219008 (0.0012) [2024-06-15 14:20:30,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 44782.9, 300 sec: 45653.0). Total num frames: 448528384. Throughput: 0: 10979.6. Samples: 112201728. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:20:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:20:35,443][1653645] Updated weights for policy 0, policy_version 219059 (0.0014) [2024-06-15 14:20:35,958][1648982] Fps is (10 sec: 29492.1, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 448659456. Throughput: 0: 11173.0. Samples: 112241664. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:20:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:20:37,200][1653645] Updated weights for policy 0, policy_version 219124 (0.0013) [2024-06-15 14:20:39,120][1653645] Updated weights for policy 0, policy_version 219200 (0.0012) [2024-06-15 14:20:40,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 44248.8, 300 sec: 45430.8). Total num frames: 448954368. Throughput: 0: 10797.5. Samples: 112290816. 
Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:20:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:20:41,776][1653645] Updated weights for policy 0, policy_version 219264 (0.0043) [2024-06-15 14:20:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 449052672. Throughput: 0: 10786.2. Samples: 112360448. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:20:45,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:20:47,520][1653645] Updated weights for policy 0, policy_version 219324 (0.0012) [2024-06-15 14:20:49,454][1653645] Updated weights for policy 0, policy_version 219383 (0.0137) [2024-06-15 14:20:50,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 43144.5, 300 sec: 45653.1). Total num frames: 449413120. Throughput: 0: 10626.8. Samples: 112389632. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:20:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:20:53,092][1653645] Updated weights for policy 0, policy_version 219460 (0.0012) [2024-06-15 14:20:54,036][1653645] Updated weights for policy 0, policy_version 219515 (0.0012) [2024-06-15 14:20:55,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.6, 300 sec: 45097.7). Total num frames: 449576960. Throughput: 0: 10274.1. Samples: 112447488. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:20:55,959][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 14:20:55,976][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000219520_449576960.pth... [2024-06-15 14:20:56,054][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000214208_438697984.pth [2024-06-15 14:20:59,964][1653645] Updated weights for policy 0, policy_version 219568 (0.0013) [2024-06-15 14:21:00,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 41506.2, 300 sec: 44986.6). Total num frames: 449740800. Throughput: 0: 10661.0. 
Samples: 112521728. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:21:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:21:01,430][1653645] Updated weights for policy 0, policy_version 219621 (0.0022) [2024-06-15 14:21:02,492][1651596] Signal inference workers to stop experience collection... (11350 times) [2024-06-15 14:21:02,519][1653645] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-06-15 14:21:02,720][1651596] Signal inference workers to resume experience collection... (11350 times) [2024-06-15 14:21:02,720][1653645] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-06-15 14:21:03,591][1653645] Updated weights for policy 0, policy_version 219709 (0.0120) [2024-06-15 14:21:05,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 45542.0). Total num frames: 450035712. Throughput: 0: 10410.7. Samples: 112544256. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:21:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:21:06,257][1653645] Updated weights for policy 0, policy_version 219769 (0.0087) [2024-06-15 14:21:10,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 41506.1, 300 sec: 44875.5). Total num frames: 450101248. Throughput: 0: 10626.9. Samples: 112614400. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:21:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:21:11,667][1653645] Updated weights for policy 0, policy_version 219810 (0.0095) [2024-06-15 14:21:12,923][1653645] Updated weights for policy 0, policy_version 219860 (0.0016) [2024-06-15 14:21:14,668][1653645] Updated weights for policy 0, policy_version 219922 (0.0015) [2024-06-15 14:21:15,958][1648982] Fps is (10 sec: 45873.4, 60 sec: 43144.3, 300 sec: 45541.9). Total num frames: 450494464. Throughput: 0: 10501.6. Samples: 112674304. 
Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:21:15,961][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:21:17,347][1653645] Updated weights for policy 0, policy_version 219971 (0.0012) [2024-06-15 14:21:18,499][1653645] Updated weights for policy 0, policy_version 220026 (0.0012) [2024-06-15 14:21:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 450625536. Throughput: 0: 10319.6. Samples: 112706048. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:21:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:21:23,865][1653645] Updated weights for policy 0, policy_version 220084 (0.0013) [2024-06-15 14:21:24,619][1653645] Updated weights for policy 0, policy_version 220112 (0.0013) [2024-06-15 14:21:25,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 45319.8). Total num frames: 450887680. Throughput: 0: 10945.5. Samples: 112783360. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:21:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:21:26,026][1653645] Updated weights for policy 0, policy_version 220164 (0.0011) [2024-06-15 14:21:28,907][1653645] Updated weights for policy 0, policy_version 220240 (0.0017) [2024-06-15 14:21:30,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 451149824. Throughput: 0: 10626.8. Samples: 112838656. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:21:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:21:35,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 42598.3, 300 sec: 44653.3). Total num frames: 451215360. Throughput: 0: 10831.6. Samples: 112877056. 
Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 14:21:35,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:21:35,988][1653645] Updated weights for policy 0, policy_version 220336 (0.0034) [2024-06-15 14:21:37,486][1653645] Updated weights for policy 0, policy_version 220377 (0.0019) [2024-06-15 14:21:39,249][1653645] Updated weights for policy 0, policy_version 220464 (0.0013) [2024-06-15 14:21:40,928][1653645] Updated weights for policy 0, policy_version 220512 (0.0017) [2024-06-15 14:21:40,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44237.0, 300 sec: 45542.0). Total num frames: 451608576. Throughput: 0: 10911.3. Samples: 112938496. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:21:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:21:41,684][1653645] Updated weights for policy 0, policy_version 220544 (0.0012) [2024-06-15 14:21:45,961][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 451674112. Throughput: 0: 10854.4. Samples: 113010176. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:21:45,962][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:21:48,575][1653645] Updated weights for policy 0, policy_version 220611 (0.0012) [2024-06-15 14:21:48,837][1651596] Signal inference workers to stop experience collection... (11400 times) [2024-06-15 14:21:48,909][1653645] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-06-15 14:21:49,062][1651596] Signal inference workers to resume experience collection... (11400 times) [2024-06-15 14:21:49,063][1653645] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-06-15 14:21:49,905][1653645] Updated weights for policy 0, policy_version 220675 (0.0012) [2024-06-15 14:21:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 45208.8). Total num frames: 452034560. Throughput: 0: 11241.2. Samples: 113050112. 
Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:21:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:21:51,020][1653645] Updated weights for policy 0, policy_version 220731 (0.0012) [2024-06-15 14:21:53,060][1653645] Updated weights for policy 0, policy_version 220795 (0.0012) [2024-06-15 14:21:55,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 452198400. Throughput: 0: 10968.2. Samples: 113107968. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:21:55,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:21:59,654][1653645] Updated weights for policy 0, policy_version 220834 (0.0024) [2024-06-15 14:22:00,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 452395008. Throughput: 0: 11332.4. Samples: 113184256. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:22:01,778][1653645] Updated weights for policy 0, policy_version 220928 (0.0293) [2024-06-15 14:22:03,297][1653645] Updated weights for policy 0, policy_version 220985 (0.0012) [2024-06-15 14:22:04,792][1653645] Updated weights for policy 0, policy_version 221046 (0.0012) [2024-06-15 14:22:05,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 452722688. Throughput: 0: 11150.2. Samples: 113207808. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:22:10,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 452722688. Throughput: 0: 11013.7. Samples: 113278976. 
Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:22:12,097][1653645] Updated weights for policy 0, policy_version 221107 (0.0012) [2024-06-15 14:22:14,239][1653645] Updated weights for policy 0, policy_version 221192 (0.0107) [2024-06-15 14:22:15,571][1653645] Updated weights for policy 0, policy_version 221244 (0.0172) [2024-06-15 14:22:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.9, 300 sec: 45097.6). Total num frames: 453115904. Throughput: 0: 11104.7. Samples: 113338368. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:22:17,101][1653645] Updated weights for policy 0, policy_version 221301 (0.0014) [2024-06-15 14:22:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 453246976. Throughput: 0: 10956.8. Samples: 113370112. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:22:23,469][1653645] Updated weights for policy 0, policy_version 221332 (0.0012) [2024-06-15 14:22:25,399][1653645] Updated weights for policy 0, policy_version 221411 (0.0012) [2024-06-15 14:22:25,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 43144.4, 300 sec: 44764.4). Total num frames: 453476352. Throughput: 0: 11320.8. Samples: 113447936. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:25,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:22:27,138][1653645] Updated weights for policy 0, policy_version 221488 (0.0013) [2024-06-15 14:22:27,301][1651596] Signal inference workers to stop experience collection... (11450 times) [2024-06-15 14:22:27,337][1653645] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-06-15 14:22:27,598][1651596] Signal inference workers to resume experience collection... 
(11450 times) [2024-06-15 14:22:27,598][1653645] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-06-15 14:22:29,120][1653645] Updated weights for policy 0, policy_version 221561 (0.0091) [2024-06-15 14:22:30,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 453771264. Throughput: 0: 10854.4. Samples: 113498624. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:22:35,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 453836800. Throughput: 0: 10831.6. Samples: 113537536. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:35,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:22:36,344][1653645] Updated weights for policy 0, policy_version 221632 (0.0018) [2024-06-15 14:22:38,618][1653645] Updated weights for policy 0, policy_version 221714 (0.0012) [2024-06-15 14:22:39,457][1653645] Updated weights for policy 0, policy_version 221753 (0.0011) [2024-06-15 14:22:40,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 454230016. Throughput: 0: 10808.9. Samples: 113594368. Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:22:41,219][1653645] Updated weights for policy 0, policy_version 221823 (0.0039) [2024-06-15 14:22:45,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 454295552. Throughput: 0: 10717.9. Samples: 113666560. 
Policy #0 lag: (min: 59.0, avg: 131.9, max: 315.0) [2024-06-15 14:22:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:22:48,699][1653645] Updated weights for policy 0, policy_version 221874 (0.0014) [2024-06-15 14:22:50,400][1653645] Updated weights for policy 0, policy_version 221938 (0.0013) [2024-06-15 14:22:50,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 42052.2, 300 sec: 44653.3). Total num frames: 454557696. Throughput: 0: 10990.9. Samples: 113702400. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:22:50,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 14:22:51,973][1653645] Updated weights for policy 0, policy_version 222011 (0.0014) [2024-06-15 14:22:53,905][1653645] Updated weights for policy 0, policy_version 222080 (0.0014) [2024-06-15 14:22:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44542.2). Total num frames: 454819840. Throughput: 0: 10467.5. Samples: 113750016. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:22:55,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:22:55,967][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000222080_454819840.pth... [2024-06-15 14:22:56,021][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000216896_444203008.pth [2024-06-15 14:23:00,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 41506.1, 300 sec: 44209.0). Total num frames: 454885376. Throughput: 0: 10899.9. Samples: 113828864. 
Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:23:02,224][1653645] Updated weights for policy 0, policy_version 222163 (0.0020) [2024-06-15 14:23:03,212][1653645] Updated weights for policy 0, policy_version 222224 (0.0012) [2024-06-15 14:23:04,294][1653645] Updated weights for policy 0, policy_version 222266 (0.0016) [2024-06-15 14:23:05,945][1653645] Updated weights for policy 0, policy_version 222327 (0.0013) [2024-06-15 14:23:05,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 43144.6, 300 sec: 44764.4). Total num frames: 455311360. Throughput: 0: 10808.9. Samples: 113856512. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:23:10,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 455344128. Throughput: 0: 10649.7. Samples: 113927168. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:23:12,085][1653645] Updated weights for policy 0, policy_version 222387 (0.0016) [2024-06-15 14:23:12,522][1651596] Signal inference workers to stop experience collection... (11500 times) [2024-06-15 14:23:12,575][1653645] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-06-15 14:23:12,782][1651596] Signal inference workers to resume experience collection... (11500 times) [2024-06-15 14:23:12,783][1653645] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-06-15 14:23:13,626][1653645] Updated weights for policy 0, policy_version 222450 (0.0014) [2024-06-15 14:23:14,844][1653645] Updated weights for policy 0, policy_version 222524 (0.0034) [2024-06-15 14:23:15,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 455737344. Throughput: 0: 10956.8. Samples: 113991680. 
Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:23:17,743][1653645] Updated weights for policy 0, policy_version 222584 (0.0012) [2024-06-15 14:23:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 455868416. Throughput: 0: 10786.2. Samples: 114022912. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:23:24,511][1653645] Updated weights for policy 0, policy_version 222672 (0.0110) [2024-06-15 14:23:25,608][1653645] Updated weights for policy 0, policy_version 222723 (0.0116) [2024-06-15 14:23:25,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 456163328. Throughput: 0: 11195.7. Samples: 114098176. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:25,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:23:27,112][1653645] Updated weights for policy 0, policy_version 222783 (0.0107) [2024-06-15 14:23:28,972][1653645] Updated weights for policy 0, policy_version 222841 (0.0013) [2024-06-15 14:23:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 456392704. Throughput: 0: 10911.3. Samples: 114157568. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:23:35,201][1653645] Updated weights for policy 0, policy_version 222885 (0.0012) [2024-06-15 14:23:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45329.2, 300 sec: 44320.1). Total num frames: 456556544. Throughput: 0: 11002.3. Samples: 114197504. 
Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:23:37,109][1653645] Updated weights for policy 0, policy_version 222981 (0.0013) [2024-06-15 14:23:38,630][1653645] Updated weights for policy 0, policy_version 223040 (0.0307) [2024-06-15 14:23:40,533][1653645] Updated weights for policy 0, policy_version 223104 (0.0014) [2024-06-15 14:23:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44782.9, 300 sec: 44542.2). Total num frames: 456916992. Throughput: 0: 11241.3. Samples: 114255872. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:23:45,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 44782.7, 300 sec: 44431.2). Total num frames: 456982528. Throughput: 0: 11298.1. Samples: 114337280. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:45,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:23:47,206][1653645] Updated weights for policy 0, policy_version 223170 (0.0020) [2024-06-15 14:23:48,275][1653645] Updated weights for policy 0, policy_version 223225 (0.0014) [2024-06-15 14:23:49,932][1653645] Updated weights for policy 0, policy_version 223291 (0.0041) [2024-06-15 14:23:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 44542.3). Total num frames: 457342976. Throughput: 0: 11377.8. Samples: 114368512. Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:23:51,616][1653645] Updated weights for policy 0, policy_version 223351 (0.0083) [2024-06-15 14:23:55,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 457441280. Throughput: 0: 11332.2. Samples: 114437120. 
Policy #0 lag: (min: 15.0, avg: 70.9, max: 271.0) [2024-06-15 14:23:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:23:56,755][1651596] Signal inference workers to stop experience collection... (11550 times) [2024-06-15 14:23:56,787][1653645] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-06-15 14:23:57,009][1651596] Signal inference workers to resume experience collection... (11550 times) [2024-06-15 14:23:57,010][1653645] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-06-15 14:23:57,618][1653645] Updated weights for policy 0, policy_version 223408 (0.0020) [2024-06-15 14:23:59,105][1653645] Updated weights for policy 0, policy_version 223443 (0.0013) [2024-06-15 14:24:00,204][1653645] Updated weights for policy 0, policy_version 223483 (0.0014) [2024-06-15 14:24:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 47513.6, 300 sec: 44098.0). Total num frames: 457736192. Throughput: 0: 11468.8. Samples: 114507776. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:00,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:24:02,076][1653645] Updated weights for policy 0, policy_version 223552 (0.0017) [2024-06-15 14:24:03,388][1653645] Updated weights for policy 0, policy_version 223611 (0.0012) [2024-06-15 14:24:05,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 457965568. Throughput: 0: 11320.9. Samples: 114532352. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:24:08,791][1653645] Updated weights for policy 0, policy_version 223665 (0.0036) [2024-06-15 14:24:10,856][1653645] Updated weights for policy 0, policy_version 223715 (0.0013) [2024-06-15 14:24:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 46967.4, 300 sec: 43986.9). Total num frames: 458162176. Throughput: 0: 11446.1. Samples: 114613248. 
Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:24:12,553][1653645] Updated weights for policy 0, policy_version 223776 (0.0014) [2024-06-15 14:24:14,118][1653645] Updated weights for policy 0, policy_version 223840 (0.0065) [2024-06-15 14:24:15,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 458489856. Throughput: 0: 11446.0. Samples: 114672640. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:24:20,384][1653645] Updated weights for policy 0, policy_version 223920 (0.0015) [2024-06-15 14:24:20,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 45875.0, 300 sec: 44097.9). Total num frames: 458620928. Throughput: 0: 11480.1. Samples: 114714112. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:20,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 14:24:23,040][1653645] Updated weights for policy 0, policy_version 223993 (0.0014) [2024-06-15 14:24:25,093][1653645] Updated weights for policy 0, policy_version 224048 (0.0012) [2024-06-15 14:24:25,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 458915840. Throughput: 0: 11502.9. Samples: 114773504. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:24:30,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 459014144. Throughput: 0: 11104.8. Samples: 114836992. 
Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:24:32,369][1653645] Updated weights for policy 0, policy_version 224129 (0.0013) [2024-06-15 14:24:33,818][1653645] Updated weights for policy 0, policy_version 224192 (0.0024) [2024-06-15 14:24:35,828][1653645] Updated weights for policy 0, policy_version 224255 (0.0068) [2024-06-15 14:24:35,984][1648982] Fps is (10 sec: 35949.8, 60 sec: 45309.1, 300 sec: 43985.4). Total num frames: 459276288. Throughput: 0: 11098.2. Samples: 114868224. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:35,985][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:24:37,825][1653645] Updated weights for policy 0, policy_version 224310 (0.0012) [2024-06-15 14:24:38,120][1651596] Signal inference workers to stop experience collection... (11600 times) [2024-06-15 14:24:38,162][1653645] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-06-15 14:24:38,357][1651596] Signal inference workers to resume experience collection... (11600 times) [2024-06-15 14:24:38,357][1653645] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-06-15 14:24:39,125][1653645] Updated weights for policy 0, policy_version 224368 (0.0074) [2024-06-15 14:24:40,979][1648982] Fps is (10 sec: 52318.6, 60 sec: 43675.3, 300 sec: 44428.0). Total num frames: 459538432. Throughput: 0: 10940.4. Samples: 114929664. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:40,980][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:24:44,484][1653645] Updated weights for policy 0, policy_version 224416 (0.0012) [2024-06-15 14:24:45,210][1653645] Updated weights for policy 0, policy_version 224448 (0.0014) [2024-06-15 14:24:45,958][1648982] Fps is (10 sec: 39426.0, 60 sec: 44783.2, 300 sec: 43542.6). Total num frames: 459669504. Throughput: 0: 11036.4. Samples: 115004416. 
Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:24:48,560][1653645] Updated weights for policy 0, policy_version 224544 (0.0081) [2024-06-15 14:24:49,685][1653645] Updated weights for policy 0, policy_version 224592 (0.0011) [2024-06-15 14:24:50,766][1653645] Updated weights for policy 0, policy_version 224639 (0.0012) [2024-06-15 14:24:50,958][1648982] Fps is (10 sec: 52539.2, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 460062720. Throughput: 0: 11161.6. Samples: 115034624. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:50,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 14:24:55,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.9, 300 sec: 43542.6). Total num frames: 460095488. Throughput: 0: 10911.3. Samples: 115104256. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:24:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:24:56,526][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000224688_460161024.pth... [2024-06-15 14:24:56,567][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000219520_449576960.pth [2024-06-15 14:24:56,713][1653645] Updated weights for policy 0, policy_version 224692 (0.0014) [2024-06-15 14:24:59,718][1653645] Updated weights for policy 0, policy_version 224754 (0.0013) [2024-06-15 14:25:00,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 460390400. Throughput: 0: 10968.2. Samples: 115166208. 
Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:25:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:25:01,281][1653645] Updated weights for policy 0, policy_version 224824 (0.0132) [2024-06-15 14:25:02,494][1653645] Updated weights for policy 0, policy_version 224866 (0.0013) [2024-06-15 14:25:05,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 460587008. Throughput: 0: 10717.9. Samples: 115196416. Policy #0 lag: (min: 15.0, avg: 82.7, max: 271.0) [2024-06-15 14:25:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:25:08,063][1653645] Updated weights for policy 0, policy_version 224928 (0.0019) [2024-06-15 14:25:10,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 460750848. Throughput: 0: 10854.4. Samples: 115261952. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:25:11,172][1653645] Updated weights for policy 0, policy_version 224992 (0.0012) [2024-06-15 14:25:12,715][1653645] Updated weights for policy 0, policy_version 225057 (0.0072) [2024-06-15 14:25:14,947][1653645] Updated weights for policy 0, policy_version 225123 (0.0020) [2024-06-15 14:25:15,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 461111296. Throughput: 0: 10854.3. Samples: 115325440. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:25:19,856][1653645] Updated weights for policy 0, policy_version 225185 (0.0014) [2024-06-15 14:25:20,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.8, 300 sec: 43653.7). Total num frames: 461242368. Throughput: 0: 10951.9. Samples: 115360768. 
Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:25:22,767][1653645] Updated weights for policy 0, policy_version 225248 (0.0013) [2024-06-15 14:25:23,305][1651596] Signal inference workers to stop experience collection... (11650 times) [2024-06-15 14:25:23,349][1653645] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-06-15 14:25:23,617][1651596] Signal inference workers to resume experience collection... (11650 times) [2024-06-15 14:25:23,618][1653645] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-06-15 14:25:24,343][1653645] Updated weights for policy 0, policy_version 225312 (0.0011) [2024-06-15 14:25:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 43986.8). Total num frames: 461504512. Throughput: 0: 11087.1. Samples: 115428352. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:25,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:25:26,119][1653645] Updated weights for policy 0, policy_version 225349 (0.0025) [2024-06-15 14:25:27,002][1653645] Updated weights for policy 0, policy_version 225403 (0.0012) [2024-06-15 14:25:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 461668352. Throughput: 0: 11059.2. Samples: 115502080. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:25:31,976][1653645] Updated weights for policy 0, policy_version 225472 (0.0013) [2024-06-15 14:25:35,334][1653645] Updated weights for policy 0, policy_version 225538 (0.0020) [2024-06-15 14:25:35,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 44802.7, 300 sec: 44098.0). Total num frames: 461963264. Throughput: 0: 11138.8. Samples: 115535872. 
Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:25:36,482][1653645] Updated weights for policy 0, policy_version 225595 (0.0011) [2024-06-15 14:25:40,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43705.9, 300 sec: 44431.2). Total num frames: 462159872. Throughput: 0: 11070.6. Samples: 115602432. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:40,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:25:41,949][1653645] Updated weights for policy 0, policy_version 225680 (0.0014) [2024-06-15 14:25:45,010][1653645] Updated weights for policy 0, policy_version 225744 (0.0014) [2024-06-15 14:25:45,957][1648982] Fps is (10 sec: 45876.2, 60 sec: 45875.3, 300 sec: 44098.0). Total num frames: 462422016. Throughput: 0: 11207.2. Samples: 115670528. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:25:46,628][1653645] Updated weights for policy 0, policy_version 225808 (0.0014) [2024-06-15 14:25:49,814][1653645] Updated weights for policy 0, policy_version 225888 (0.0032) [2024-06-15 14:25:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 462684160. Throughput: 0: 11252.6. Samples: 115702784. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:50,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 14:25:54,646][1653645] Updated weights for policy 0, policy_version 225952 (0.0014) [2024-06-15 14:25:55,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 462815232. Throughput: 0: 11286.8. Samples: 115769856. 
Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:25:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:25:56,654][1653645] Updated weights for policy 0, policy_version 225987 (0.0014) [2024-06-15 14:25:58,712][1653645] Updated weights for policy 0, policy_version 226053 (0.0019) [2024-06-15 14:25:59,786][1653645] Updated weights for policy 0, policy_version 226106 (0.0018) [2024-06-15 14:26:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 463077376. Throughput: 0: 11332.3. Samples: 115835392. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:26:00,958][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 14:26:02,590][1653645] Updated weights for policy 0, policy_version 226165 (0.0016) [2024-06-15 14:26:05,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 463241216. Throughput: 0: 11252.6. Samples: 115867136. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:26:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:26:06,146][1653645] Updated weights for policy 0, policy_version 226208 (0.0016) [2024-06-15 14:26:07,117][1653645] Updated weights for policy 0, policy_version 226240 (0.0011) [2024-06-15 14:26:08,890][1653645] Updated weights for policy 0, policy_version 226304 (0.0013) [2024-06-15 14:26:10,488][1651596] Signal inference workers to stop experience collection... (11700 times) [2024-06-15 14:26:10,530][1653645] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-06-15 14:26:10,731][1651596] Signal inference workers to resume experience collection... (11700 times) [2024-06-15 14:26:10,732][1653645] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-06-15 14:26:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 44098.0). Total num frames: 463503360. Throughput: 0: 11355.1. Samples: 115939328. 
Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:26:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:26:11,696][1653645] Updated weights for policy 0, policy_version 226361 (0.0012) [2024-06-15 14:26:13,814][1653645] Updated weights for policy 0, policy_version 226416 (0.0024) [2024-06-15 14:26:15,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 463732736. Throughput: 0: 11184.3. Samples: 116005376. Policy #0 lag: (min: 6.0, avg: 96.6, max: 262.0) [2024-06-15 14:26:15,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 14:26:17,655][1653645] Updated weights for policy 0, policy_version 226468 (0.0015) [2024-06-15 14:26:19,878][1653645] Updated weights for policy 0, policy_version 226515 (0.0012) [2024-06-15 14:26:20,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 463994880. Throughput: 0: 11127.5. Samples: 116036608. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:26:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:26:22,865][1653645] Updated weights for policy 0, policy_version 226576 (0.0013) [2024-06-15 14:26:24,050][1653645] Updated weights for policy 0, policy_version 226624 (0.0021) [2024-06-15 14:26:25,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 464191488. Throughput: 0: 11195.7. Samples: 116106240. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:26:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:26:26,272][1653645] Updated weights for policy 0, policy_version 226683 (0.0012) [2024-06-15 14:26:29,940][1653645] Updated weights for policy 0, policy_version 226736 (0.0013) [2024-06-15 14:26:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 44653.4). Total num frames: 464388096. Throughput: 0: 11002.3. Samples: 116165632. 
Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:26:30,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:26:32,120][1653645] Updated weights for policy 0, policy_version 226809 (0.0012) [2024-06-15 14:26:35,739][1653645] Updated weights for policy 0, policy_version 226864 (0.0013) [2024-06-15 14:26:35,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 464617472. Throughput: 0: 11104.7. Samples: 116202496. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:26:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:26:37,312][1653645] Updated weights for policy 0, policy_version 226912 (0.0012) [2024-06-15 14:26:40,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 464814080. Throughput: 0: 10990.9. Samples: 116264448. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:26:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:26:41,367][1653645] Updated weights for policy 0, policy_version 226977 (0.0014) [2024-06-15 14:26:41,889][1653645] Updated weights for policy 0, policy_version 227004 (0.0028) [2024-06-15 14:26:43,237][1653645] Updated weights for policy 0, policy_version 227056 (0.0023) [2024-06-15 14:26:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 465076224. Throughput: 0: 11332.3. Samples: 116345344. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:26:45,960][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:26:46,140][1653645] Updated weights for policy 0, policy_version 227091 (0.0012) [2024-06-15 14:26:47,028][1653645] Updated weights for policy 0, policy_version 227136 (0.0084) [2024-06-15 14:26:50,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 465305600. Throughput: 0: 11377.8. Samples: 116379136. 
Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:26:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:26:51,735][1653645] Updated weights for policy 0, policy_version 227203 (0.0014) [2024-06-15 14:26:53,660][1653645] Updated weights for policy 0, policy_version 227282 (0.0013) [2024-06-15 14:26:55,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 465567744. Throughput: 0: 11218.5. Samples: 116444160. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:26:55,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 14:26:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000227328_465567744.pth... [2024-06-15 14:26:56,023][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000222080_454819840.pth [2024-06-15 14:26:58,453][1651596] Signal inference workers to stop experience collection... (11750 times) [2024-06-15 14:26:58,465][1653645] Updated weights for policy 0, policy_version 227362 (0.0108) [2024-06-15 14:26:58,559][1653645] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-06-15 14:26:58,770][1651596] Signal inference workers to resume experience collection... (11750 times) [2024-06-15 14:26:58,771][1653645] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-06-15 14:27:00,682][1653645] Updated weights for policy 0, policy_version 227424 (0.0013) [2024-06-15 14:27:00,962][1648982] Fps is (10 sec: 45853.1, 60 sec: 44779.4, 300 sec: 44208.3). Total num frames: 465764352. Throughput: 0: 11217.4. Samples: 116510208. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:27:00,963][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:27:04,726][1653645] Updated weights for policy 0, policy_version 227511 (0.0015) [2024-06-15 14:27:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 46421.4, 300 sec: 45097.7). Total num frames: 466026496. 
Throughput: 0: 11252.6. Samples: 116542976. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:27:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:27:06,348][1653645] Updated weights for policy 0, policy_version 227574 (0.0013) [2024-06-15 14:27:10,980][1648982] Fps is (10 sec: 42522.2, 60 sec: 44766.0, 300 sec: 44316.7). Total num frames: 466190336. Throughput: 0: 11269.7. Samples: 116613632. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:27:10,981][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:27:11,142][1653645] Updated weights for policy 0, policy_version 227646 (0.0014) [2024-06-15 14:27:14,004][1653645] Updated weights for policy 0, policy_version 227708 (0.0013) [2024-06-15 14:27:15,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44237.1, 300 sec: 44542.3). Total num frames: 466386944. Throughput: 0: 11241.2. Samples: 116671488. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:27:15,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:27:16,753][1653645] Updated weights for policy 0, policy_version 227774 (0.0013) [2024-06-15 14:27:18,081][1653645] Updated weights for policy 0, policy_version 227835 (0.0012) [2024-06-15 14:27:20,958][1648982] Fps is (10 sec: 42695.2, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 466616320. Throughput: 0: 11104.7. Samples: 116702208. Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:27:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:27:23,332][1653645] Updated weights for policy 0, policy_version 227888 (0.0030) [2024-06-15 14:27:24,885][1653645] Updated weights for policy 0, policy_version 227936 (0.0023) [2024-06-15 14:27:25,960][1648982] Fps is (10 sec: 49143.2, 60 sec: 44781.7, 300 sec: 44430.9). Total num frames: 466878464. Throughput: 0: 11388.7. Samples: 116776960. 
Policy #0 lag: (min: 13.0, avg: 117.2, max: 269.0) [2024-06-15 14:27:25,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:27:27,628][1653645] Updated weights for policy 0, policy_version 228002 (0.0011) [2024-06-15 14:27:29,419][1653645] Updated weights for policy 0, policy_version 228088 (0.0013) [2024-06-15 14:27:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.1, 300 sec: 45097.7). Total num frames: 467140608. Throughput: 0: 10956.8. Samples: 116838400. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:27:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:27:35,533][1653645] Updated weights for policy 0, policy_version 228158 (0.0021) [2024-06-15 14:27:35,958][1648982] Fps is (10 sec: 39328.7, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 467271680. Throughput: 0: 11093.3. Samples: 116878336. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:27:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:27:37,216][1653645] Updated weights for policy 0, policy_version 228224 (0.0015) [2024-06-15 14:27:40,636][1653645] Updated weights for policy 0, policy_version 228290 (0.0013) [2024-06-15 14:27:40,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 45875.1, 300 sec: 44986.5). Total num frames: 467566592. Throughput: 0: 11025.0. Samples: 116940288. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:27:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:27:41,781][1653645] Updated weights for policy 0, policy_version 228345 (0.0013) [2024-06-15 14:27:45,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 467664896. Throughput: 0: 11208.3. Samples: 117014528. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:27:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:27:46,702][1651596] Signal inference workers to stop experience collection... 
(11800 times) [2024-06-15 14:27:46,736][1653645] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-06-15 14:27:46,928][1651596] Signal inference workers to resume experience collection... (11800 times) [2024-06-15 14:27:46,930][1653645] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-06-15 14:27:47,325][1653645] Updated weights for policy 0, policy_version 228400 (0.0013) [2024-06-15 14:27:49,038][1653645] Updated weights for policy 0, policy_version 228475 (0.0013) [2024-06-15 14:27:50,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 468025344. Throughput: 0: 11127.5. Samples: 117043712. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:27:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:27:51,142][1653645] Updated weights for policy 0, policy_version 228537 (0.0012) [2024-06-15 14:27:52,503][1653645] Updated weights for policy 0, policy_version 228592 (0.0012) [2024-06-15 14:27:55,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.6, 300 sec: 45097.6). Total num frames: 468189184. Throughput: 0: 11144.4. Samples: 117114880. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:27:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:27:58,217][1653645] Updated weights for policy 0, policy_version 228640 (0.0011) [2024-06-15 14:27:59,423][1653645] Updated weights for policy 0, policy_version 228704 (0.0013) [2024-06-15 14:28:00,958][1648982] Fps is (10 sec: 42596.1, 60 sec: 44786.1, 300 sec: 44542.2). Total num frames: 468451328. Throughput: 0: 11434.5. Samples: 117186048. 
Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:28:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:28:02,223][1653645] Updated weights for policy 0, policy_version 228800 (0.0015) [2024-06-15 14:28:03,756][1653645] Updated weights for policy 0, policy_version 228862 (0.0013) [2024-06-15 14:28:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 468713472. Throughput: 0: 11252.6. Samples: 117208576. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:28:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:28:10,783][1653645] Updated weights for policy 0, policy_version 228914 (0.0012) [2024-06-15 14:28:10,958][1648982] Fps is (10 sec: 36045.9, 60 sec: 43707.0, 300 sec: 44320.1). Total num frames: 468811776. Throughput: 0: 11412.3. Samples: 117290496. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:28:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:28:12,239][1653645] Updated weights for policy 0, policy_version 228982 (0.0051) [2024-06-15 14:28:14,265][1653645] Updated weights for policy 0, policy_version 229048 (0.0012) [2024-06-15 14:28:15,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 469204992. Throughput: 0: 11116.1. Samples: 117338624. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:28:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:28:16,073][1653645] Updated weights for policy 0, policy_version 229113 (0.0023) [2024-06-15 14:28:20,962][1648982] Fps is (10 sec: 42580.2, 60 sec: 43687.4, 300 sec: 44319.5). Total num frames: 469237760. Throughput: 0: 11024.0. Samples: 117374464. 
Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:28:20,963][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:28:23,369][1653645] Updated weights for policy 0, policy_version 229168 (0.0012) [2024-06-15 14:28:25,023][1653645] Updated weights for policy 0, policy_version 229238 (0.0015) [2024-06-15 14:28:25,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 44237.9, 300 sec: 44542.2). Total num frames: 469532672. Throughput: 0: 11173.0. Samples: 117443072. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:28:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:28:26,281][1651596] Signal inference workers to stop experience collection... (11850 times) [2024-06-15 14:28:26,346][1653645] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-06-15 14:28:26,648][1651596] Signal inference workers to resume experience collection... (11850 times) [2024-06-15 14:28:26,649][1653645] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-06-15 14:28:26,651][1653645] Updated weights for policy 0, policy_version 229296 (0.0011) [2024-06-15 14:28:28,193][1653645] Updated weights for policy 0, policy_version 229361 (0.0012) [2024-06-15 14:28:30,958][1648982] Fps is (10 sec: 52452.7, 60 sec: 43690.8, 300 sec: 44764.4). Total num frames: 469762048. Throughput: 0: 10945.4. Samples: 117507072. Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:28:30,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:28:35,016][1653645] Updated weights for policy 0, policy_version 229411 (0.0041) [2024-06-15 14:28:35,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 469893120. Throughput: 0: 11207.1. Samples: 117548032. 
Policy #0 lag: (min: 15.0, avg: 126.9, max: 271.0) [2024-06-15 14:28:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:28:37,486][1653645] Updated weights for policy 0, policy_version 229508 (0.0107) [2024-06-15 14:28:38,807][1653645] Updated weights for policy 0, policy_version 229565 (0.0011) [2024-06-15 14:28:40,502][1653645] Updated weights for policy 0, policy_version 229623 (0.0015) [2024-06-15 14:28:40,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 470286336. Throughput: 0: 10797.5. Samples: 117600768. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:28:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:28:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 470286336. Throughput: 0: 10922.7. Samples: 117677568. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:28:45,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:28:47,799][1653645] Updated weights for policy 0, policy_version 229700 (0.0014) [2024-06-15 14:28:48,955][1653645] Updated weights for policy 0, policy_version 229757 (0.0023) [2024-06-15 14:28:50,610][1653645] Updated weights for policy 0, policy_version 229824 (0.0012) [2024-06-15 14:28:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 470679552. Throughput: 0: 10990.9. Samples: 117703168. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:28:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:28:52,035][1653645] Updated weights for policy 0, policy_version 229883 (0.0013) [2024-06-15 14:28:55,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 470810624. Throughput: 0: 10672.4. Samples: 117770752. 
Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:28:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:28:55,967][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000229888_470810624.pth... [2024-06-15 14:28:56,033][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000224688_460161024.pth [2024-06-15 14:28:58,365][1653645] Updated weights for policy 0, policy_version 229947 (0.0015) [2024-06-15 14:28:59,576][1653645] Updated weights for policy 0, policy_version 229986 (0.0012) [2024-06-15 14:29:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43691.0, 300 sec: 44431.2). Total num frames: 471072768. Throughput: 0: 11150.2. Samples: 117840384. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:29:01,958][1653645] Updated weights for policy 0, policy_version 230048 (0.0016) [2024-06-15 14:29:04,104][1653645] Updated weights for policy 0, policy_version 230128 (0.0012) [2024-06-15 14:29:05,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 471334912. Throughput: 0: 11003.4. Samples: 117869568. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:29:09,379][1653645] Updated weights for policy 0, policy_version 230176 (0.0012) [2024-06-15 14:29:10,846][1651596] Signal inference workers to stop experience collection... (11900 times) [2024-06-15 14:29:10,914][1653645] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-06-15 14:29:10,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44783.1, 300 sec: 44098.0). Total num frames: 471498752. Throughput: 0: 11082.0. Samples: 117941760. 
Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:29:11,073][1651596] Signal inference workers to resume experience collection... (11900 times) [2024-06-15 14:29:11,074][1653645] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-06-15 14:29:11,453][1653645] Updated weights for policy 0, policy_version 230256 (0.0015) [2024-06-15 14:29:14,883][1653645] Updated weights for policy 0, policy_version 230336 (0.0012) [2024-06-15 14:29:15,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 44653.4). Total num frames: 471793664. Throughput: 0: 10865.8. Samples: 117996032. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:29:20,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 43693.8, 300 sec: 43875.8). Total num frames: 471859200. Throughput: 0: 10695.1. Samples: 118029312. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:29:21,541][1653645] Updated weights for policy 0, policy_version 230416 (0.0023) [2024-06-15 14:29:23,310][1653645] Updated weights for policy 0, policy_version 230465 (0.0013) [2024-06-15 14:29:24,726][1653645] Updated weights for policy 0, policy_version 230524 (0.0013) [2024-06-15 14:29:25,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 472154112. Throughput: 0: 11127.5. Samples: 118101504. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:29:26,820][1653645] Updated weights for policy 0, policy_version 230577 (0.0013) [2024-06-15 14:29:28,473][1653645] Updated weights for policy 0, policy_version 230648 (0.0012) [2024-06-15 14:29:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.5, 300 sec: 44435.2). 
Total num frames: 472383488. Throughput: 0: 10763.4. Samples: 118161920. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:30,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:29:34,663][1653645] Updated weights for policy 0, policy_version 230716 (0.0012) [2024-06-15 14:29:35,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.8, 300 sec: 43990.0). Total num frames: 472514560. Throughput: 0: 11082.0. Samples: 118201856. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:29:37,213][1653645] Updated weights for policy 0, policy_version 230780 (0.0088) [2024-06-15 14:29:38,840][1653645] Updated weights for policy 0, policy_version 230832 (0.0013) [2024-06-15 14:29:40,786][1653645] Updated weights for policy 0, policy_version 230912 (0.0012) [2024-06-15 14:29:40,960][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 472907776. Throughput: 0: 10786.2. Samples: 118256128. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:40,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:29:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 472907776. Throughput: 0: 10752.0. Samples: 118324224. Policy #0 lag: (min: 89.0, avg: 143.8, max: 345.0) [2024-06-15 14:29:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:29:49,076][1653645] Updated weights for policy 0, policy_version 230977 (0.0039) [2024-06-15 14:29:50,287][1653645] Updated weights for policy 0, policy_version 231031 (0.0113) [2024-06-15 14:29:50,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 42052.3, 300 sec: 44431.2). Total num frames: 473202688. Throughput: 0: 10729.2. Samples: 118352384. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:29:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:29:52,495][1653645] Updated weights for policy 0, policy_version 231125 (0.0013) [2024-06-15 14:29:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 473432064. Throughput: 0: 10376.5. Samples: 118408704. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:29:55,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 14:29:58,053][1653645] Updated weights for policy 0, policy_version 231172 (0.0013) [2024-06-15 14:29:58,475][1651596] Signal inference workers to stop experience collection... (11950 times) [2024-06-15 14:29:58,543][1653645] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-06-15 14:29:58,784][1651596] Signal inference workers to resume experience collection... (11950 times) [2024-06-15 14:29:58,786][1653645] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-06-15 14:30:00,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 473563136. Throughput: 0: 10763.3. Samples: 118480384. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:30:01,431][1653645] Updated weights for policy 0, policy_version 231236 (0.0128) [2024-06-15 14:30:02,864][1653645] Updated weights for policy 0, policy_version 231312 (0.0013) [2024-06-15 14:30:04,544][1653645] Updated weights for policy 0, policy_version 231377 (0.0017) [2024-06-15 14:30:05,488][1653645] Updated weights for policy 0, policy_version 231422 (0.0012) [2024-06-15 14:30:05,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.5, 300 sec: 44764.4). Total num frames: 473956352. Throughput: 0: 10706.5. Samples: 118511104. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:05,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:30:10,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 42052.1, 300 sec: 43764.7). Total num frames: 474021888. Throughput: 0: 10786.1. Samples: 118586880. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:30:11,467][1653645] Updated weights for policy 0, policy_version 231483 (0.0012) [2024-06-15 14:30:13,414][1653645] Updated weights for policy 0, policy_version 231536 (0.0015) [2024-06-15 14:30:14,592][1653645] Updated weights for policy 0, policy_version 231584 (0.0012) [2024-06-15 14:30:15,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 474415104. Throughput: 0: 10752.0. Samples: 118645760. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:30:16,479][1653645] Updated weights for policy 0, policy_version 231676 (0.0018) [2024-06-15 14:30:20,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 474480640. Throughput: 0: 10774.8. Samples: 118686720. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:30:22,610][1653645] Updated weights for policy 0, policy_version 231737 (0.0015) [2024-06-15 14:30:24,097][1653645] Updated weights for policy 0, policy_version 231776 (0.0198) [2024-06-15 14:30:25,510][1653645] Updated weights for policy 0, policy_version 231840 (0.0017) [2024-06-15 14:30:25,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 44236.6, 300 sec: 44542.2). Total num frames: 474808320. Throughput: 0: 11184.3. Samples: 118759424. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:30:27,576][1653645] Updated weights for policy 0, policy_version 231932 (0.0117) [2024-06-15 14:30:30,974][1648982] Fps is (10 sec: 52341.3, 60 sec: 43678.7, 300 sec: 44206.5). Total num frames: 475004928. Throughput: 0: 11203.0. Samples: 118828544. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:30,975][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:30:34,160][1653645] Updated weights for policy 0, policy_version 231994 (0.0014) [2024-06-15 14:30:35,580][1653645] Updated weights for policy 0, policy_version 232049 (0.0014) [2024-06-15 14:30:35,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 475267072. Throughput: 0: 11548.4. Samples: 118872064. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:35,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 14:30:36,223][1651596] Signal inference workers to stop experience collection... (12000 times) [2024-06-15 14:30:36,262][1653645] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-06-15 14:30:36,507][1651596] Signal inference workers to resume experience collection... (12000 times) [2024-06-15 14:30:36,509][1653645] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-06-15 14:30:37,252][1653645] Updated weights for policy 0, policy_version 232128 (0.0013) [2024-06-15 14:30:40,958][1648982] Fps is (10 sec: 52515.0, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 475529216. Throughput: 0: 11446.0. Samples: 118923776. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:40,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:30:44,384][1653645] Updated weights for policy 0, policy_version 232193 (0.0013) [2024-06-15 14:30:45,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 45875.1, 300 sec: 43986.8). 
Total num frames: 475660288. Throughput: 0: 11628.0. Samples: 119003648. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:30:46,140][1653645] Updated weights for policy 0, policy_version 232264 (0.0067) [2024-06-15 14:30:47,758][1653645] Updated weights for policy 0, policy_version 232322 (0.0013) [2024-06-15 14:30:49,341][1653645] Updated weights for policy 0, policy_version 232386 (0.0045) [2024-06-15 14:30:50,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 47513.6, 300 sec: 44875.5). Total num frames: 476053504. Throughput: 0: 11537.1. Samples: 119030272. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:30:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 476053504. Throughput: 0: 11537.0. Samples: 119106048. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 14:30:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:30:56,039][1653645] Updated weights for policy 0, policy_version 232464 (0.0118) [2024-06-15 14:30:56,361][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000232480_476119040.pth... [2024-06-15 14:30:56,540][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000227328_465567744.pth [2024-06-15 14:30:56,544][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000232480_476119040.pth [2024-06-15 14:30:58,496][1653645] Updated weights for policy 0, policy_version 232530 (0.0014) [2024-06-15 14:31:00,432][1653645] Updated weights for policy 0, policy_version 232610 (0.0241) [2024-06-15 14:31:00,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 47513.4, 300 sec: 44653.3). Total num frames: 476413952. Throughput: 0: 11593.9. Samples: 119167488. 
Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:00,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:31:02,551][1653645] Updated weights for policy 0, policy_version 232692 (0.0019) [2024-06-15 14:31:05,958][1648982] Fps is (10 sec: 52430.9, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 476577792. Throughput: 0: 11366.4. Samples: 119198208. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:31:08,053][1653645] Updated weights for policy 0, policy_version 232765 (0.0014) [2024-06-15 14:31:10,862][1653645] Updated weights for policy 0, policy_version 232819 (0.0017) [2024-06-15 14:31:10,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 46421.3, 300 sec: 44320.1). Total num frames: 476807168. Throughput: 0: 11514.3. Samples: 119277568. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:31:12,594][1653645] Updated weights for policy 0, policy_version 232883 (0.0012) [2024-06-15 14:31:14,192][1653645] Updated weights for policy 0, policy_version 232952 (0.0120) [2024-06-15 14:31:15,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 477102080. Throughput: 0: 11188.5. Samples: 119331840. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:31:19,577][1653645] Updated weights for policy 0, policy_version 232992 (0.0013) [2024-06-15 14:31:20,970][1648982] Fps is (10 sec: 42545.9, 60 sec: 45865.6, 300 sec: 44207.2). Total num frames: 477233152. Throughput: 0: 11078.9. Samples: 119370752. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:20,971][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:31:21,260][1651596] Signal inference workers to stop experience collection... 
(12050 times) [2024-06-15 14:31:21,341][1653645] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-06-15 14:31:21,459][1651596] Signal inference workers to resume experience collection... (12050 times) [2024-06-15 14:31:21,460][1653645] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-06-15 14:31:21,462][1653645] Updated weights for policy 0, policy_version 233040 (0.0013) [2024-06-15 14:31:23,682][1653645] Updated weights for policy 0, policy_version 233120 (0.0013) [2024-06-15 14:31:25,319][1653645] Updated weights for policy 0, policy_version 233184 (0.0023) [2024-06-15 14:31:25,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 46421.5, 300 sec: 44764.4). Total num frames: 477593600. Throughput: 0: 11229.9. Samples: 119429120. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 14:31:30,791][1653645] Updated weights for policy 0, policy_version 233232 (0.0013) [2024-06-15 14:31:30,958][1648982] Fps is (10 sec: 42651.9, 60 sec: 44249.1, 300 sec: 44209.0). Total num frames: 477659136. Throughput: 0: 11127.5. Samples: 119504384. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:30,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:31:33,335][1653645] Updated weights for policy 0, policy_version 233296 (0.0028) [2024-06-15 14:31:35,685][1653645] Updated weights for policy 0, policy_version 233381 (0.0104) [2024-06-15 14:31:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45328.9, 300 sec: 44653.3). Total num frames: 477986816. Throughput: 0: 11332.2. Samples: 119540224. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:35,960][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 14:31:37,621][1653645] Updated weights for policy 0, policy_version 233432 (0.0022) [2024-06-15 14:31:40,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 478150656. 
Throughput: 0: 10865.8. Samples: 119595008. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:40,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 14:31:42,527][1653645] Updated weights for policy 0, policy_version 233488 (0.0014) [2024-06-15 14:31:43,338][1653645] Updated weights for policy 0, policy_version 233529 (0.0012) [2024-06-15 14:31:45,527][1653645] Updated weights for policy 0, policy_version 233584 (0.0014) [2024-06-15 14:31:45,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 45875.4, 300 sec: 44431.2). Total num frames: 478412800. Throughput: 0: 11173.0. Samples: 119670272. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:31:47,493][1653645] Updated weights for policy 0, policy_version 233632 (0.0035) [2024-06-15 14:31:49,042][1653645] Updated weights for policy 0, policy_version 233696 (0.0016) [2024-06-15 14:31:50,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 478674944. Throughput: 0: 11138.8. Samples: 119699456. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:31:55,169][1653645] Updated weights for policy 0, policy_version 233782 (0.0014) [2024-06-15 14:31:55,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 45875.3, 300 sec: 44209.7). Total num frames: 478806016. Throughput: 0: 11002.3. Samples: 119772672. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:31:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:31:58,072][1653645] Updated weights for policy 0, policy_version 233850 (0.0013) [2024-06-15 14:31:59,828][1653645] Updated weights for policy 0, policy_version 233918 (0.0012) [2024-06-15 14:32:00,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45329.3, 300 sec: 44431.2). Total num frames: 479133696. Throughput: 0: 11047.8. Samples: 119828992. 
Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:32:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:32:01,553][1653645] Updated weights for policy 0, policy_version 233980 (0.0014) [2024-06-15 14:32:05,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.6, 300 sec: 44101.3). Total num frames: 479199232. Throughput: 0: 10948.5. Samples: 119863296. Policy #0 lag: (min: 1.0, avg: 58.1, max: 257.0) [2024-06-15 14:32:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:32:06,702][1651596] Signal inference workers to stop experience collection... (12100 times) [2024-06-15 14:32:06,742][1653645] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-06-15 14:32:06,914][1651596] Signal inference workers to resume experience collection... (12100 times) [2024-06-15 14:32:06,917][1653645] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-06-15 14:32:07,475][1653645] Updated weights for policy 0, policy_version 234039 (0.0012) [2024-06-15 14:32:10,047][1653645] Updated weights for policy 0, policy_version 234096 (0.0013) [2024-06-15 14:32:10,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 479494144. Throughput: 0: 11298.2. Samples: 119937536. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:32:11,857][1653645] Updated weights for policy 0, policy_version 234176 (0.0013) [2024-06-15 14:32:13,316][1653645] Updated weights for policy 0, policy_version 234234 (0.0014) [2024-06-15 14:32:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 479723520. Throughput: 0: 10934.0. Samples: 119996416. 
Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:32:19,079][1653645] Updated weights for policy 0, policy_version 234288 (0.0012) [2024-06-15 14:32:20,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43699.8, 300 sec: 43987.1). Total num frames: 479854592. Throughput: 0: 11002.3. Samples: 120035328. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:20,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:32:21,178][1653645] Updated weights for policy 0, policy_version 234307 (0.0011) [2024-06-15 14:32:22,691][1653645] Updated weights for policy 0, policy_version 234374 (0.0011) [2024-06-15 14:32:25,112][1653645] Updated weights for policy 0, policy_version 234480 (0.0106) [2024-06-15 14:32:25,958][1648982] Fps is (10 sec: 52425.8, 60 sec: 44236.5, 300 sec: 44431.1). Total num frames: 480247808. Throughput: 0: 11116.0. Samples: 120095232. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:32:30,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 480346112. Throughput: 0: 11093.3. Samples: 120169472. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:32:31,116][1653645] Updated weights for policy 0, policy_version 234556 (0.0107) [2024-06-15 14:32:35,078][1653645] Updated weights for policy 0, policy_version 234656 (0.0070) [2024-06-15 14:32:35,958][1648982] Fps is (10 sec: 39323.8, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 480641024. Throughput: 0: 11184.4. Samples: 120202752. 
Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:35,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 14:32:36,849][1653645] Updated weights for policy 0, policy_version 234746 (0.0015) [2024-06-15 14:32:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 480772096. Throughput: 0: 10808.9. Samples: 120259072. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:32:43,728][1653645] Updated weights for policy 0, policy_version 234789 (0.0043) [2024-06-15 14:32:45,843][1653645] Updated weights for policy 0, policy_version 234867 (0.0017) [2024-06-15 14:32:45,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 481001472. Throughput: 0: 11059.2. Samples: 120326656. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:32:46,991][1651596] Signal inference workers to stop experience collection... (12150 times) [2024-06-15 14:32:47,083][1653645] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-06-15 14:32:47,085][1653645] Updated weights for policy 0, policy_version 234919 (0.0011) [2024-06-15 14:32:47,180][1651596] Signal inference workers to resume experience collection... (12150 times) [2024-06-15 14:32:47,180][1653645] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-06-15 14:32:48,196][1653645] Updated weights for policy 0, policy_version 234979 (0.0068) [2024-06-15 14:32:50,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 481296384. Throughput: 0: 10968.1. Samples: 120356864. 
Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:50,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:32:55,668][1653645] Updated weights for policy 0, policy_version 235042 (0.0013) [2024-06-15 14:32:55,958][1648982] Fps is (10 sec: 39319.4, 60 sec: 43144.3, 300 sec: 43875.8). Total num frames: 481394688. Throughput: 0: 11081.8. Samples: 120436224. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:32:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:32:56,168][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000235072_481427456.pth... [2024-06-15 14:32:56,309][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000229888_470810624.pth [2024-06-15 14:32:56,754][1653645] Updated weights for policy 0, policy_version 235089 (0.0030) [2024-06-15 14:32:58,138][1653645] Updated weights for policy 0, policy_version 235152 (0.0012) [2024-06-15 14:33:00,044][1653645] Updated weights for policy 0, policy_version 235235 (0.0126) [2024-06-15 14:33:00,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 481820672. Throughput: 0: 10990.9. Samples: 120491008. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:33:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:33:05,958][1648982] Fps is (10 sec: 42600.5, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 481820672. Throughput: 0: 10979.6. Samples: 120529408. 
Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:33:05,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:33:07,334][1653645] Updated weights for policy 0, policy_version 235281 (0.0012) [2024-06-15 14:33:08,791][1653645] Updated weights for policy 0, policy_version 235344 (0.0036) [2024-06-15 14:33:10,636][1653645] Updated weights for policy 0, policy_version 235425 (0.0142) [2024-06-15 14:33:10,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 482181120. Throughput: 0: 11082.1. Samples: 120593920. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:33:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:33:11,899][1653645] Updated weights for policy 0, policy_version 235475 (0.0011) [2024-06-15 14:33:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44431.9). Total num frames: 482344960. Throughput: 0: 10808.9. Samples: 120655872. Policy #0 lag: (min: 15.0, avg: 96.0, max: 271.0) [2024-06-15 14:33:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:33:19,428][1653645] Updated weights for policy 0, policy_version 235540 (0.0014) [2024-06-15 14:33:20,136][1653645] Updated weights for policy 0, policy_version 235584 (0.0013) [2024-06-15 14:33:20,958][1648982] Fps is (10 sec: 29491.0, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 482476032. Throughput: 0: 10934.0. Samples: 120694784. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:33:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:33:22,052][1653645] Updated weights for policy 0, policy_version 235634 (0.0013) [2024-06-15 14:33:23,982][1653645] Updated weights for policy 0, policy_version 235712 (0.0013) [2024-06-15 14:33:25,077][1653645] Updated weights for policy 0, policy_version 235765 (0.0013) [2024-06-15 14:33:25,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.9, 300 sec: 44431.1). Total num frames: 482869248. 
Throughput: 0: 10968.1. Samples: 120752640. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:33:25,961][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:33:30,571][1651596] Signal inference workers to stop experience collection... (12200 times) [2024-06-15 14:33:30,611][1653645] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-06-15 14:33:30,758][1651596] Signal inference workers to resume experience collection... (12200 times) [2024-06-15 14:33:30,760][1653645] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-06-15 14:33:30,762][1653645] Updated weights for policy 0, policy_version 235808 (0.0015) [2024-06-15 14:33:30,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 44209.1). Total num frames: 482934784. Throughput: 0: 11173.0. Samples: 120829440. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:33:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:33:33,120][1653645] Updated weights for policy 0, policy_version 235872 (0.0018) [2024-06-15 14:33:34,648][1653645] Updated weights for policy 0, policy_version 235936 (0.0011) [2024-06-15 14:33:35,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 483295232. Throughput: 0: 11229.9. Samples: 120862208. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:33:35,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:33:36,743][1653645] Updated weights for policy 0, policy_version 236025 (0.0013) [2024-06-15 14:33:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 483393536. Throughput: 0: 10740.7. Samples: 120919552. 
Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:33:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:33:42,691][1653645] Updated weights for policy 0, policy_version 236070 (0.0015) [2024-06-15 14:33:44,958][1653645] Updated weights for policy 0, policy_version 236112 (0.0013) [2024-06-15 14:33:45,959][1648982] Fps is (10 sec: 32765.1, 60 sec: 43690.0, 300 sec: 43875.7). Total num frames: 483622912. Throughput: 0: 11206.9. Samples: 120995328. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:33:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:33:47,183][1653645] Updated weights for policy 0, policy_version 236208 (0.0012) [2024-06-15 14:33:48,820][1653645] Updated weights for policy 0, policy_version 236272 (0.0013) [2024-06-15 14:33:50,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 483917824. Throughput: 0: 10808.9. Samples: 121015808. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:33:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:33:55,238][1653645] Updated weights for policy 0, policy_version 236321 (0.0023) [2024-06-15 14:33:55,958][1648982] Fps is (10 sec: 42602.4, 60 sec: 44237.2, 300 sec: 43986.9). Total num frames: 484048896. Throughput: 0: 11013.7. Samples: 121089536. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:33:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:33:55,959][1653645] Updated weights for policy 0, policy_version 236352 (0.0052) [2024-06-15 14:33:58,467][1653645] Updated weights for policy 0, policy_version 236416 (0.0116) [2024-06-15 14:34:00,560][1653645] Updated weights for policy 0, policy_version 236497 (0.0043) [2024-06-15 14:34:00,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 484376576. Throughput: 0: 10831.6. Samples: 121143296. 
Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:34:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:34:05,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 484442112. Throughput: 0: 10763.4. Samples: 121179136. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:34:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:34:06,594][1653645] Updated weights for policy 0, policy_version 236549 (0.0017) [2024-06-15 14:34:07,753][1653645] Updated weights for policy 0, policy_version 236607 (0.0082) [2024-06-15 14:34:10,285][1653645] Updated weights for policy 0, policy_version 236688 (0.0140) [2024-06-15 14:34:10,877][1651596] Signal inference workers to stop experience collection... (12250 times) [2024-06-15 14:34:10,929][1653645] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-06-15 14:34:10,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43144.3, 300 sec: 43986.8). Total num frames: 484769792. Throughput: 0: 11104.7. Samples: 121252352. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:34:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:34:11,086][1651596] Signal inference workers to resume experience collection... (12250 times) [2024-06-15 14:34:11,087][1653645] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-06-15 14:34:12,355][1653645] Updated weights for policy 0, policy_version 236770 (0.0200) [2024-06-15 14:34:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 484966400. Throughput: 0: 10683.7. Samples: 121310208. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:34:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:34:19,199][1653645] Updated weights for policy 0, policy_version 236820 (0.0014) [2024-06-15 14:34:20,958][1648982] Fps is (10 sec: 36046.0, 60 sec: 44236.9, 300 sec: 43986.9). 
Total num frames: 485130240. Throughput: 0: 10877.2. Samples: 121351680. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:34:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:34:21,121][1653645] Updated weights for policy 0, policy_version 236886 (0.0013) [2024-06-15 14:34:23,461][1653645] Updated weights for policy 0, policy_version 236981 (0.0013) [2024-06-15 14:34:25,020][1653645] Updated weights for policy 0, policy_version 237056 (0.0013) [2024-06-15 14:34:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 485490688. Throughput: 0: 10729.2. Samples: 121402368. Policy #0 lag: (min: 15.0, avg: 77.2, max: 271.0) [2024-06-15 14:34:25,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:34:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 485523456. Throughput: 0: 10740.8. Samples: 121478656. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:34:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:34:31,857][1653645] Updated weights for policy 0, policy_version 237120 (0.0018) [2024-06-15 14:34:35,200][1653645] Updated weights for policy 0, policy_version 237216 (0.0013) [2024-06-15 14:34:35,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 42598.2, 300 sec: 43875.8). Total num frames: 485851136. Throughput: 0: 11047.7. Samples: 121512960. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:34:35,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:34:36,601][1653645] Updated weights for policy 0, policy_version 237267 (0.0010) [2024-06-15 14:34:37,541][1653645] Updated weights for policy 0, policy_version 237311 (0.0012) [2024-06-15 14:34:41,002][1648982] Fps is (10 sec: 48932.8, 60 sec: 43658.1, 300 sec: 44424.4). Total num frames: 486014976. Throughput: 0: 10639.0. Samples: 121568768. 
Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:34:41,003][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:34:43,259][1653645] Updated weights for policy 0, policy_version 237346 (0.0012) [2024-06-15 14:34:45,589][1653645] Updated weights for policy 0, policy_version 237395 (0.0014) [2024-06-15 14:34:45,958][1648982] Fps is (10 sec: 36046.2, 60 sec: 43145.2, 300 sec: 44098.0). Total num frames: 486211584. Throughput: 0: 11082.0. Samples: 121641984. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:34:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:34:47,451][1653645] Updated weights for policy 0, policy_version 237472 (0.0013) [2024-06-15 14:34:49,007][1653645] Updated weights for policy 0, policy_version 237536 (0.0012) [2024-06-15 14:34:50,958][1648982] Fps is (10 sec: 52662.6, 60 sec: 43690.3, 300 sec: 44431.1). Total num frames: 486539264. Throughput: 0: 10774.7. Samples: 121664000. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:34:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:34:55,134][1653645] Updated weights for policy 0, policy_version 237573 (0.0046) [2024-06-15 14:34:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 486604800. Throughput: 0: 10729.3. Samples: 121735168. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:34:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:34:56,304][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000237632_486670336.pth... [2024-06-15 14:34:56,345][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000232480_476119040.pth [2024-06-15 14:34:57,340][1651596] Signal inference workers to stop experience collection... 
(12300 times) [2024-06-15 14:34:57,411][1653645] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-06-15 14:34:57,631][1651596] Signal inference workers to resume experience collection... (12300 times) [2024-06-15 14:34:57,633][1653645] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-06-15 14:34:57,749][1653645] Updated weights for policy 0, policy_version 237649 (0.0013) [2024-06-15 14:34:59,155][1653645] Updated weights for policy 0, policy_version 237712 (0.0011) [2024-06-15 14:35:00,958][1648982] Fps is (10 sec: 45877.2, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 486998016. Throughput: 0: 10774.8. Samples: 121795072. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:35:01,037][1653645] Updated weights for policy 0, policy_version 237794 (0.0011) [2024-06-15 14:35:05,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 487063552. Throughput: 0: 10672.4. Samples: 121831936. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:35:08,082][1653645] Updated weights for policy 0, policy_version 237856 (0.0011) [2024-06-15 14:35:10,279][1653645] Updated weights for policy 0, policy_version 237907 (0.0035) [2024-06-15 14:35:10,958][1648982] Fps is (10 sec: 29491.0, 60 sec: 42052.5, 300 sec: 43653.6). Total num frames: 487292928. Throughput: 0: 11047.8. Samples: 121899520. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:35:12,383][1653645] Updated weights for policy 0, policy_version 238003 (0.0014) [2024-06-15 14:35:14,163][1653645] Updated weights for policy 0, policy_version 238076 (0.0025) [2024-06-15 14:35:15,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 487587840. 
Throughput: 0: 10467.5. Samples: 121949696. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:15,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 14:35:20,867][1653645] Updated weights for policy 0, policy_version 238143 (0.0011) [2024-06-15 14:35:20,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 43764.7). Total num frames: 487718912. Throughput: 0: 10558.6. Samples: 121988096. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:20,959][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 14:35:22,977][1653645] Updated weights for policy 0, policy_version 238209 (0.0015) [2024-06-15 14:35:25,450][1653645] Updated weights for policy 0, policy_version 238304 (0.0014) [2024-06-15 14:35:25,957][1648982] Fps is (10 sec: 49153.6, 60 sec: 43144.7, 300 sec: 44322.6). Total num frames: 488079360. Throughput: 0: 10751.4. Samples: 122052096. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:35:30,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 488112128. Throughput: 0: 10581.3. Samples: 122118144. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:35:32,771][1653645] Updated weights for policy 0, policy_version 238368 (0.0075) [2024-06-15 14:35:34,221][1653645] Updated weights for policy 0, policy_version 238416 (0.0150) [2024-06-15 14:35:35,988][1648982] Fps is (10 sec: 29401.9, 60 sec: 42031.3, 300 sec: 43538.1). Total num frames: 488374272. Throughput: 0: 10915.4. Samples: 122155520. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:35,988][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:35:37,076][1653645] Updated weights for policy 0, policy_version 238515 (0.0274) [2024-06-15 14:35:37,455][1651596] Signal inference workers to stop experience collection... 
(12350 times) [2024-06-15 14:35:37,510][1653645] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-06-15 14:35:37,876][1651596] Signal inference workers to resume experience collection... (12350 times) [2024-06-15 14:35:37,877][1653645] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-06-15 14:35:38,932][1653645] Updated weights for policy 0, policy_version 238592 (0.0018) [2024-06-15 14:35:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43723.2, 300 sec: 43986.9). Total num frames: 488636416. Throughput: 0: 10319.6. Samples: 122199552. Policy #0 lag: (min: 3.0, avg: 63.3, max: 259.0) [2024-06-15 14:35:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:35:45,958][1648982] Fps is (10 sec: 36153.2, 60 sec: 42052.1, 300 sec: 42987.1). Total num frames: 488734720. Throughput: 0: 10763.3. Samples: 122279424. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:35:45,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:35:46,094][1653645] Updated weights for policy 0, policy_version 238656 (0.0012) [2024-06-15 14:35:49,632][1653645] Updated weights for policy 0, policy_version 238755 (0.0089) [2024-06-15 14:35:50,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 42052.6, 300 sec: 44098.0). Total num frames: 489062400. Throughput: 0: 10547.2. Samples: 122306560. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:35:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:35:51,334][1653645] Updated weights for policy 0, policy_version 238818 (0.0012) [2024-06-15 14:35:55,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 43209.4). Total num frames: 489160704. Throughput: 0: 10365.2. Samples: 122365952. 
Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:35:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:35:58,406][1653645] Updated weights for policy 0, policy_version 238907 (0.0014) [2024-06-15 14:36:00,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 39867.7, 300 sec: 43431.5). Total num frames: 489390080. Throughput: 0: 10763.4. Samples: 122434048. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:00,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:36:01,590][1653645] Updated weights for policy 0, policy_version 238978 (0.0014) [2024-06-15 14:36:03,595][1653645] Updated weights for policy 0, policy_version 239056 (0.0015) [2024-06-15 14:36:04,962][1653645] Updated weights for policy 0, policy_version 239101 (0.0012) [2024-06-15 14:36:05,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 489684992. Throughput: 0: 10410.8. Samples: 122456576. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:36:10,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 40413.8, 300 sec: 42765.0). Total num frames: 489717760. Throughput: 0: 10501.6. Samples: 122524672. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:36:11,794][1653645] Updated weights for policy 0, policy_version 239158 (0.0014) [2024-06-15 14:36:13,092][1653645] Updated weights for policy 0, policy_version 239200 (0.0012) [2024-06-15 14:36:14,853][1653645] Updated weights for policy 0, policy_version 239274 (0.0091) [2024-06-15 14:36:15,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 42052.4, 300 sec: 43655.5). Total num frames: 490110976. Throughput: 0: 10251.4. Samples: 122579456. 
Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:36:16,789][1653645] Updated weights for policy 0, policy_version 239358 (0.0024) [2024-06-15 14:36:20,958][1648982] Fps is (10 sec: 49153.1, 60 sec: 41506.3, 300 sec: 42765.0). Total num frames: 490209280. Throughput: 0: 10087.5. Samples: 122609152. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:20,963][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 14:36:24,869][1653645] Updated weights for policy 0, policy_version 239414 (0.0022) [2024-06-15 14:36:25,143][1651596] Signal inference workers to stop experience collection... (12400 times) [2024-06-15 14:36:25,241][1653645] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-06-15 14:36:25,354][1651596] Signal inference workers to resume experience collection... (12400 times) [2024-06-15 14:36:25,355][1653645] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-06-15 14:36:25,958][1648982] Fps is (10 sec: 29491.0, 60 sec: 38775.3, 300 sec: 43209.3). Total num frames: 490405888. Throughput: 0: 10740.6. Samples: 122682880. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:25,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:36:26,599][1653645] Updated weights for policy 0, policy_version 239488 (0.0024) [2024-06-15 14:36:28,414][1653645] Updated weights for policy 0, policy_version 239570 (0.0012) [2024-06-15 14:36:29,187][1653645] Updated weights for policy 0, policy_version 239611 (0.0035) [2024-06-15 14:36:30,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 490733568. Throughput: 0: 10228.6. Samples: 122739712. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:36:35,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 40434.0, 300 sec: 42876.1). 
Total num frames: 490799104. Throughput: 0: 10490.2. Samples: 122778624. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:35,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:36:36,553][1653645] Updated weights for policy 0, policy_version 239665 (0.0012) [2024-06-15 14:36:38,202][1653645] Updated weights for policy 0, policy_version 239731 (0.0014) [2024-06-15 14:36:40,076][1653645] Updated weights for policy 0, policy_version 239824 (0.0020) [2024-06-15 14:36:40,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 491225088. Throughput: 0: 10513.1. Samples: 122839040. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:36:45,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 491257856. Throughput: 0: 10592.7. Samples: 122910720. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:36:47,486][1653645] Updated weights for policy 0, policy_version 239888 (0.0029) [2024-06-15 14:36:49,627][1653645] Updated weights for policy 0, policy_version 239974 (0.0045) [2024-06-15 14:36:50,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 42052.3, 300 sec: 43320.4). Total num frames: 491585536. Throughput: 0: 10808.9. Samples: 122942976. Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:36:51,375][1653645] Updated weights for policy 0, policy_version 240064 (0.0119) [2024-06-15 14:36:52,747][1653645] Updated weights for policy 0, policy_version 240128 (0.0016) [2024-06-15 14:36:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 491782144. Throughput: 0: 10592.8. Samples: 123001344. 
Policy #0 lag: (min: 15.0, avg: 74.8, max: 271.0) [2024-06-15 14:36:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:36:55,962][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000240128_491782144.pth... [2024-06-15 14:36:56,046][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000235072_481427456.pth [2024-06-15 14:36:59,746][1653645] Updated weights for policy 0, policy_version 240180 (0.0012) [2024-06-15 14:37:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 43320.4). Total num frames: 491978752. Throughput: 0: 11104.7. Samples: 123079168. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:37:01,163][1653645] Updated weights for policy 0, policy_version 240246 (0.0027) [2024-06-15 14:37:01,673][1651596] Signal inference workers to stop experience collection... (12450 times) [2024-06-15 14:37:01,688][1653645] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-06-15 14:37:01,907][1651596] Signal inference workers to resume experience collection... (12450 times) [2024-06-15 14:37:01,910][1653645] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-06-15 14:37:03,080][1653645] Updated weights for policy 0, policy_version 240336 (0.0117) [2024-06-15 14:37:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 492306432. Throughput: 0: 10979.6. Samples: 123103232. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:37:10,471][1653645] Updated weights for policy 0, policy_version 240400 (0.0013) [2024-06-15 14:37:10,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44236.9, 300 sec: 42876.1). Total num frames: 492371968. Throughput: 0: 11059.2. Samples: 123180544. 
Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:37:13,578][1653645] Updated weights for policy 0, policy_version 240514 (0.0013) [2024-06-15 14:37:15,881][1653645] Updated weights for policy 0, policy_version 240608 (0.0014) [2024-06-15 14:37:15,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 492765184. Throughput: 0: 11025.1. Samples: 123235840. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:37:20,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 492830720. Throughput: 0: 10968.3. Samples: 123272192. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:37:22,322][1653645] Updated weights for policy 0, policy_version 240650 (0.0013) [2024-06-15 14:37:24,229][1653645] Updated weights for policy 0, policy_version 240736 (0.0011) [2024-06-15 14:37:25,166][1653645] Updated weights for policy 0, policy_version 240770 (0.0013) [2024-06-15 14:37:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.2, 300 sec: 43431.5). Total num frames: 493158400. Throughput: 0: 11332.3. Samples: 123348992. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:37:27,791][1653645] Updated weights for policy 0, policy_version 240864 (0.0015) [2024-06-15 14:37:30,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.8, 300 sec: 43098.2). Total num frames: 493355008. Throughput: 0: 10922.7. Samples: 123402240. 
Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:37:35,154][1653645] Updated weights for policy 0, policy_version 240944 (0.0014) [2024-06-15 14:37:35,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.2, 300 sec: 43209.3). Total num frames: 493518848. Throughput: 0: 11081.9. Samples: 123441664. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:35,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:37:37,108][1653645] Updated weights for policy 0, policy_version 241016 (0.0012) [2024-06-15 14:37:38,986][1653645] Updated weights for policy 0, policy_version 241076 (0.0016) [2024-06-15 14:37:40,760][1653645] Updated weights for policy 0, policy_version 241136 (0.0025) [2024-06-15 14:37:40,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 493846528. Throughput: 0: 10968.2. Samples: 123494912. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:37:45,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 493879296. Throughput: 0: 10831.6. Samples: 123566592. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:45,962][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:37:46,770][1651596] Signal inference workers to stop experience collection... (12500 times) [2024-06-15 14:37:46,826][1653645] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-06-15 14:37:47,006][1651596] Signal inference workers to resume experience collection... 
(12500 times) [2024-06-15 14:37:47,007][1653645] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-06-15 14:37:47,180][1653645] Updated weights for policy 0, policy_version 241188 (0.0011) [2024-06-15 14:37:49,252][1653645] Updated weights for policy 0, policy_version 241264 (0.0026) [2024-06-15 14:37:50,587][1653645] Updated weights for policy 0, policy_version 241296 (0.0011) [2024-06-15 14:37:50,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 43144.5, 300 sec: 43320.5). Total num frames: 494174208. Throughput: 0: 11036.4. Samples: 123599872. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:50,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 14:37:52,064][1653645] Updated weights for policy 0, policy_version 241345 (0.0015) [2024-06-15 14:37:53,506][1653645] Updated weights for policy 0, policy_version 241407 (0.0014) [2024-06-15 14:37:55,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 494403584. Throughput: 0: 10535.8. Samples: 123654656. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:37:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:38:00,530][1653645] Updated weights for policy 0, policy_version 241475 (0.0110) [2024-06-15 14:38:00,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 494567424. Throughput: 0: 10934.1. Samples: 123727872. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:38:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:38:03,057][1653645] Updated weights for policy 0, policy_version 241539 (0.0018) [2024-06-15 14:38:05,181][1653645] Updated weights for policy 0, policy_version 241632 (0.0013) [2024-06-15 14:38:05,896][1653645] Updated weights for policy 0, policy_version 241664 (0.0042) [2024-06-15 14:38:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 494927872. Throughput: 0: 10717.9. 
Samples: 123754496. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:38:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:38:10,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 494927872. Throughput: 0: 10615.4. Samples: 123826688. Policy #0 lag: (min: 7.0, avg: 57.2, max: 263.0) [2024-06-15 14:38:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:38:12,849][1653645] Updated weights for policy 0, policy_version 241731 (0.0014) [2024-06-15 14:38:14,018][1653645] Updated weights for policy 0, policy_version 241790 (0.0127) [2024-06-15 14:38:15,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 42052.1, 300 sec: 43431.5). Total num frames: 495288320. Throughput: 0: 10729.2. Samples: 123885056. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:38:16,398][1653645] Updated weights for policy 0, policy_version 241856 (0.0077) [2024-06-15 14:38:17,793][1653645] Updated weights for policy 0, policy_version 241917 (0.0014) [2024-06-15 14:38:20,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 495452160. Throughput: 0: 10387.9. Samples: 123909120. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:38:24,355][1653645] Updated weights for policy 0, policy_version 241976 (0.0013) [2024-06-15 14:38:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 42052.1, 300 sec: 43209.3). Total num frames: 495681536. Throughput: 0: 10865.7. Samples: 123983872. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:38:26,035][1653645] Updated weights for policy 0, policy_version 242048 (0.0012) [2024-06-15 14:38:27,830][1651596] Signal inference workers to stop experience collection... 
(12550 times) [2024-06-15 14:38:27,884][1653645] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-06-15 14:38:28,032][1651596] Signal inference workers to resume experience collection... (12550 times) [2024-06-15 14:38:28,033][1653645] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-06-15 14:38:28,327][1653645] Updated weights for policy 0, policy_version 242112 (0.0107) [2024-06-15 14:38:29,588][1653645] Updated weights for policy 0, policy_version 242170 (0.0012) [2024-06-15 14:38:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 495976448. Throughput: 0: 10672.4. Samples: 124046848. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:38:34,785][1653645] Updated weights for policy 0, policy_version 242224 (0.0014) [2024-06-15 14:38:35,852][1653645] Updated weights for policy 0, policy_version 242256 (0.0012) [2024-06-15 14:38:35,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 496140288. Throughput: 0: 10899.9. Samples: 124090368. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:38:38,996][1653645] Updated weights for policy 0, policy_version 242320 (0.0014) [2024-06-15 14:38:40,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 43320.5). Total num frames: 496402432. Throughput: 0: 11036.4. Samples: 124151296. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:38:41,311][1653645] Updated weights for policy 0, policy_version 242404 (0.0014) [2024-06-15 14:38:45,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 496500736. Throughput: 0: 10945.4. Samples: 124220416. 
Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:38:46,847][1653645] Updated weights for policy 0, policy_version 242469 (0.0023) [2024-06-15 14:38:48,959][1653645] Updated weights for policy 0, policy_version 242520 (0.0013) [2024-06-15 14:38:50,644][1653645] Updated weights for policy 0, policy_version 242583 (0.0013) [2024-06-15 14:38:50,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44236.7, 300 sec: 43320.4). Total num frames: 496828416. Throughput: 0: 11059.2. Samples: 124252160. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:38:52,548][1653645] Updated weights for policy 0, policy_version 242656 (0.0014) [2024-06-15 14:38:55,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 497025024. Throughput: 0: 10729.3. Samples: 124309504. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:38:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:38:55,962][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000242688_497025024.pth... [2024-06-15 14:38:56,030][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000237632_486670336.pth [2024-06-15 14:38:57,200][1653645] Updated weights for policy 0, policy_version 242690 (0.0013) [2024-06-15 14:39:00,958][1648982] Fps is (10 sec: 32766.8, 60 sec: 43144.1, 300 sec: 43098.2). Total num frames: 497156096. Throughput: 0: 11059.1. Samples: 124382720. 
Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:39:00,959][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 14:39:01,582][1653645] Updated weights for policy 0, policy_version 242784 (0.0101) [2024-06-15 14:39:02,658][1653645] Updated weights for policy 0, policy_version 242833 (0.0013) [2024-06-15 14:39:03,889][1653645] Updated weights for policy 0, policy_version 242880 (0.0078) [2024-06-15 14:39:05,402][1653645] Updated weights for policy 0, policy_version 242937 (0.0037) [2024-06-15 14:39:05,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43320.5). Total num frames: 497549312. Throughput: 0: 11184.3. Samples: 124412416. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:39:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:39:10,960][1648982] Fps is (10 sec: 52419.7, 60 sec: 45873.6, 300 sec: 43097.9). Total num frames: 497680384. Throughput: 0: 11183.9. Samples: 124487168. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:39:10,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:39:12,242][1653645] Updated weights for policy 0, policy_version 243024 (0.0014) [2024-06-15 14:39:12,979][1651596] Signal inference workers to stop experience collection... (12600 times) [2024-06-15 14:39:13,038][1653645] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-06-15 14:39:13,252][1651596] Signal inference workers to resume experience collection... (12600 times) [2024-06-15 14:39:13,253][1653645] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-06-15 14:39:14,732][1653645] Updated weights for policy 0, policy_version 243127 (0.0013) [2024-06-15 14:39:15,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44237.0, 300 sec: 43431.5). Total num frames: 497942528. Throughput: 0: 11104.7. Samples: 124546560. 
Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:39:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:39:17,182][1653645] Updated weights for policy 0, policy_version 243192 (0.0155) [2024-06-15 14:39:20,958][1648982] Fps is (10 sec: 45883.6, 60 sec: 44782.6, 300 sec: 42876.0). Total num frames: 498139136. Throughput: 0: 11013.6. Samples: 124585984. Policy #0 lag: (min: 15.0, avg: 80.7, max: 271.0) [2024-06-15 14:39:20,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:39:21,128][1653645] Updated weights for policy 0, policy_version 243237 (0.0018) [2024-06-15 14:39:24,261][1653645] Updated weights for policy 0, policy_version 243312 (0.0014) [2024-06-15 14:39:25,082][1653645] Updated weights for policy 0, policy_version 243344 (0.0013) [2024-06-15 14:39:25,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45329.3, 300 sec: 43653.6). Total num frames: 498401280. Throughput: 0: 11229.9. Samples: 124656640. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:39:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:39:26,369][1653645] Updated weights for policy 0, policy_version 243392 (0.0013) [2024-06-15 14:39:28,695][1653645] Updated weights for policy 0, policy_version 243455 (0.0014) [2024-06-15 14:39:30,958][1648982] Fps is (10 sec: 45877.6, 60 sec: 43690.7, 300 sec: 43209.4). Total num frames: 498597888. Throughput: 0: 11173.0. Samples: 124723200. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:39:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:39:35,415][1653645] Updated weights for policy 0, policy_version 243521 (0.0013) [2024-06-15 14:39:35,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.7, 300 sec: 43215.9). Total num frames: 498761728. Throughput: 0: 11229.9. Samples: 124757504. 
Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:39:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:39:37,093][1653645] Updated weights for policy 0, policy_version 243585 (0.0012) [2024-06-15 14:39:38,214][1653645] Updated weights for policy 0, policy_version 243642 (0.0118) [2024-06-15 14:39:40,340][1653645] Updated weights for policy 0, policy_version 243711 (0.0012) [2024-06-15 14:39:40,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 45329.0, 300 sec: 43764.7). Total num frames: 499122176. Throughput: 0: 11332.2. Samples: 124819456. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:39:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:39:45,262][1653645] Updated weights for policy 0, policy_version 243766 (0.0012) [2024-06-15 14:39:45,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.1, 300 sec: 43098.3). Total num frames: 499253248. Throughput: 0: 11070.7. Samples: 124880896. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:39:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:39:48,897][1653645] Updated weights for policy 0, policy_version 243830 (0.0013) [2024-06-15 14:39:50,952][1653645] Updated weights for policy 0, policy_version 243891 (0.0014) [2024-06-15 14:39:50,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 44237.0, 300 sec: 43653.7). Total num frames: 499482624. Throughput: 0: 11252.6. Samples: 124918784. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:39:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:39:52,679][1653645] Updated weights for policy 0, policy_version 243964 (0.0102) [2024-06-15 14:39:55,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.4, 300 sec: 42876.0). Total num frames: 499646464. Throughput: 0: 10820.7. Samples: 124974080. 
Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:39:55,959][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 14:39:57,831][1653645] Updated weights for policy 0, policy_version 244022 (0.0013) [2024-06-15 14:40:00,815][1651596] Signal inference workers to stop experience collection... (12650 times) [2024-06-15 14:40:00,870][1653645] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-06-15 14:40:00,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44237.2, 300 sec: 43209.3). Total num frames: 499810304. Throughput: 0: 11082.0. Samples: 125045248. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:40:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:40:01,133][1651596] Signal inference workers to resume experience collection... (12650 times) [2024-06-15 14:40:01,134][1653645] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-06-15 14:40:01,797][1653645] Updated weights for policy 0, policy_version 244093 (0.0079) [2024-06-15 14:40:03,885][1653645] Updated weights for policy 0, policy_version 244162 (0.0012) [2024-06-15 14:40:05,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 500170752. Throughput: 0: 10809.0. Samples: 125072384. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:40:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:40:08,484][1653645] Updated weights for policy 0, policy_version 244227 (0.0012) [2024-06-15 14:40:10,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43692.3, 300 sec: 43098.3). Total num frames: 500301824. Throughput: 0: 10706.5. Samples: 125138432. 
Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:40:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:40:12,331][1653645] Updated weights for policy 0, policy_version 244291 (0.0012) [2024-06-15 14:40:13,777][1653645] Updated weights for policy 0, policy_version 244353 (0.0013) [2024-06-15 14:40:15,157][1653645] Updated weights for policy 0, policy_version 244405 (0.0012) [2024-06-15 14:40:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 500563968. Throughput: 0: 10717.8. Samples: 125205504. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:40:15,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:40:17,084][1653645] Updated weights for policy 0, policy_version 244465 (0.0013) [2024-06-15 14:40:20,192][1653645] Updated weights for policy 0, policy_version 244484 (0.0024) [2024-06-15 14:40:20,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 43691.1, 300 sec: 42987.2). Total num frames: 500760576. Throughput: 0: 10615.5. Samples: 125235200. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:40:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:40:21,256][1653645] Updated weights for policy 0, policy_version 244535 (0.0011) [2024-06-15 14:40:24,859][1653645] Updated weights for policy 0, policy_version 244593 (0.0012) [2024-06-15 14:40:25,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 501022720. Throughput: 0: 11002.3. Samples: 125314560. 
Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:40:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:40:26,408][1653645] Updated weights for policy 0, policy_version 244656 (0.0012) [2024-06-15 14:40:27,231][1653645] Updated weights for policy 0, policy_version 244688 (0.0011) [2024-06-15 14:40:28,364][1653645] Updated weights for policy 0, policy_version 244736 (0.0012) [2024-06-15 14:40:30,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 43547.0). Total num frames: 501219328. Throughput: 0: 11025.1. Samples: 125377024. Policy #0 lag: (min: 47.0, avg: 146.7, max: 303.0) [2024-06-15 14:40:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:40:35,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 501350400. Throughput: 0: 10888.5. Samples: 125408768. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:40:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:40:36,180][1653645] Updated weights for policy 0, policy_version 244816 (0.0033) [2024-06-15 14:40:38,699][1653645] Updated weights for policy 0, policy_version 244920 (0.0080) [2024-06-15 14:40:40,095][1653645] Updated weights for policy 0, policy_version 244964 (0.0013) [2024-06-15 14:40:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 501743616. Throughput: 0: 11036.5. Samples: 125470720. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:40:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:40:43,633][1653645] Updated weights for policy 0, policy_version 244993 (0.0012) [2024-06-15 14:40:45,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 501874688. Throughput: 0: 11013.7. Samples: 125540864. 
Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:40:45,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:40:47,777][1651596] Signal inference workers to stop experience collection... (12700 times) [2024-06-15 14:40:47,861][1653645] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-06-15 14:40:48,147][1651596] Signal inference workers to resume experience collection... (12700 times) [2024-06-15 14:40:48,148][1653645] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-06-15 14:40:48,229][1653645] Updated weights for policy 0, policy_version 245072 (0.0092) [2024-06-15 14:40:49,993][1653645] Updated weights for policy 0, policy_version 245142 (0.0097) [2024-06-15 14:40:50,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 502104064. Throughput: 0: 11241.2. Samples: 125578240. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:40:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:40:51,098][1653645] Updated weights for policy 0, policy_version 245183 (0.0016) [2024-06-15 14:40:52,465][1653645] Updated weights for policy 0, policy_version 245241 (0.0025) [2024-06-15 14:40:55,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 502267904. Throughput: 0: 11081.9. Samples: 125637120. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:40:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:40:56,476][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000245280_502333440.pth... [2024-06-15 14:40:56,625][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000240128_491782144.pth [2024-06-15 14:40:57,189][1653645] Updated weights for policy 0, policy_version 245309 (0.0014) [2024-06-15 14:41:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44782.9, 300 sec: 43431.5). Total num frames: 502497280. 
Throughput: 0: 10990.9. Samples: 125700096. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:41:01,047][1653645] Updated weights for policy 0, policy_version 245370 (0.0011) [2024-06-15 14:41:02,869][1653645] Updated weights for policy 0, policy_version 245428 (0.0014) [2024-06-15 14:41:04,461][1653645] Updated weights for policy 0, policy_version 245472 (0.0018) [2024-06-15 14:41:05,958][1648982] Fps is (10 sec: 52430.8, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 502792192. Throughput: 0: 11047.8. Samples: 125732352. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:41:08,341][1653645] Updated weights for policy 0, policy_version 245536 (0.0113) [2024-06-15 14:41:10,962][1648982] Fps is (10 sec: 42582.3, 60 sec: 43687.9, 300 sec: 43430.9). Total num frames: 502923264. Throughput: 0: 10785.2. Samples: 125799936. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:10,962][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:41:11,892][1653645] Updated weights for policy 0, policy_version 245602 (0.0026) [2024-06-15 14:41:14,291][1653645] Updated weights for policy 0, policy_version 245664 (0.0098) [2024-06-15 14:41:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 503185408. Throughput: 0: 10865.8. Samples: 125865984. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:41:16,927][1653645] Updated weights for policy 0, policy_version 245744 (0.0022) [2024-06-15 14:41:20,959][1648982] Fps is (10 sec: 45892.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 503382016. Throughput: 0: 10797.5. Samples: 125894656. 
Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:20,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:41:21,290][1653645] Updated weights for policy 0, policy_version 245814 (0.0013) [2024-06-15 14:41:24,523][1653645] Updated weights for policy 0, policy_version 245877 (0.0013) [2024-06-15 14:41:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 43542.6). Total num frames: 503578624. Throughput: 0: 10865.8. Samples: 125959680. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:41:26,584][1653645] Updated weights for policy 0, policy_version 245920 (0.0012) [2024-06-15 14:41:29,721][1653645] Updated weights for policy 0, policy_version 246014 (0.0013) [2024-06-15 14:41:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 44209.1). Total num frames: 503840768. Throughput: 0: 10740.6. Samples: 126024192. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:41:33,318][1653645] Updated weights for policy 0, policy_version 246071 (0.0012) [2024-06-15 14:41:34,798][1651596] Signal inference workers to stop experience collection... (12750 times) [2024-06-15 14:41:34,848][1653645] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-06-15 14:41:35,124][1651596] Signal inference workers to resume experience collection... (12750 times) [2024-06-15 14:41:35,125][1653645] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-06-15 14:41:35,491][1653645] Updated weights for policy 0, policy_version 246112 (0.0010) [2024-06-15 14:41:35,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44782.9, 300 sec: 43431.5). Total num frames: 504037376. Throughput: 0: 10638.2. Samples: 126056960. 
Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:41:37,836][1653645] Updated weights for policy 0, policy_version 246148 (0.0012) [2024-06-15 14:41:38,663][1653645] Updated weights for policy 0, policy_version 246203 (0.0078) [2024-06-15 14:41:40,562][1653645] Updated weights for policy 0, policy_version 246265 (0.0013) [2024-06-15 14:41:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 504365056. Throughput: 0: 11116.2. Samples: 126137344. Policy #0 lag: (min: 10.0, avg: 128.5, max: 266.0) [2024-06-15 14:41:40,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:41:44,079][1653645] Updated weights for policy 0, policy_version 246325 (0.0045) [2024-06-15 14:41:45,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 504496128. Throughput: 0: 11138.9. Samples: 126201344. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:41:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:41:46,622][1653645] Updated weights for policy 0, policy_version 246368 (0.0012) [2024-06-15 14:41:49,911][1653645] Updated weights for policy 0, policy_version 246418 (0.0014) [2024-06-15 14:41:50,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 504758272. Throughput: 0: 11116.1. Samples: 126232576. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:41:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:41:51,334][1653645] Updated weights for policy 0, policy_version 246480 (0.0014) [2024-06-15 14:41:55,230][1653645] Updated weights for policy 0, policy_version 246583 (0.0016) [2024-06-15 14:41:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45875.4, 300 sec: 44209.0). Total num frames: 505020416. Throughput: 0: 11151.1. Samples: 126301696. 
Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:41:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:41:58,707][1653645] Updated weights for policy 0, policy_version 246624 (0.0082) [2024-06-15 14:42:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 505151488. Throughput: 0: 11218.5. Samples: 126370816. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:42:01,917][1653645] Updated weights for policy 0, policy_version 246691 (0.0026) [2024-06-15 14:42:03,463][1653645] Updated weights for policy 0, policy_version 246736 (0.0037) [2024-06-15 14:42:05,902][1653645] Updated weights for policy 0, policy_version 246800 (0.0014) [2024-06-15 14:42:05,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 505446400. Throughput: 0: 11332.2. Samples: 126404608. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:05,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:42:07,050][1653645] Updated weights for policy 0, policy_version 246841 (0.0013) [2024-06-15 14:42:10,199][1653645] Updated weights for policy 0, policy_version 246881 (0.0012) [2024-06-15 14:42:10,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 45878.1, 300 sec: 43764.7). Total num frames: 505675776. Throughput: 0: 11423.3. Samples: 126473728. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:42:13,858][1653645] Updated weights for policy 0, policy_version 246960 (0.0014) [2024-06-15 14:42:15,481][1653645] Updated weights for policy 0, policy_version 247028 (0.0013) [2024-06-15 14:42:15,971][1648982] Fps is (10 sec: 49090.4, 60 sec: 45865.6, 300 sec: 44429.3). Total num frames: 505937920. Throughput: 0: 11499.7. Samples: 126541824. 
Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:15,973][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:42:17,478][1653645] Updated weights for policy 0, policy_version 247088 (0.0023) [2024-06-15 14:42:20,771][1651596] Signal inference workers to stop experience collection... (12800 times) [2024-06-15 14:42:20,864][1653645] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-06-15 14:42:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 43875.8). Total num frames: 506101760. Throughput: 0: 11457.4. Samples: 126572544. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:20,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 14:42:21,023][1651596] Signal inference workers to resume experience collection... (12800 times) [2024-06-15 14:42:21,024][1653645] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-06-15 14:42:21,408][1653645] Updated weights for policy 0, policy_version 247152 (0.0012) [2024-06-15 14:42:25,582][1653645] Updated weights for policy 0, policy_version 247216 (0.0013) [2024-06-15 14:42:25,958][1648982] Fps is (10 sec: 39370.2, 60 sec: 45875.1, 300 sec: 43986.8). Total num frames: 506331136. Throughput: 0: 11343.6. Samples: 126647808. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:25,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:42:27,531][1653645] Updated weights for policy 0, policy_version 247265 (0.0018) [2024-06-15 14:42:28,653][1653645] Updated weights for policy 0, policy_version 247297 (0.0011) [2024-06-15 14:42:30,222][1653645] Updated weights for policy 0, policy_version 247359 (0.0012) [2024-06-15 14:42:30,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 506593280. Throughput: 0: 11150.2. Samples: 126703104. 
Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:42:35,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 44782.9, 300 sec: 43653.6). Total num frames: 506724352. Throughput: 0: 11332.3. Samples: 126742528. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:42:36,446][1653645] Updated weights for policy 0, policy_version 247425 (0.0013) [2024-06-15 14:42:38,682][1653645] Updated weights for policy 0, policy_version 247504 (0.0012) [2024-06-15 14:42:40,524][1653645] Updated weights for policy 0, policy_version 247556 (0.0158) [2024-06-15 14:42:40,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 507019264. Throughput: 0: 11207.1. Samples: 126806016. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:40,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 14:42:44,529][1653645] Updated weights for policy 0, policy_version 247632 (0.0013) [2024-06-15 14:42:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.1, 300 sec: 44320.1). Total num frames: 507248640. Throughput: 0: 11104.7. Samples: 126870528. Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:45,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:42:49,166][1653645] Updated weights for policy 0, policy_version 247712 (0.0023) [2024-06-15 14:42:49,796][1653645] Updated weights for policy 0, policy_version 247744 (0.0025) [2024-06-15 14:42:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 507412480. Throughput: 0: 11207.1. Samples: 126908928. 
Policy #0 lag: (min: 10.0, avg: 135.0, max: 266.0) [2024-06-15 14:42:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:42:51,910][1653645] Updated weights for policy 0, policy_version 247805 (0.0013) [2024-06-15 14:42:53,655][1653645] Updated weights for policy 0, policy_version 247846 (0.0012) [2024-06-15 14:42:55,960][1648982] Fps is (10 sec: 39315.3, 60 sec: 43689.5, 300 sec: 44319.9). Total num frames: 507641856. Throughput: 0: 10990.6. Samples: 126968320. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:42:55,965][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:42:55,978][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000247872_507641856.pth... [2024-06-15 14:42:56,049][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000242688_497025024.pth [2024-06-15 14:42:57,203][1653645] Updated weights for policy 0, policy_version 247906 (0.0016) [2024-06-15 14:43:00,638][1653645] Updated weights for policy 0, policy_version 247952 (0.0013) [2024-06-15 14:43:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 507805696. Throughput: 0: 11130.6. Samples: 127042560. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:43:02,590][1653645] Updated weights for policy 0, policy_version 248002 (0.0011) [2024-06-15 14:43:03,733][1653645] Updated weights for policy 0, policy_version 248052 (0.0013) [2024-06-15 14:43:05,652][1653645] Updated weights for policy 0, policy_version 248096 (0.0020) [2024-06-15 14:43:05,958][1648982] Fps is (10 sec: 45883.0, 60 sec: 44236.9, 300 sec: 44653.4). Total num frames: 508100608. Throughput: 0: 11036.5. Samples: 127069184. 
Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:43:08,412][1653645] Updated weights for policy 0, policy_version 248144 (0.0013) [2024-06-15 14:43:10,963][1648982] Fps is (10 sec: 49124.2, 60 sec: 43686.6, 300 sec: 44097.2). Total num frames: 508297216. Throughput: 0: 10898.6. Samples: 127138304. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:10,964][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:43:11,797][1651596] Signal inference workers to stop experience collection... (12850 times) [2024-06-15 14:43:11,831][1653645] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-06-15 14:43:12,099][1651596] Signal inference workers to resume experience collection... (12850 times) [2024-06-15 14:43:12,099][1653645] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-06-15 14:43:12,102][1653645] Updated weights for policy 0, policy_version 248208 (0.0015) [2024-06-15 14:43:14,254][1653645] Updated weights for policy 0, policy_version 248272 (0.0026) [2024-06-15 14:43:15,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43699.9, 300 sec: 44431.2). Total num frames: 508559360. Throughput: 0: 11013.7. Samples: 127198720. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:43:17,209][1653645] Updated weights for policy 0, policy_version 248339 (0.0089) [2024-06-15 14:43:20,958][1648982] Fps is (10 sec: 39344.2, 60 sec: 43144.7, 300 sec: 44098.0). Total num frames: 508690432. Throughput: 0: 10934.1. Samples: 127234560. 
Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:43:21,376][1653645] Updated weights for policy 0, policy_version 248416 (0.0104) [2024-06-15 14:43:24,567][1653645] Updated weights for policy 0, policy_version 248480 (0.0104) [2024-06-15 14:43:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 508952576. Throughput: 0: 10956.8. Samples: 127299072. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:43:26,200][1653645] Updated weights for policy 0, policy_version 248516 (0.0011) [2024-06-15 14:43:27,578][1653645] Updated weights for policy 0, policy_version 248574 (0.0012) [2024-06-15 14:43:30,559][1653645] Updated weights for policy 0, policy_version 248632 (0.0013) [2024-06-15 14:43:30,959][1648982] Fps is (10 sec: 52420.9, 60 sec: 43689.6, 300 sec: 44319.9). Total num frames: 509214720. Throughput: 0: 11013.4. Samples: 127366144. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:30,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:43:33,317][1653645] Updated weights for policy 0, policy_version 248696 (0.0024) [2024-06-15 14:43:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 509378560. Throughput: 0: 10945.4. Samples: 127401472. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:43:36,715][1653645] Updated weights for policy 0, policy_version 248761 (0.0014) [2024-06-15 14:43:38,924][1653645] Updated weights for policy 0, policy_version 248825 (0.0012) [2024-06-15 14:43:40,958][1648982] Fps is (10 sec: 39326.4, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 509607936. Throughput: 0: 10923.0. Samples: 127459840. 
Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:40,959][1648982] Avg episode reward: [(0, '37.540')] [2024-06-15 14:43:42,854][1653645] Updated weights for policy 0, policy_version 248890 (0.0014) [2024-06-15 14:43:45,588][1653645] Updated weights for policy 0, policy_version 248960 (0.0015) [2024-06-15 14:43:45,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 509870080. Throughput: 0: 10752.0. Samples: 127526400. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:43:49,013][1653645] Updated weights for policy 0, policy_version 249018 (0.0014) [2024-06-15 14:43:50,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44782.7, 300 sec: 44320.1). Total num frames: 510099456. Throughput: 0: 10945.3. Samples: 127561728. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:50,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:43:51,152][1653645] Updated weights for policy 0, policy_version 249084 (0.0025) [2024-06-15 14:43:55,091][1653645] Updated weights for policy 0, policy_version 249144 (0.0012) [2024-06-15 14:43:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43691.9, 300 sec: 44431.3). Total num frames: 510263296. Throughput: 0: 10878.5. Samples: 127627776. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:43:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:43:57,412][1653645] Updated weights for policy 0, policy_version 249209 (0.0013) [2024-06-15 14:44:00,179][1651596] Signal inference workers to stop experience collection... (12900 times) [2024-06-15 14:44:00,245][1653645] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-06-15 14:44:00,438][1651596] Signal inference workers to resume experience collection... 
(12900 times) [2024-06-15 14:44:00,438][1653645] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-06-15 14:44:00,646][1653645] Updated weights for policy 0, policy_version 249274 (0.0013) [2024-06-15 14:44:00,958][1648982] Fps is (10 sec: 42599.8, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 510525440. Throughput: 0: 10899.9. Samples: 127689216. Policy #0 lag: (min: 31.0, avg: 168.9, max: 287.0) [2024-06-15 14:44:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:44:03,266][1653645] Updated weights for policy 0, policy_version 249328 (0.0014) [2024-06-15 14:44:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 44098.3). Total num frames: 510689280. Throughput: 0: 10865.8. Samples: 127723520. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:44:06,108][1653645] Updated weights for policy 0, policy_version 249376 (0.0029) [2024-06-15 14:44:07,875][1653645] Updated weights for policy 0, policy_version 249409 (0.0018) [2024-06-15 14:44:10,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43694.8, 300 sec: 43986.9). Total num frames: 510918656. Throughput: 0: 10899.9. Samples: 127789568. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:44:11,548][1653645] Updated weights for policy 0, policy_version 249488 (0.0014) [2024-06-15 14:44:13,922][1653645] Updated weights for policy 0, policy_version 249542 (0.0022) [2024-06-15 14:44:14,883][1653645] Updated weights for policy 0, policy_version 249591 (0.0014) [2024-06-15 14:44:15,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.6, 300 sec: 44209.1). Total num frames: 511180800. Throughput: 0: 11059.5. Samples: 127863808. 
Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:44:18,576][1653645] Updated weights for policy 0, policy_version 249657 (0.0012) [2024-06-15 14:44:20,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 44097.9). Total num frames: 511410176. Throughput: 0: 10990.9. Samples: 127896064. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:44:21,033][1653645] Updated weights for policy 0, policy_version 249723 (0.0014) [2024-06-15 14:44:24,409][1653645] Updated weights for policy 0, policy_version 249776 (0.0013) [2024-06-15 14:44:25,778][1653645] Updated weights for policy 0, policy_version 249808 (0.0011) [2024-06-15 14:44:25,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 511606784. Throughput: 0: 11093.4. Samples: 127959040. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:44:26,769][1653645] Updated weights for policy 0, policy_version 249856 (0.0011) [2024-06-15 14:44:30,452][1653645] Updated weights for policy 0, policy_version 249919 (0.0014) [2024-06-15 14:44:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43691.6, 300 sec: 44320.1). Total num frames: 511836160. Throughput: 0: 11161.6. Samples: 128028672. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:44:32,516][1653645] Updated weights for policy 0, policy_version 249976 (0.0080) [2024-06-15 14:44:35,543][1653645] Updated weights for policy 0, policy_version 250020 (0.0131) [2024-06-15 14:44:35,957][1648982] Fps is (10 sec: 45875.6, 60 sec: 44783.1, 300 sec: 43875.8). Total num frames: 512065536. Throughput: 0: 11116.2. Samples: 128061952. 
Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:44:37,596][1653645] Updated weights for policy 0, policy_version 250064 (0.0013) [2024-06-15 14:44:40,143][1653645] Updated weights for policy 0, policy_version 250128 (0.0011) [2024-06-15 14:44:40,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 45329.2, 300 sec: 44320.1). Total num frames: 512327680. Throughput: 0: 11252.6. Samples: 128134144. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:44:41,341][1653645] Updated weights for policy 0, policy_version 250175 (0.0012) [2024-06-15 14:44:43,797][1653645] Updated weights for policy 0, policy_version 250225 (0.0014) [2024-06-15 14:44:45,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 512491520. Throughput: 0: 11434.6. Samples: 128203776. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:45,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:44:46,960][1653645] Updated weights for policy 0, policy_version 250299 (0.0013) [2024-06-15 14:44:49,169][1651596] Signal inference workers to stop experience collection... (12950 times) [2024-06-15 14:44:49,253][1653645] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-06-15 14:44:49,448][1651596] Signal inference workers to resume experience collection... (12950 times) [2024-06-15 14:44:49,448][1653645] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-06-15 14:44:50,031][1653645] Updated weights for policy 0, policy_version 250358 (0.0013) [2024-06-15 14:44:50,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 512753664. Throughput: 0: 11389.1. Samples: 128236032. 
Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:44:52,195][1653645] Updated weights for policy 0, policy_version 250388 (0.0012) [2024-06-15 14:44:54,644][1653645] Updated weights for policy 0, policy_version 250448 (0.0030) [2024-06-15 14:44:55,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 513015808. Throughput: 0: 11400.5. Samples: 128302592. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:44:55,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:44:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000250496_513015808.pth... [2024-06-15 14:44:56,006][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000245280_502333440.pth [2024-06-15 14:44:58,075][1653645] Updated weights for policy 0, policy_version 250515 (0.0011) [2024-06-15 14:45:00,932][1653645] Updated weights for policy 0, policy_version 250576 (0.0109) [2024-06-15 14:45:00,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 513179648. Throughput: 0: 11218.5. Samples: 128368640. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:45:00,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 14:45:03,303][1653645] Updated weights for policy 0, policy_version 250628 (0.0013) [2024-06-15 14:45:05,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 513409024. Throughput: 0: 11286.7. Samples: 128403968. 
Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:45:05,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:45:06,275][1653645] Updated weights for policy 0, policy_version 250693 (0.0014) [2024-06-15 14:45:07,615][1653645] Updated weights for policy 0, policy_version 250748 (0.0015) [2024-06-15 14:45:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 513605632. Throughput: 0: 11264.0. Samples: 128465920. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:45:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:45:11,320][1653645] Updated weights for policy 0, policy_version 250816 (0.0012) [2024-06-15 14:45:13,731][1653645] Updated weights for policy 0, policy_version 250872 (0.0017) [2024-06-15 14:45:15,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45329.0, 300 sec: 44542.2). Total num frames: 513900544. Throughput: 0: 11207.1. Samples: 128532992. Policy #0 lag: (min: 15.0, avg: 118.5, max: 271.0) [2024-06-15 14:45:15,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:45:16,029][1653645] Updated weights for policy 0, policy_version 250937 (0.0013) [2024-06-15 14:45:19,220][1653645] Updated weights for policy 0, policy_version 250992 (0.0013) [2024-06-15 14:45:20,957][1648982] Fps is (10 sec: 45875.8, 60 sec: 44236.9, 300 sec: 44209.1). Total num frames: 514064384. Throughput: 0: 11264.0. Samples: 128568832. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:45:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:45:23,308][1653645] Updated weights for policy 0, policy_version 251061 (0.0165) [2024-06-15 14:45:25,766][1653645] Updated weights for policy 0, policy_version 251126 (0.0014) [2024-06-15 14:45:25,959][1648982] Fps is (10 sec: 42598.7, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 514326528. Throughput: 0: 11070.6. Samples: 128632320. 
Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:45:25,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:45:27,656][1653645] Updated weights for policy 0, policy_version 251169 (0.0012) [2024-06-15 14:45:29,703][1653645] Updated weights for policy 0, policy_version 251204 (0.0028) [2024-06-15 14:45:30,869][1653645] Updated weights for policy 0, policy_version 251262 (0.0018) [2024-06-15 14:45:30,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 514588672. Throughput: 0: 11059.2. Samples: 128701440. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:45:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:45:35,308][1653645] Updated weights for policy 0, policy_version 251326 (0.0012) [2024-06-15 14:45:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 514719744. Throughput: 0: 11082.0. Samples: 128734720. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:45:35,962][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:45:37,880][1653645] Updated weights for policy 0, policy_version 251376 (0.0014) [2024-06-15 14:45:38,737][1651596] Signal inference workers to stop experience collection... (13000 times) [2024-06-15 14:45:38,841][1653645] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-06-15 14:45:39,047][1651596] Signal inference workers to resume experience collection... (13000 times) [2024-06-15 14:45:39,067][1653645] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-06-15 14:45:39,520][1653645] Updated weights for policy 0, policy_version 251424 (0.0013) [2024-06-15 14:45:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 514981888. Throughput: 0: 10968.2. Samples: 128796160. 
Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:45:40,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 14:45:42,801][1653645] Updated weights for policy 0, policy_version 251504 (0.0099) [2024-06-15 14:45:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 515112960. Throughput: 0: 10945.4. Samples: 128861184. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:45:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:45:46,797][1653645] Updated weights for policy 0, policy_version 251552 (0.0013) [2024-06-15 14:45:48,888][1653645] Updated weights for policy 0, policy_version 251592 (0.0013) [2024-06-15 14:45:50,547][1653645] Updated weights for policy 0, policy_version 251650 (0.0013) [2024-06-15 14:45:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 515407872. Throughput: 0: 10979.6. Samples: 128898048. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:45:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:45:51,641][1653645] Updated weights for policy 0, policy_version 251709 (0.0014) [2024-06-15 14:45:54,705][1653645] Updated weights for policy 0, policy_version 251771 (0.0035) [2024-06-15 14:45:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 515637248. Throughput: 0: 10968.2. Samples: 128959488. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:45:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:45:58,952][1653645] Updated weights for policy 0, policy_version 251824 (0.0012) [2024-06-15 14:46:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 515801088. Throughput: 0: 11116.1. Samples: 129033216. 
Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:46:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:46:01,853][1653645] Updated weights for policy 0, policy_version 251901 (0.0014) [2024-06-15 14:46:02,987][1653645] Updated weights for policy 0, policy_version 251954 (0.0013) [2024-06-15 14:46:04,922][1653645] Updated weights for policy 0, policy_version 251970 (0.0012) [2024-06-15 14:46:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.1, 300 sec: 44653.9). Total num frames: 516096000. Throughput: 0: 10956.8. Samples: 129061888. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:46:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:46:06,420][1653645] Updated weights for policy 0, policy_version 252032 (0.0013) [2024-06-15 14:46:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 516259840. Throughput: 0: 11104.7. Samples: 129132032. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:46:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:46:10,994][1653645] Updated weights for policy 0, policy_version 252084 (0.0010) [2024-06-15 14:46:12,957][1653645] Updated weights for policy 0, policy_version 252130 (0.0012) [2024-06-15 14:46:14,955][1653645] Updated weights for policy 0, policy_version 252213 (0.0103) [2024-06-15 14:46:15,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 516554752. Throughput: 0: 10911.2. Samples: 129192448. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:46:15,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:46:17,743][1653645] Updated weights for policy 0, policy_version 252272 (0.0012) [2024-06-15 14:46:20,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 516685824. Throughput: 0: 10979.5. Samples: 129228800. 
Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:46:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 14:46:22,860][1653645] Updated weights for policy 0, policy_version 252342 (0.0022) [2024-06-15 14:46:24,988][1653645] Updated weights for policy 0, policy_version 252390 (0.0058) [2024-06-15 14:46:25,232][1651596] Signal inference workers to stop experience collection... (13050 times) [2024-06-15 14:46:25,264][1653645] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-06-15 14:46:25,486][1651596] Signal inference workers to resume experience collection... (13050 times) [2024-06-15 14:46:25,487][1653645] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-06-15 14:46:25,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.6, 300 sec: 44542.2). Total num frames: 516980736. Throughput: 0: 11252.6. Samples: 129302528. Policy #0 lag: (min: 58.0, avg: 184.7, max: 314.0) [2024-06-15 14:46:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:46:26,912][1653645] Updated weights for policy 0, policy_version 252473 (0.0019) [2024-06-15 14:46:29,452][1653645] Updated weights for policy 0, policy_version 252534 (0.0092) [2024-06-15 14:46:30,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 517210112. Throughput: 0: 11286.8. Samples: 129369088. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:46:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:46:33,844][1653645] Updated weights for policy 0, policy_version 252579 (0.0012) [2024-06-15 14:46:35,811][1653645] Updated weights for policy 0, policy_version 252624 (0.0134) [2024-06-15 14:46:35,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 517373952. Throughput: 0: 11229.9. Samples: 129403392. 
Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:46:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:46:37,940][1653645] Updated weights for policy 0, policy_version 252705 (0.0013) [2024-06-15 14:46:40,904][1653645] Updated weights for policy 0, policy_version 252743 (0.0012) [2024-06-15 14:46:40,958][1648982] Fps is (10 sec: 39319.1, 60 sec: 43690.2, 300 sec: 44431.1). Total num frames: 517603328. Throughput: 0: 11252.5. Samples: 129465856. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:46:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:46:45,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 44236.6, 300 sec: 44097.9). Total num frames: 517767168. Throughput: 0: 11172.9. Samples: 129536000. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:46:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:46:45,997][1653645] Updated weights for policy 0, policy_version 252832 (0.0013) [2024-06-15 14:46:49,269][1653645] Updated weights for policy 0, policy_version 252928 (0.0012) [2024-06-15 14:46:50,524][1653645] Updated weights for policy 0, policy_version 252976 (0.0013) [2024-06-15 14:46:50,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45328.7, 300 sec: 44431.1). Total num frames: 518127616. Throughput: 0: 11150.1. Samples: 129563648. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:46:50,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:46:53,004][1653645] Updated weights for policy 0, policy_version 253010 (0.0011) [2024-06-15 14:46:53,810][1653645] Updated weights for policy 0, policy_version 253048 (0.0012) [2024-06-15 14:46:55,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 518258688. Throughput: 0: 11081.9. Samples: 129630720. 
Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:46:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:46:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000253056_518258688.pth... [2024-06-15 14:46:56,038][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000247872_507641856.pth [2024-06-15 14:46:58,417][1653645] Updated weights for policy 0, policy_version 253118 (0.0012) [2024-06-15 14:47:00,875][1653645] Updated weights for policy 0, policy_version 253188 (0.0012) [2024-06-15 14:47:00,960][1648982] Fps is (10 sec: 39316.3, 60 sec: 45327.6, 300 sec: 44319.8). Total num frames: 518520832. Throughput: 0: 11297.7. Samples: 129700864. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:47:00,960][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 14:47:02,181][1653645] Updated weights for policy 0, policy_version 253242 (0.0012) [2024-06-15 14:47:05,191][1653645] Updated weights for policy 0, policy_version 253296 (0.0012) [2024-06-15 14:47:05,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 518782976. Throughput: 0: 11195.7. Samples: 129732608. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:47:05,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 14:47:10,239][1653645] Updated weights for policy 0, policy_version 253360 (0.0118) [2024-06-15 14:47:10,958][1648982] Fps is (10 sec: 39329.0, 60 sec: 44236.8, 300 sec: 43988.8). Total num frames: 518914048. Throughput: 0: 11070.6. Samples: 129800704. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:47:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:47:11,622][1651596] Signal inference workers to stop experience collection... 
(13100 times) [2024-06-15 14:47:11,664][1653645] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-06-15 14:47:11,890][1651596] Signal inference workers to resume experience collection... (13100 times) [2024-06-15 14:47:11,891][1653645] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-06-15 14:47:12,826][1653645] Updated weights for policy 0, policy_version 253440 (0.0035) [2024-06-15 14:47:14,354][1653645] Updated weights for policy 0, policy_version 253499 (0.0014) [2024-06-15 14:47:15,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 519176192. Throughput: 0: 10854.4. Samples: 129857536. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:47:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:47:17,089][1653645] Updated weights for policy 0, policy_version 253552 (0.0014) [2024-06-15 14:47:20,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 519307264. Throughput: 0: 10877.1. Samples: 129892864. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:47:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:47:22,217][1653645] Updated weights for policy 0, policy_version 253602 (0.0014) [2024-06-15 14:47:24,548][1653645] Updated weights for policy 0, policy_version 253664 (0.0011) [2024-06-15 14:47:25,959][1648982] Fps is (10 sec: 42591.5, 60 sec: 43689.8, 300 sec: 44097.7). Total num frames: 519602176. Throughput: 0: 10990.7. Samples: 129960448. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:47:25,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:47:26,208][1653645] Updated weights for policy 0, policy_version 253730 (0.0012) [2024-06-15 14:47:29,505][1653645] Updated weights for policy 0, policy_version 253815 (0.0013) [2024-06-15 14:47:30,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 44431.2). 
Total num frames: 519831552. Throughput: 0: 10831.7. Samples: 130023424. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:47:30,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:47:34,282][1653645] Updated weights for policy 0, policy_version 253860 (0.0011) [2024-06-15 14:47:35,651][1653645] Updated weights for policy 0, policy_version 253904 (0.0012) [2024-06-15 14:47:35,958][1648982] Fps is (10 sec: 39327.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 519995392. Throughput: 0: 11036.6. Samples: 130060288. Policy #0 lag: (min: 109.0, avg: 199.0, max: 365.0) [2024-06-15 14:47:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:47:37,639][1653645] Updated weights for policy 0, policy_version 253985 (0.0014) [2024-06-15 14:47:40,593][1653645] Updated weights for policy 0, policy_version 254048 (0.0089) [2024-06-15 14:47:40,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.5, 300 sec: 44320.1). Total num frames: 520323072. Throughput: 0: 10979.6. Samples: 130124800. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:47:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:47:45,283][1653645] Updated weights for policy 0, policy_version 254081 (0.0013) [2024-06-15 14:47:45,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44237.0, 300 sec: 44097.9). Total num frames: 520421376. Throughput: 0: 11025.5. Samples: 130196992. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:47:45,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 14:47:47,186][1653645] Updated weights for policy 0, policy_version 254146 (0.0011) [2024-06-15 14:47:49,051][1653645] Updated weights for policy 0, policy_version 254240 (0.0013) [2024-06-15 14:47:49,813][1653645] Updated weights for policy 0, policy_version 254272 (0.0012) [2024-06-15 14:47:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43691.1, 300 sec: 44431.4). Total num frames: 520749056. Throughput: 0: 10911.3. 
Samples: 130223616. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:47:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:47:52,682][1653645] Updated weights for policy 0, policy_version 254330 (0.0013) [2024-06-15 14:47:55,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 520880128. Throughput: 0: 10922.7. Samples: 130292224. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:47:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:47:57,775][1651596] Signal inference workers to stop experience collection... (13150 times) [2024-06-15 14:47:57,824][1653645] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-06-15 14:47:58,068][1651596] Signal inference workers to resume experience collection... (13150 times) [2024-06-15 14:47:58,069][1653645] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-06-15 14:47:58,231][1653645] Updated weights for policy 0, policy_version 254390 (0.0030) [2024-06-15 14:47:59,568][1653645] Updated weights for policy 0, policy_version 254434 (0.0011) [2024-06-15 14:48:00,575][1653645] Updated weights for policy 0, policy_version 254485 (0.0016) [2024-06-15 14:48:00,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44784.4, 300 sec: 44431.2). Total num frames: 521207808. Throughput: 0: 11150.2. Samples: 130359296. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:48:01,540][1653645] Updated weights for policy 0, policy_version 254526 (0.0016) [2024-06-15 14:48:04,325][1653645] Updated weights for policy 0, policy_version 254583 (0.0100) [2024-06-15 14:48:05,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44432.0). Total num frames: 521404416. Throughput: 0: 11173.0. Samples: 130395648. 
Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:48:10,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 521568256. Throughput: 0: 11332.6. Samples: 130470400. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:48:11,232][1653645] Updated weights for policy 0, policy_version 254688 (0.0012) [2024-06-15 14:48:13,238][1653645] Updated weights for policy 0, policy_version 254753 (0.0036) [2024-06-15 14:48:15,377][1653645] Updated weights for policy 0, policy_version 254803 (0.0012) [2024-06-15 14:48:15,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 521895936. Throughput: 0: 11093.4. Samples: 130522624. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:48:16,053][1653645] Updated weights for policy 0, policy_version 254837 (0.0014) [2024-06-15 14:48:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 521961472. Throughput: 0: 11218.5. Samples: 130565120. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:48:21,089][1653645] Updated weights for policy 0, policy_version 254869 (0.0011) [2024-06-15 14:48:22,886][1653645] Updated weights for policy 0, policy_version 254944 (0.0096) [2024-06-15 14:48:24,727][1653645] Updated weights for policy 0, policy_version 255026 (0.0014) [2024-06-15 14:48:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45330.2, 300 sec: 44431.4). Total num frames: 522321920. Throughput: 0: 11116.1. Samples: 130625024. 
Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:48:26,970][1653645] Updated weights for policy 0, policy_version 255088 (0.0014) [2024-06-15 14:48:30,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 522452992. Throughput: 0: 11195.7. Samples: 130700800. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:30,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:48:33,190][1653645] Updated weights for policy 0, policy_version 255136 (0.0012) [2024-06-15 14:48:34,688][1653645] Updated weights for policy 0, policy_version 255187 (0.0013) [2024-06-15 14:48:35,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 522747904. Throughput: 0: 11434.7. Samples: 130738176. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:48:37,010][1653645] Updated weights for policy 0, policy_version 255296 (0.0027) [2024-06-15 14:48:37,889][1651596] Signal inference workers to stop experience collection... (13200 times) [2024-06-15 14:48:37,960][1653645] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-06-15 14:48:38,176][1651596] Signal inference workers to resume experience collection... (13200 times) [2024-06-15 14:48:38,177][1653645] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-06-15 14:48:39,318][1653645] Updated weights for policy 0, policy_version 255359 (0.0013) [2024-06-15 14:48:40,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 522977280. Throughput: 0: 10934.0. Samples: 130784256. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:48:45,974][1648982] Fps is (10 sec: 32713.3, 60 sec: 44224.5, 300 sec: 43984.4). 
Total num frames: 523075584. Throughput: 0: 11237.1. Samples: 130865152. Policy #0 lag: (min: 84.0, avg: 177.1, max: 340.0) [2024-06-15 14:48:45,975][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:48:46,301][1653645] Updated weights for policy 0, policy_version 255424 (0.0014) [2024-06-15 14:48:47,941][1653645] Updated weights for policy 0, policy_version 255493 (0.0014) [2024-06-15 14:48:50,482][1653645] Updated weights for policy 0, policy_version 255556 (0.0094) [2024-06-15 14:48:50,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 523403264. Throughput: 0: 10911.3. Samples: 130886656. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:48:50,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:48:51,647][1653645] Updated weights for policy 0, policy_version 255615 (0.0015) [2024-06-15 14:48:55,958][1648982] Fps is (10 sec: 42669.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 523501568. Throughput: 0: 10945.4. Samples: 130962944. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:48:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:48:55,962][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000255616_523501568.pth... [2024-06-15 14:48:56,058][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000250496_513015808.pth [2024-06-15 14:48:58,446][1653645] Updated weights for policy 0, policy_version 255680 (0.0030) [2024-06-15 14:49:00,446][1653645] Updated weights for policy 0, policy_version 255752 (0.0106) [2024-06-15 14:49:00,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 523829248. Throughput: 0: 10956.8. Samples: 131015680. 
Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:49:01,419][1653645] Updated weights for policy 0, policy_version 255806 (0.0012) [2024-06-15 14:49:03,600][1653645] Updated weights for policy 0, policy_version 255864 (0.0097) [2024-06-15 14:49:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 524025856. Throughput: 0: 10672.3. Samples: 131045376. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:49:10,676][1653645] Updated weights for policy 0, policy_version 255936 (0.0014) [2024-06-15 14:49:10,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 524156928. Throughput: 0: 11104.7. Samples: 131124736. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:49:12,235][1653645] Updated weights for policy 0, policy_version 256006 (0.0018) [2024-06-15 14:49:13,632][1653645] Updated weights for policy 0, policy_version 256060 (0.0010) [2024-06-15 14:49:15,234][1653645] Updated weights for policy 0, policy_version 256112 (0.0020) [2024-06-15 14:49:15,641][1653645] Updated weights for policy 0, policy_version 256128 (0.0010) [2024-06-15 14:49:15,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 44236.6, 300 sec: 44542.2). Total num frames: 524550144. Throughput: 0: 10683.7. Samples: 131181568. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:49:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 524582912. Throughput: 0: 10820.3. Samples: 131225088. 
Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:49:22,102][1653645] Updated weights for policy 0, policy_version 256195 (0.0070) [2024-06-15 14:49:22,405][1651596] Signal inference workers to stop experience collection... (13250 times) [2024-06-15 14:49:22,464][1653645] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-06-15 14:49:22,656][1651596] Signal inference workers to resume experience collection... (13250 times) [2024-06-15 14:49:22,657][1653645] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-06-15 14:49:24,079][1653645] Updated weights for policy 0, policy_version 256276 (0.0110) [2024-06-15 14:49:25,117][1653645] Updated weights for policy 0, policy_version 256316 (0.0014) [2024-06-15 14:49:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 524943360. Throughput: 0: 11161.6. Samples: 131286528. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:49:26,817][1653645] Updated weights for policy 0, policy_version 256356 (0.0013) [2024-06-15 14:49:30,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 525074432. Throughput: 0: 10983.6. Samples: 131359232. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:49:32,408][1653645] Updated weights for policy 0, policy_version 256400 (0.0021) [2024-06-15 14:49:35,038][1653645] Updated weights for policy 0, policy_version 256512 (0.0015) [2024-06-15 14:49:35,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 525402112. Throughput: 0: 11264.0. Samples: 131393536. 
Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:35,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:49:36,620][1653645] Updated weights for policy 0, policy_version 256572 (0.0014) [2024-06-15 14:49:39,594][1653645] Updated weights for policy 0, policy_version 256632 (0.0025) [2024-06-15 14:49:40,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 525598720. Throughput: 0: 10752.0. Samples: 131446784. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:49:45,571][1653645] Updated weights for policy 0, policy_version 256691 (0.0043) [2024-06-15 14:49:45,958][1648982] Fps is (10 sec: 32768.5, 60 sec: 44249.1, 300 sec: 43986.9). Total num frames: 525729792. Throughput: 0: 11172.9. Samples: 131518464. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:49:47,406][1653645] Updated weights for policy 0, policy_version 256762 (0.0099) [2024-06-15 14:49:48,794][1653645] Updated weights for policy 0, policy_version 256816 (0.0011) [2024-06-15 14:49:50,578][1653645] Updated weights for policy 0, policy_version 256848 (0.0012) [2024-06-15 14:49:50,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 43690.4, 300 sec: 44097.9). Total num frames: 526024704. Throughput: 0: 11036.4. Samples: 131542016. Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:49:55,958][1648982] Fps is (10 sec: 39319.7, 60 sec: 43690.3, 300 sec: 43875.7). Total num frames: 526123008. Throughput: 0: 10979.4. Samples: 131618816. 
Policy #0 lag: (min: 47.0, avg: 162.5, max: 319.0) [2024-06-15 14:49:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:49:56,615][1653645] Updated weights for policy 0, policy_version 256912 (0.0013) [2024-06-15 14:49:58,390][1653645] Updated weights for policy 0, policy_version 256976 (0.0017) [2024-06-15 14:50:00,799][1653645] Updated weights for policy 0, policy_version 257058 (0.0019) [2024-06-15 14:50:00,958][1648982] Fps is (10 sec: 42600.1, 60 sec: 43690.6, 300 sec: 44209.1). Total num frames: 526450688. Throughput: 0: 10968.2. Samples: 131675136. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:00,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:50:01,353][1653645] Updated weights for policy 0, policy_version 257088 (0.0011) [2024-06-15 14:50:03,888][1653645] Updated weights for policy 0, policy_version 257152 (0.0014) [2024-06-15 14:50:05,958][1648982] Fps is (10 sec: 52431.5, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 526647296. Throughput: 0: 10763.4. Samples: 131709440. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:50:08,556][1651596] Signal inference workers to stop experience collection... (13300 times) [2024-06-15 14:50:08,625][1653645] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-06-15 14:50:08,826][1651596] Signal inference workers to resume experience collection... (13300 times) [2024-06-15 14:50:08,827][1653645] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-06-15 14:50:10,058][1653645] Updated weights for policy 0, policy_version 257203 (0.0012) [2024-06-15 14:50:10,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 526811136. Throughput: 0: 10945.5. Samples: 131779072. 
Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:50:11,854][1653645] Updated weights for policy 0, policy_version 257280 (0.0106) [2024-06-15 14:50:13,376][1653645] Updated weights for policy 0, policy_version 257337 (0.0012) [2024-06-15 14:50:15,743][1653645] Updated weights for policy 0, policy_version 257404 (0.0014) [2024-06-15 14:50:15,986][1648982] Fps is (10 sec: 52279.7, 60 sec: 43670.1, 300 sec: 44426.9). Total num frames: 527171584. Throughput: 0: 10563.3. Samples: 131834880. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:15,987][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:50:20,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 527171584. Throughput: 0: 10649.7. Samples: 131872768. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:50:22,350][1653645] Updated weights for policy 0, policy_version 257464 (0.0035) [2024-06-15 14:50:23,573][1653645] Updated weights for policy 0, policy_version 257520 (0.0013) [2024-06-15 14:50:24,982][1653645] Updated weights for policy 0, policy_version 257571 (0.0085) [2024-06-15 14:50:25,958][1648982] Fps is (10 sec: 39432.8, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 527564800. Throughput: 0: 10979.5. Samples: 131940864. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:25,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:50:27,060][1653645] Updated weights for policy 0, policy_version 257648 (0.0013) [2024-06-15 14:50:30,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 527695872. Throughput: 0: 10808.9. Samples: 132004864. 
Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:50:33,575][1653645] Updated weights for policy 0, policy_version 257696 (0.0015) [2024-06-15 14:50:35,368][1653645] Updated weights for policy 0, policy_version 257760 (0.0013) [2024-06-15 14:50:35,959][1648982] Fps is (10 sec: 36042.2, 60 sec: 42051.7, 300 sec: 43875.6). Total num frames: 527925248. Throughput: 0: 11127.3. Samples: 132042752. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:50:36,914][1653645] Updated weights for policy 0, policy_version 257825 (0.0012) [2024-06-15 14:50:38,796][1653645] Updated weights for policy 0, policy_version 257910 (0.0012) [2024-06-15 14:50:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 528220160. Throughput: 0: 10570.1. Samples: 132094464. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:50:45,958][1648982] Fps is (10 sec: 32771.5, 60 sec: 42052.3, 300 sec: 43542.6). Total num frames: 528252928. Throughput: 0: 11013.7. Samples: 132170752. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:45,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:50:46,399][1653645] Updated weights for policy 0, policy_version 257959 (0.0119) [2024-06-15 14:50:47,828][1653645] Updated weights for policy 0, policy_version 258018 (0.0014) [2024-06-15 14:50:49,020][1651596] Signal inference workers to stop experience collection... (13350 times) [2024-06-15 14:50:49,080][1653645] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-06-15 14:50:49,296][1651596] Signal inference workers to resume experience collection... 
(13350 times) [2024-06-15 14:50:49,298][1653645] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-06-15 14:50:49,772][1653645] Updated weights for policy 0, policy_version 258096 (0.0012) [2024-06-15 14:50:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43691.0, 300 sec: 44098.0). Total num frames: 528646144. Throughput: 0: 10877.2. Samples: 132198912. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:50:55,958][1648982] Fps is (10 sec: 49149.9, 60 sec: 43690.7, 300 sec: 43875.7). Total num frames: 528744448. Throughput: 0: 10797.4. Samples: 132264960. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:50:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:50:55,974][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000258176_528744448.pth... [2024-06-15 14:50:56,055][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000253056_518258688.pth [2024-06-15 14:50:56,060][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000258176_528744448.pth [2024-06-15 14:50:57,536][1653645] Updated weights for policy 0, policy_version 258179 (0.0046) [2024-06-15 14:50:59,088][1653645] Updated weights for policy 0, policy_version 258256 (0.0013) [2024-06-15 14:51:00,761][1653645] Updated weights for policy 0, policy_version 258320 (0.0012) [2024-06-15 14:51:00,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43144.4, 300 sec: 43875.8). Total num frames: 529039360. Throughput: 0: 10986.5. Samples: 132328960. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:51:00,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:51:03,157][1653645] Updated weights for policy 0, policy_version 258423 (0.0014) [2024-06-15 14:51:05,958][1648982] Fps is (10 sec: 52430.9, 60 sec: 43690.7, 300 sec: 44098.0). 
Total num frames: 529268736. Throughput: 0: 10626.8. Samples: 132350976. Policy #0 lag: (min: 95.0, avg: 164.4, max: 367.0) [2024-06-15 14:51:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:51:10,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 42598.3, 300 sec: 43431.5). Total num frames: 529367040. Throughput: 0: 10945.5. Samples: 132433408. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:51:11,226][1653645] Updated weights for policy 0, policy_version 258496 (0.0013) [2024-06-15 14:51:13,212][1653645] Updated weights for policy 0, policy_version 258576 (0.0235) [2024-06-15 14:51:14,549][1653645] Updated weights for policy 0, policy_version 258626 (0.0013) [2024-06-15 14:51:15,796][1653645] Updated weights for policy 0, policy_version 258679 (0.0014) [2024-06-15 14:51:15,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43711.5, 300 sec: 44431.2). Total num frames: 529793024. Throughput: 0: 10558.6. Samples: 132480000. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:51:20,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 529793024. Throughput: 0: 10536.1. Samples: 132516864. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:51:23,151][1653645] Updated weights for policy 0, policy_version 258724 (0.0012) [2024-06-15 14:51:25,457][1653645] Updated weights for policy 0, policy_version 258832 (0.0012) [2024-06-15 14:51:25,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 42598.6, 300 sec: 43764.7). Total num frames: 530120704. Throughput: 0: 10945.4. Samples: 132587008. 
Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:51:27,503][1651596] Signal inference workers to stop experience collection... (13400 times) [2024-06-15 14:51:27,576][1653645] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-06-15 14:51:27,749][1651596] Signal inference workers to resume experience collection... (13400 times) [2024-06-15 14:51:27,749][1653645] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-06-15 14:51:27,751][1653645] Updated weights for policy 0, policy_version 258928 (0.0113) [2024-06-15 14:51:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 530317312. Throughput: 0: 10501.7. Samples: 132643328. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:30,960][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:51:35,735][1653645] Updated weights for policy 0, policy_version 258994 (0.0014) [2024-06-15 14:51:35,958][1648982] Fps is (10 sec: 29491.0, 60 sec: 41506.8, 300 sec: 43431.6). Total num frames: 530415616. Throughput: 0: 10843.0. Samples: 132686848. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:51:37,423][1653645] Updated weights for policy 0, policy_version 259072 (0.0110) [2024-06-15 14:51:40,122][1653645] Updated weights for policy 0, policy_version 259184 (0.0014) [2024-06-15 14:51:40,958][1648982] Fps is (10 sec: 52425.8, 60 sec: 43690.3, 300 sec: 44320.1). Total num frames: 530841600. Throughput: 0: 10422.0. Samples: 132733952. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:51:45,970][1648982] Fps is (10 sec: 42545.6, 60 sec: 43135.6, 300 sec: 43096.5). Total num frames: 530841600. Throughput: 0: 10715.0. Samples: 132811264. 
Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:45,971][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:51:49,664][1653645] Updated weights for policy 0, policy_version 259305 (0.0196) [2024-06-15 14:51:50,958][1648982] Fps is (10 sec: 32769.5, 60 sec: 42052.2, 300 sec: 43764.8). Total num frames: 531169280. Throughput: 0: 10888.5. Samples: 132840960. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:51:51,360][1653645] Updated weights for policy 0, policy_version 259382 (0.0093) [2024-06-15 14:51:52,684][1653645] Updated weights for policy 0, policy_version 259448 (0.0012) [2024-06-15 14:51:55,958][1648982] Fps is (10 sec: 52493.1, 60 sec: 43690.8, 300 sec: 43542.8). Total num frames: 531365888. Throughput: 0: 10331.0. Samples: 132898304. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:51:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:52:00,144][1653645] Updated weights for policy 0, policy_version 259506 (0.0125) [2024-06-15 14:52:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 41506.3, 300 sec: 43209.3). Total num frames: 531529728. Throughput: 0: 10934.0. Samples: 132972032. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:52:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:52:01,259][1653645] Updated weights for policy 0, policy_version 259555 (0.0013) [2024-06-15 14:52:02,675][1653645] Updated weights for policy 0, policy_version 259619 (0.0091) [2024-06-15 14:52:04,750][1653645] Updated weights for policy 0, policy_version 259696 (0.0013) [2024-06-15 14:52:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 531890176. Throughput: 0: 10717.8. Samples: 132999168. 
Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:52:05,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:52:10,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 42052.4, 300 sec: 43098.2). Total num frames: 531890176. Throughput: 0: 10763.4. Samples: 133071360. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:52:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:52:11,929][1653645] Updated weights for policy 0, policy_version 259754 (0.0013) [2024-06-15 14:52:12,567][1651596] Signal inference workers to stop experience collection... (13450 times) [2024-06-15 14:52:12,654][1653645] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-06-15 14:52:12,781][1651596] Signal inference workers to resume experience collection... (13450 times) [2024-06-15 14:52:12,786][1653645] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-06-15 14:52:13,472][1653645] Updated weights for policy 0, policy_version 259824 (0.0014) [2024-06-15 14:52:15,234][1653645] Updated weights for policy 0, policy_version 259889 (0.0103) [2024-06-15 14:52:15,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 42052.3, 300 sec: 44098.0). Total num frames: 532316160. Throughput: 0: 10877.2. Samples: 133132800. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:52:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:52:16,750][1653645] Updated weights for policy 0, policy_version 259952 (0.0075) [2024-06-15 14:52:20,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.5, 300 sec: 43431.7). Total num frames: 532414464. Throughput: 0: 10558.5. Samples: 133161984. Policy #0 lag: (min: 31.0, avg: 82.8, max: 287.0) [2024-06-15 14:52:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:52:25,193][1653645] Updated weights for policy 0, policy_version 260032 (0.0051) [2024-06-15 14:52:25,958][1648982] Fps is (10 sec: 26214.0, 60 sec: 40959.9, 300 sec: 43209.3). 
Total num frames: 532578304. Throughput: 0: 11207.2. Samples: 133238272. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:52:25,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:52:26,536][1653645] Updated weights for policy 0, policy_version 260085 (0.0011) [2024-06-15 14:52:28,207][1653645] Updated weights for policy 0, policy_version 260160 (0.0012) [2024-06-15 14:52:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.4, 300 sec: 43875.7). Total num frames: 532938752. Throughput: 0: 10561.4. Samples: 133286400. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:52:30,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:52:35,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 42052.1, 300 sec: 42765.0). Total num frames: 532938752. Throughput: 0: 10763.3. Samples: 133325312. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:52:35,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:52:36,366][1653645] Updated weights for policy 0, policy_version 260226 (0.0015) [2024-06-15 14:52:38,863][1653645] Updated weights for policy 0, policy_version 260320 (0.0013) [2024-06-15 14:52:40,515][1653645] Updated weights for policy 0, policy_version 260371 (0.0162) [2024-06-15 14:52:40,958][1648982] Fps is (10 sec: 32769.1, 60 sec: 40414.2, 300 sec: 43542.6). Total num frames: 533266432. Throughput: 0: 10820.3. Samples: 133385216. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:52:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:52:42,548][1653645] Updated weights for policy 0, policy_version 260464 (0.0106) [2024-06-15 14:52:45,974][1648982] Fps is (10 sec: 52343.3, 60 sec: 43687.6, 300 sec: 43095.8). Total num frames: 533463040. Throughput: 0: 10486.4. Samples: 133444096. 
Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:52:45,975][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:52:49,865][1653645] Updated weights for policy 0, policy_version 260512 (0.0026) [2024-06-15 14:52:50,957][1648982] Fps is (10 sec: 32768.3, 60 sec: 40414.0, 300 sec: 43098.3). Total num frames: 533594112. Throughput: 0: 10717.9. Samples: 133481472. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:52:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:52:51,901][1653645] Updated weights for policy 0, policy_version 260578 (0.0011) [2024-06-15 14:52:53,098][1651596] Signal inference workers to stop experience collection... (13500 times) [2024-06-15 14:52:53,177][1653645] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-06-15 14:52:53,339][1651596] Signal inference workers to resume experience collection... (13500 times) [2024-06-15 14:52:53,340][1653645] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-06-15 14:52:53,852][1653645] Updated weights for policy 0, policy_version 260662 (0.0111) [2024-06-15 14:52:55,958][1648982] Fps is (10 sec: 52516.5, 60 sec: 43690.8, 300 sec: 43320.4). Total num frames: 533987328. Throughput: 0: 10240.0. Samples: 133532160. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:52:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:52:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000260736_533987328.pth... [2024-06-15 14:52:56,026][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000255616_523501568.pth [2024-06-15 14:53:00,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 42653.9). Total num frames: 533987328. Throughput: 0: 10456.2. Samples: 133603328. 
Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:53:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:53:01,968][1653645] Updated weights for policy 0, policy_version 260753 (0.0108) [2024-06-15 14:53:03,503][1653645] Updated weights for policy 0, policy_version 260816 (0.0013) [2024-06-15 14:53:05,689][1653645] Updated weights for policy 0, policy_version 260912 (0.0086) [2024-06-15 14:53:05,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 40960.1, 300 sec: 43320.4). Total num frames: 534347776. Throughput: 0: 10513.1. Samples: 133635072. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:53:05,961][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:53:07,000][1653645] Updated weights for policy 0, policy_version 260977 (0.0085) [2024-06-15 14:53:10,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.5, 300 sec: 42765.0). Total num frames: 534511616. Throughput: 0: 10114.8. Samples: 133693440. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:53:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 14:53:14,588][1653645] Updated weights for policy 0, policy_version 261047 (0.0014) [2024-06-15 14:53:15,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 39321.5, 300 sec: 43098.2). Total num frames: 534675456. Throughput: 0: 10615.5. Samples: 133764096. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:53:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:53:16,296][1653645] Updated weights for policy 0, policy_version 261092 (0.0046) [2024-06-15 14:53:18,148][1653645] Updated weights for policy 0, policy_version 261168 (0.0012) [2024-06-15 14:53:19,720][1653645] Updated weights for policy 0, policy_version 261238 (0.0011) [2024-06-15 14:53:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 535035904. Throughput: 0: 10331.1. Samples: 133790208. 
Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:53:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:53:25,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 535101440. Throughput: 0: 10615.5. Samples: 133862912. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:53:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:53:26,165][1653645] Updated weights for policy 0, policy_version 261296 (0.0013) [2024-06-15 14:53:27,612][1653645] Updated weights for policy 0, policy_version 261344 (0.0013) [2024-06-15 14:53:29,315][1653645] Updated weights for policy 0, policy_version 261410 (0.0014) [2024-06-15 14:53:30,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 43209.3). Total num frames: 535494656. Throughput: 0: 10653.5. Samples: 133923328. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:53:30,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:53:31,312][1653645] Updated weights for policy 0, policy_version 261495 (0.0014) [2024-06-15 14:53:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43691.0, 300 sec: 42654.0). Total num frames: 535560192. Throughput: 0: 10626.8. Samples: 133959680. Policy #0 lag: (min: 15.0, avg: 69.8, max: 271.0) [2024-06-15 14:53:35,958][1648982] Avg episode reward: [(0, '36.910')] [2024-06-15 14:53:37,544][1651596] Signal inference workers to stop experience collection... (13550 times) [2024-06-15 14:53:37,600][1653645] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-06-15 14:53:37,853][1651596] Signal inference workers to resume experience collection... 
(13550 times) [2024-06-15 14:53:37,854][1653645] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-06-15 14:53:38,067][1653645] Updated weights for policy 0, policy_version 261562 (0.0033) [2024-06-15 14:53:39,391][1653645] Updated weights for policy 0, policy_version 261618 (0.0125) [2024-06-15 14:53:40,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.8, 300 sec: 43545.0). Total num frames: 535920640. Throughput: 0: 11173.0. Samples: 134034944. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:53:40,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 14:53:41,169][1653645] Updated weights for policy 0, policy_version 261698 (0.0013) [2024-06-15 14:53:42,468][1653645] Updated weights for policy 0, policy_version 261758 (0.0012) [2024-06-15 14:53:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43702.8, 300 sec: 42987.2). Total num frames: 536084480. Throughput: 0: 10968.2. Samples: 134096896. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:53:45,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 14:53:49,918][1653645] Updated weights for policy 0, policy_version 261824 (0.0012) [2024-06-15 14:53:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45328.9, 300 sec: 43431.5). Total num frames: 536313856. Throughput: 0: 11241.2. Samples: 134140928. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:53:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 14:53:51,458][1653645] Updated weights for policy 0, policy_version 261892 (0.0013) [2024-06-15 14:53:53,637][1653645] Updated weights for policy 0, policy_version 261984 (0.0013) [2024-06-15 14:53:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 536608768. Throughput: 0: 11161.6. Samples: 134195712. 
Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:53:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:54:00,482][1653645] Updated weights for policy 0, policy_version 262035 (0.0022) [2024-06-15 14:54:00,960][1648982] Fps is (10 sec: 36044.8, 60 sec: 44782.9, 300 sec: 42876.1). Total num frames: 536674304. Throughput: 0: 11389.2. Samples: 134276608. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:00,961][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:54:02,040][1653645] Updated weights for policy 0, policy_version 262112 (0.0013) [2024-06-15 14:54:04,301][1653645] Updated weights for policy 0, policy_version 262195 (0.0029) [2024-06-15 14:54:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 43764.7). Total num frames: 537067520. Throughput: 0: 11548.5. Samples: 134309888. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:54:06,288][1653645] Updated weights for policy 0, policy_version 262269 (0.0025) [2024-06-15 14:54:10,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 537133056. Throughput: 0: 11332.3. Samples: 134372864. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:10,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:54:12,591][1653645] Updated weights for policy 0, policy_version 262324 (0.0120) [2024-06-15 14:54:13,776][1653645] Updated weights for policy 0, policy_version 262390 (0.0012) [2024-06-15 14:54:14,020][1651596] Signal inference workers to stop experience collection... (13600 times) [2024-06-15 14:54:14,080][1653645] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-06-15 14:54:14,287][1651596] Signal inference workers to resume experience collection... 
(13600 times) [2024-06-15 14:54:14,288][1653645] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-06-15 14:54:15,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 48059.8, 300 sec: 43986.9). Total num frames: 537559040. Throughput: 0: 11525.7. Samples: 134441984. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:15,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:54:16,636][1653645] Updated weights for policy 0, policy_version 262498 (0.0012) [2024-06-15 14:54:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 537657344. Throughput: 0: 11286.8. Samples: 134467584. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:20,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 14:54:23,646][1653645] Updated weights for policy 0, policy_version 262548 (0.0014) [2024-06-15 14:54:25,138][1653645] Updated weights for policy 0, policy_version 262624 (0.0012) [2024-06-15 14:54:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 46967.4, 300 sec: 43542.5). Total num frames: 537919488. Throughput: 0: 11377.8. Samples: 134546944. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:54:26,685][1653645] Updated weights for policy 0, policy_version 262688 (0.0011) [2024-06-15 14:54:29,137][1653645] Updated weights for policy 0, policy_version 262782 (0.0218) [2024-06-15 14:54:30,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44783.0, 300 sec: 43320.4). Total num frames: 538181632. Throughput: 0: 11150.2. Samples: 134598656. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:54:35,958][1648982] Fps is (10 sec: 32768.6, 60 sec: 44783.0, 300 sec: 42876.1). Total num frames: 538247168. Throughput: 0: 11013.7. Samples: 134636544. 
Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:54:36,563][1653645] Updated weights for policy 0, policy_version 262835 (0.0108) [2024-06-15 14:54:37,723][1653645] Updated weights for policy 0, policy_version 262885 (0.0013) [2024-06-15 14:54:40,326][1653645] Updated weights for policy 0, policy_version 262976 (0.0013) [2024-06-15 14:54:40,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 44782.7, 300 sec: 43653.6). Total num frames: 538607616. Throughput: 0: 11184.3. Samples: 134699008. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:40,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:54:42,063][1653645] Updated weights for policy 0, policy_version 263038 (0.0015) [2024-06-15 14:54:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 538705920. Throughput: 0: 10808.9. Samples: 134763008. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 14:54:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:54:49,356][1653645] Updated weights for policy 0, policy_version 263120 (0.0014) [2024-06-15 14:54:50,958][1648982] Fps is (10 sec: 36045.8, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 538968064. Throughput: 0: 11025.1. Samples: 134806016. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:54:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:54:52,210][1653645] Updated weights for policy 0, policy_version 263205 (0.0177) [2024-06-15 14:54:53,302][1651596] Signal inference workers to stop experience collection... (13650 times) [2024-06-15 14:54:53,345][1653645] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-06-15 14:54:53,520][1651596] Signal inference workers to resume experience collection... 
(13650 times) [2024-06-15 14:54:53,521][1653645] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-06-15 14:54:53,738][1653645] Updated weights for policy 0, policy_version 263270 (0.0013) [2024-06-15 14:54:55,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 539230208. Throughput: 0: 10581.3. Samples: 134849024. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:54:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:54:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000263296_539230208.pth... [2024-06-15 14:54:56,068][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000258176_528744448.pth [2024-06-15 14:55:00,698][1653645] Updated weights for policy 0, policy_version 263328 (0.0015) [2024-06-15 14:55:00,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 539295744. Throughput: 0: 10843.0. Samples: 134929920. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:55:03,059][1653645] Updated weights for policy 0, policy_version 263417 (0.0013) [2024-06-15 14:55:04,310][1653645] Updated weights for policy 0, policy_version 263444 (0.0012) [2024-06-15 14:55:05,547][1653645] Updated weights for policy 0, policy_version 263504 (0.0012) [2024-06-15 14:55:05,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 539688960. Throughput: 0: 10695.1. Samples: 134948864. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:55:10,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.6, 300 sec: 42658.1). Total num frames: 539754496. Throughput: 0: 10524.5. Samples: 135020544. 
Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:55:11,683][1653645] Updated weights for policy 0, policy_version 263553 (0.0015) [2024-06-15 14:55:13,102][1653645] Updated weights for policy 0, policy_version 263613 (0.0012) [2024-06-15 14:55:14,699][1653645] Updated weights for policy 0, policy_version 263674 (0.0012) [2024-06-15 14:55:15,962][1648982] Fps is (10 sec: 39303.8, 60 sec: 42049.1, 300 sec: 43764.0). Total num frames: 540082176. Throughput: 0: 10796.4. Samples: 135084544. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:15,963][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:55:16,103][1653645] Updated weights for policy 0, policy_version 263730 (0.0017) [2024-06-15 14:55:17,369][1653645] Updated weights for policy 0, policy_version 263797 (0.0026) [2024-06-15 14:55:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 43098.3). Total num frames: 540278784. Throughput: 0: 10615.4. Samples: 135114240. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:20,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 14:55:23,724][1653645] Updated weights for policy 0, policy_version 263824 (0.0012) [2024-06-15 14:55:25,020][1653645] Updated weights for policy 0, policy_version 263869 (0.0017) [2024-06-15 14:55:25,958][1648982] Fps is (10 sec: 36060.8, 60 sec: 42052.2, 300 sec: 43209.3). Total num frames: 540442624. Throughput: 0: 10934.1. Samples: 135191040. 
Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:25,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:55:26,826][1653645] Updated weights for policy 0, policy_version 263931 (0.0015) [2024-06-15 14:55:28,009][1653645] Updated weights for policy 0, policy_version 263986 (0.0016) [2024-06-15 14:55:29,654][1653645] Updated weights for policy 0, policy_version 264056 (0.0097) [2024-06-15 14:55:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 43653.8). Total num frames: 540803072. Throughput: 0: 10740.6. Samples: 135246336. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:30,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:55:35,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 540835840. Throughput: 0: 10592.7. Samples: 135282688. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:35,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:55:36,532][1653645] Updated weights for policy 0, policy_version 264112 (0.0015) [2024-06-15 14:55:37,540][1651596] Signal inference workers to stop experience collection... (13700 times) [2024-06-15 14:55:37,603][1653645] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-06-15 14:55:37,879][1651596] Signal inference workers to resume experience collection... (13700 times) [2024-06-15 14:55:37,880][1653645] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-06-15 14:55:37,882][1653645] Updated weights for policy 0, policy_version 264160 (0.0010) [2024-06-15 14:55:39,752][1653645] Updated weights for policy 0, policy_version 264240 (0.0118) [2024-06-15 14:55:40,959][1648982] Fps is (10 sec: 42594.4, 60 sec: 43690.1, 300 sec: 43986.7). Total num frames: 541229056. Throughput: 0: 11059.0. Samples: 135346688. 
Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:55:41,502][1653645] Updated weights for policy 0, policy_version 264318 (0.0042) [2024-06-15 14:55:45,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 43690.4, 300 sec: 42987.1). Total num frames: 541327360. Throughput: 0: 10717.8. Samples: 135412224. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:45,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 14:55:49,512][1653645] Updated weights for policy 0, policy_version 264369 (0.0028) [2024-06-15 14:55:50,958][1648982] Fps is (10 sec: 32771.2, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 541556736. Throughput: 0: 11207.1. Samples: 135453184. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:55:51,717][1653645] Updated weights for policy 0, policy_version 264466 (0.0183) [2024-06-15 14:55:52,978][1653645] Updated weights for policy 0, policy_version 264528 (0.0012) [2024-06-15 14:55:55,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 541851648. Throughput: 0: 10661.0. Samples: 135500288. Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:55:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:56:00,812][1653645] Updated weights for policy 0, policy_version 264579 (0.0025) [2024-06-15 14:56:00,958][1648982] Fps is (10 sec: 29490.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 541851648. Throughput: 0: 10935.1. Samples: 135576576. 
Policy #0 lag: (min: 31.0, avg: 83.6, max: 271.0) [2024-06-15 14:56:00,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 14:56:02,176][1653645] Updated weights for policy 0, policy_version 264644 (0.0013) [2024-06-15 14:56:03,871][1653645] Updated weights for policy 0, policy_version 264725 (0.0014) [2024-06-15 14:56:05,590][1653645] Updated weights for policy 0, policy_version 264800 (0.0011) [2024-06-15 14:56:05,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 44237.0, 300 sec: 43986.9). Total num frames: 542343168. Throughput: 0: 10922.7. Samples: 135605760. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:56:06,390][1653645] Updated weights for policy 0, policy_version 264832 (0.0012) [2024-06-15 14:56:10,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 542375936. Throughput: 0: 10706.5. Samples: 135672832. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 14:56:13,976][1653645] Updated weights for policy 0, policy_version 264896 (0.0014) [2024-06-15 14:56:15,978][1648982] Fps is (10 sec: 32700.6, 60 sec: 43133.1, 300 sec: 43650.6). Total num frames: 542670848. Throughput: 0: 10883.6. Samples: 135736320. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:15,979][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:56:16,293][1653645] Updated weights for policy 0, policy_version 264992 (0.0233) [2024-06-15 14:56:17,259][1651596] Signal inference workers to stop experience collection... (13750 times) [2024-06-15 14:56:17,334][1653645] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-06-15 14:56:17,525][1651596] Signal inference workers to resume experience collection... 
(13750 times) [2024-06-15 14:56:17,542][1653645] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-06-15 14:56:17,979][1653645] Updated weights for policy 0, policy_version 265056 (0.0011) [2024-06-15 14:56:20,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 542900224. Throughput: 0: 10672.4. Samples: 135762944. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:56:24,840][1653645] Updated weights for policy 0, policy_version 265104 (0.0012) [2024-06-15 14:56:25,932][1653645] Updated weights for policy 0, policy_version 265152 (0.0014) [2024-06-15 14:56:25,958][1648982] Fps is (10 sec: 36118.5, 60 sec: 43144.6, 300 sec: 43098.2). Total num frames: 543031296. Throughput: 0: 10911.5. Samples: 135837696. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:56:28,067][1653645] Updated weights for policy 0, policy_version 265232 (0.0014) [2024-06-15 14:56:29,223][1653645] Updated weights for policy 0, policy_version 265279 (0.0010) [2024-06-15 14:56:30,900][1653645] Updated weights for policy 0, policy_version 265333 (0.0011) [2024-06-15 14:56:30,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 43144.7, 300 sec: 43986.9). Total num frames: 543391744. Throughput: 0: 10718.0. Samples: 135894528. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 14:56:35,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 543424512. Throughput: 0: 10763.4. Samples: 135937536. 
Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:56:36,556][1653645] Updated weights for policy 0, policy_version 265360 (0.0018) [2024-06-15 14:56:38,594][1653645] Updated weights for policy 0, policy_version 265441 (0.0019) [2024-06-15 14:56:40,777][1653645] Updated weights for policy 0, policy_version 265520 (0.0010) [2024-06-15 14:56:40,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 42599.1, 300 sec: 43877.6). Total num frames: 543784960. Throughput: 0: 11059.2. Samples: 135997952. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:40,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:56:42,828][1653645] Updated weights for policy 0, policy_version 265584 (0.0011) [2024-06-15 14:56:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.9, 300 sec: 43320.4). Total num frames: 543948800. Throughput: 0: 10774.8. Samples: 136061440. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 14:56:49,745][1653645] Updated weights for policy 0, policy_version 265635 (0.0013) [2024-06-15 14:56:50,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 42598.5, 300 sec: 43209.4). Total num frames: 544112640. Throughput: 0: 10934.0. Samples: 136097792. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:56:51,545][1653645] Updated weights for policy 0, policy_version 265712 (0.0012) [2024-06-15 14:56:53,124][1653645] Updated weights for policy 0, policy_version 265776 (0.0012) [2024-06-15 14:56:54,847][1653645] Updated weights for policy 0, policy_version 265827 (0.0014) [2024-06-15 14:56:55,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 544473088. Throughput: 0: 10660.9. Samples: 136152576. 
Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:56:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:56:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000265856_544473088.pth... [2024-06-15 14:56:55,994][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000260736_533987328.pth [2024-06-15 14:57:00,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.8, 300 sec: 42654.0). Total num frames: 544473088. Throughput: 0: 10779.7. Samples: 136221184. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:57:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:57:02,057][1653645] Updated weights for policy 0, policy_version 265888 (0.0021) [2024-06-15 14:57:03,294][1651596] Signal inference workers to stop experience collection... (13800 times) [2024-06-15 14:57:03,327][1653645] Updated weights for policy 0, policy_version 265938 (0.0016) [2024-06-15 14:57:03,362][1653645] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-06-15 14:57:03,595][1651596] Signal inference workers to resume experience collection... (13800 times) [2024-06-15 14:57:03,596][1653645] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-06-15 14:57:04,834][1653645] Updated weights for policy 0, policy_version 266000 (0.0010) [2024-06-15 14:57:05,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 41505.9, 300 sec: 43875.7). Total num frames: 544833536. Throughput: 0: 10865.7. Samples: 136251904. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:57:05,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 14:57:06,288][1653645] Updated weights for policy 0, policy_version 266050 (0.0045) [2024-06-15 14:57:07,723][1653645] Updated weights for policy 0, policy_version 266108 (0.0013) [2024-06-15 14:57:10,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 544997376. 
Throughput: 0: 10581.4. Samples: 136313856. Policy #0 lag: (min: 14.0, avg: 58.1, max: 266.0) [2024-06-15 14:57:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 14:57:14,260][1653645] Updated weights for policy 0, policy_version 266152 (0.0013) [2024-06-15 14:57:15,959][1648982] Fps is (10 sec: 36039.4, 60 sec: 42065.4, 300 sec: 43320.2). Total num frames: 545193984. Throughput: 0: 10910.8. Samples: 136385536. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:15,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:57:16,061][1653645] Updated weights for policy 0, policy_version 266224 (0.0012) [2024-06-15 14:57:16,743][1653645] Updated weights for policy 0, policy_version 266246 (0.0013) [2024-06-15 14:57:19,460][1653645] Updated weights for policy 0, policy_version 266336 (0.0110) [2024-06-15 14:57:20,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 545521664. Throughput: 0: 10535.8. Samples: 136411648. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:57:25,958][1648982] Fps is (10 sec: 32773.1, 60 sec: 41506.1, 300 sec: 42654.0). Total num frames: 545521664. Throughput: 0: 10660.9. Samples: 136477696. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:25,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:57:26,415][1653645] Updated weights for policy 0, policy_version 266400 (0.0016) [2024-06-15 14:57:28,564][1653645] Updated weights for policy 0, policy_version 266487 (0.0067) [2024-06-15 14:57:30,141][1653645] Updated weights for policy 0, policy_version 266528 (0.0024) [2024-06-15 14:57:30,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 42052.2, 300 sec: 43986.9). Total num frames: 545914880. Throughput: 0: 10501.7. Samples: 136534016. 
Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:57:31,942][1653645] Updated weights for policy 0, policy_version 266601 (0.0017) [2024-06-15 14:57:35,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 546045952. Throughput: 0: 10319.6. Samples: 136562176. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:57:39,492][1653645] Updated weights for policy 0, policy_version 266675 (0.0018) [2024-06-15 14:57:40,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 40960.0, 300 sec: 43322.9). Total num frames: 546242560. Throughput: 0: 10683.8. Samples: 136633344. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:57:41,153][1653645] Updated weights for policy 0, policy_version 266736 (0.0015) [2024-06-15 14:57:42,586][1653645] Updated weights for policy 0, policy_version 266773 (0.0015) [2024-06-15 14:57:43,689][1651596] Signal inference workers to stop experience collection... (13850 times) [2024-06-15 14:57:43,736][1653645] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-06-15 14:57:43,742][1653645] Updated weights for policy 0, policy_version 266820 (0.0026) [2024-06-15 14:57:43,886][1651596] Signal inference workers to resume experience collection... (13850 times) [2024-06-15 14:57:43,886][1653645] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-06-15 14:57:44,720][1653645] Updated weights for policy 0, policy_version 266880 (0.0014) [2024-06-15 14:57:45,960][1648982] Fps is (10 sec: 52418.5, 60 sec: 43689.2, 300 sec: 43986.6). Total num frames: 546570240. Throughput: 0: 10569.5. Samples: 136696832. 
Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:45,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:57:50,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 546701312. Throughput: 0: 10752.1. Samples: 136735744. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:57:51,568][1653645] Updated weights for policy 0, policy_version 266960 (0.0130) [2024-06-15 14:57:54,407][1653645] Updated weights for policy 0, policy_version 267042 (0.0014) [2024-06-15 14:57:55,672][1653645] Updated weights for policy 0, policy_version 267077 (0.0061) [2024-06-15 14:57:55,958][1648982] Fps is (10 sec: 42606.4, 60 sec: 42052.3, 300 sec: 44097.9). Total num frames: 546996224. Throughput: 0: 10672.3. Samples: 136794112. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:57:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:57:56,738][1653645] Updated weights for policy 0, policy_version 267133 (0.0011) [2024-06-15 14:58:00,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 547094528. Throughput: 0: 10957.2. Samples: 136878592. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:58:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:58:02,198][1653645] Updated weights for policy 0, policy_version 267185 (0.0017) [2024-06-15 14:58:03,802][1653645] Updated weights for policy 0, policy_version 267248 (0.0014) [2024-06-15 14:58:05,591][1653645] Updated weights for policy 0, policy_version 267296 (0.0013) [2024-06-15 14:58:05,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 547454976. Throughput: 0: 10877.2. Samples: 136901120. 
Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:58:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 14:58:06,744][1653645] Updated weights for policy 0, policy_version 267344 (0.0011) [2024-06-15 14:58:07,743][1653645] Updated weights for policy 0, policy_version 267392 (0.0012) [2024-06-15 14:58:10,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 547618816. Throughput: 0: 11104.7. Samples: 136977408. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:58:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:58:14,187][1653645] Updated weights for policy 0, policy_version 267460 (0.0013) [2024-06-15 14:58:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44784.2, 300 sec: 43542.6). Total num frames: 547880960. Throughput: 0: 11207.1. Samples: 137038336. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:58:15,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 14:58:16,738][1653645] Updated weights for policy 0, policy_version 267524 (0.0012) [2024-06-15 14:58:17,780][1653645] Updated weights for policy 0, policy_version 267575 (0.0013) [2024-06-15 14:58:19,286][1653645] Updated weights for policy 0, policy_version 267638 (0.0011) [2024-06-15 14:58:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 548143104. Throughput: 0: 11298.1. Samples: 137070592. Policy #0 lag: (min: 15.0, avg: 80.3, max: 271.0) [2024-06-15 14:58:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 14:58:24,497][1653645] Updated weights for policy 0, policy_version 267696 (0.0014) [2024-06-15 14:58:25,871][1653645] Updated weights for policy 0, policy_version 267744 (0.0012) [2024-06-15 14:58:25,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 46967.6, 300 sec: 43542.6). Total num frames: 548339712. Throughput: 0: 11468.8. Samples: 137149440. 
Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:58:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:58:28,487][1651596] Signal inference workers to stop experience collection... (13900 times) [2024-06-15 14:58:28,544][1653645] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-06-15 14:58:28,726][1651596] Signal inference workers to resume experience collection... (13900 times) [2024-06-15 14:58:28,727][1653645] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-06-15 14:58:28,877][1653645] Updated weights for policy 0, policy_version 267826 (0.0015) [2024-06-15 14:58:30,508][1653645] Updated weights for policy 0, policy_version 267897 (0.0013) [2024-06-15 14:58:30,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 548667392. Throughput: 0: 11366.9. Samples: 137208320. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:58:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:58:35,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 548700160. Throughput: 0: 11377.7. Samples: 137247744. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:58:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 14:58:36,510][1653645] Updated weights for policy 0, policy_version 267962 (0.0014) [2024-06-15 14:58:38,435][1653645] Updated weights for policy 0, policy_version 268029 (0.0016) [2024-06-15 14:58:40,627][1653645] Updated weights for policy 0, policy_version 268101 (0.0013) [2024-06-15 14:58:40,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 47513.6, 300 sec: 44098.0). Total num frames: 549093376. Throughput: 0: 11525.7. Samples: 137312768. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:58:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:58:45,958][1648982] Fps is (10 sec: 49150.7, 60 sec: 43691.9, 300 sec: 43653.6). 
Total num frames: 549191680. Throughput: 0: 11093.3. Samples: 137377792. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:58:45,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:58:47,657][1653645] Updated weights for policy 0, policy_version 268167 (0.0014) [2024-06-15 14:58:49,792][1653645] Updated weights for policy 0, policy_version 268241 (0.0013) [2024-06-15 14:58:50,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45875.1, 300 sec: 43542.6). Total num frames: 549453824. Throughput: 0: 11411.9. Samples: 137414656. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:58:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 14:58:51,359][1653645] Updated weights for policy 0, policy_version 268291 (0.0015) [2024-06-15 14:58:53,229][1653645] Updated weights for policy 0, policy_version 268368 (0.0017) [2024-06-15 14:58:54,369][1653645] Updated weights for policy 0, policy_version 268416 (0.0125) [2024-06-15 14:58:55,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 549715968. Throughput: 0: 11002.3. Samples: 137472512. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:58:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:58:56,003][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000268416_549715968.pth... [2024-06-15 14:58:56,042][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000263296_539230208.pth [2024-06-15 14:58:59,889][1653645] Updated weights for policy 0, policy_version 268479 (0.0156) [2024-06-15 14:59:00,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 46421.4, 300 sec: 43431.5). Total num frames: 549879808. Throughput: 0: 11389.2. Samples: 137550848. 
Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:59:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:59:01,410][1653645] Updated weights for policy 0, policy_version 268531 (0.0016) [2024-06-15 14:59:03,523][1653645] Updated weights for policy 0, policy_version 268599 (0.0032) [2024-06-15 14:59:05,601][1653645] Updated weights for policy 0, policy_version 268666 (0.0011) [2024-06-15 14:59:05,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 46421.2, 300 sec: 44431.2). Total num frames: 550240256. Throughput: 0: 11309.5. Samples: 137579520. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:59:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 14:59:10,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44783.0, 300 sec: 43209.3). Total num frames: 550305792. Throughput: 0: 11252.6. Samples: 137655808. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:59:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 14:59:11,144][1653645] Updated weights for policy 0, policy_version 268730 (0.0013) [2024-06-15 14:59:13,123][1653645] Updated weights for policy 0, policy_version 268800 (0.0112) [2024-06-15 14:59:14,476][1651596] Signal inference workers to stop experience collection... (13950 times) [2024-06-15 14:59:14,504][1653645] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-06-15 14:59:14,796][1651596] Signal inference workers to resume experience collection... (13950 times) [2024-06-15 14:59:14,796][1653645] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-06-15 14:59:15,964][1648982] Fps is (10 sec: 39298.8, 60 sec: 45870.7, 300 sec: 43986.0). Total num frames: 550633472. Throughput: 0: 11319.4. Samples: 137717760. 
Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:59:15,964][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 14:59:16,436][1653645] Updated weights for policy 0, policy_version 268882 (0.0013) [2024-06-15 14:59:17,435][1653645] Updated weights for policy 0, policy_version 268926 (0.0020) [2024-06-15 14:59:20,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 550764544. Throughput: 0: 11184.4. Samples: 137751040. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:59:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:59:22,095][1653645] Updated weights for policy 0, policy_version 268980 (0.0036) [2024-06-15 14:59:23,370][1653645] Updated weights for policy 0, policy_version 269015 (0.0015) [2024-06-15 14:59:25,192][1653645] Updated weights for policy 0, policy_version 269058 (0.0046) [2024-06-15 14:59:25,958][1648982] Fps is (10 sec: 42623.6, 60 sec: 45329.1, 300 sec: 43653.6). Total num frames: 551059456. Throughput: 0: 11423.3. Samples: 137826816. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:59:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 14:59:27,174][1653645] Updated weights for policy 0, policy_version 269121 (0.0015) [2024-06-15 14:59:28,715][1653645] Updated weights for policy 0, policy_version 269182 (0.0032) [2024-06-15 14:59:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 551288832. Throughput: 0: 11412.0. Samples: 137891328. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:59:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 14:59:33,824][1653645] Updated weights for policy 0, policy_version 269232 (0.0033) [2024-06-15 14:59:35,020][1653645] Updated weights for policy 0, policy_version 269265 (0.0012) [2024-06-15 14:59:35,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 47513.6, 300 sec: 43875.8). Total num frames: 551550976. 
Throughput: 0: 11423.3. Samples: 137928704. Policy #0 lag: (min: 7.0, avg: 125.7, max: 327.0) [2024-06-15 14:59:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 14:59:38,048][1653645] Updated weights for policy 0, policy_version 269346 (0.0031) [2024-06-15 14:59:38,646][1653645] Updated weights for policy 0, policy_version 269376 (0.0012) [2024-06-15 14:59:40,320][1653645] Updated weights for policy 0, policy_version 269433 (0.0039) [2024-06-15 14:59:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 551813120. Throughput: 0: 11548.5. Samples: 137992192. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 14:59:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 14:59:45,156][1653645] Updated weights for policy 0, policy_version 269473 (0.0112) [2024-06-15 14:59:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45875.4, 300 sec: 43986.9). Total num frames: 551944192. Throughput: 0: 11400.5. Samples: 138063872. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 14:59:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 14:59:47,089][1653645] Updated weights for policy 0, policy_version 269552 (0.0013) [2024-06-15 14:59:49,404][1653645] Updated weights for policy 0, policy_version 269616 (0.0024) [2024-06-15 14:59:50,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 46421.1, 300 sec: 44097.9). Total num frames: 552239104. Throughput: 0: 11571.2. Samples: 138100224. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 14:59:50,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 14:59:51,427][1653645] Updated weights for policy 0, policy_version 269665 (0.0012) [2024-06-15 14:59:55,940][1653645] Updated weights for policy 0, policy_version 269701 (0.0013) [2024-06-15 14:59:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 552337408. Throughput: 0: 11275.4. Samples: 138163200. 
Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 14:59:55,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 14:59:56,827][1653645] Updated weights for policy 0, policy_version 269759 (0.0050) [2024-06-15 14:59:58,996][1653645] Updated weights for policy 0, policy_version 269817 (0.0097) [2024-06-15 15:00:00,506][1653645] Updated weights for policy 0, policy_version 269858 (0.0013) [2024-06-15 15:00:00,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 46967.4, 300 sec: 44098.0). Total num frames: 552697856. Throughput: 0: 11538.6. Samples: 138236928. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:00:01,968][1651596] Signal inference workers to stop experience collection... (14000 times) [2024-06-15 15:00:02,013][1653645] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-06-15 15:00:02,282][1651596] Signal inference workers to resume experience collection... (14000 times) [2024-06-15 15:00:02,283][1653645] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-06-15 15:00:03,207][1653645] Updated weights for policy 0, policy_version 269936 (0.0038) [2024-06-15 15:00:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 552861696. Throughput: 0: 11411.9. Samples: 138264576. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:00:07,028][1653645] Updated weights for policy 0, policy_version 269959 (0.0010) [2024-06-15 15:00:09,759][1653645] Updated weights for policy 0, policy_version 270023 (0.0015) [2024-06-15 15:00:10,666][1653645] Updated weights for policy 0, policy_version 270080 (0.0012) [2024-06-15 15:00:10,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 46967.3, 300 sec: 44209.7). Total num frames: 553123840. Throughput: 0: 11423.2. Samples: 138340864. 
Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:00:13,730][1653645] Updated weights for policy 0, policy_version 270147 (0.0015) [2024-06-15 15:00:15,365][1653645] Updated weights for policy 0, policy_version 270208 (0.0010) [2024-06-15 15:00:15,958][1648982] Fps is (10 sec: 52425.1, 60 sec: 45879.3, 300 sec: 44431.1). Total num frames: 553385984. Throughput: 0: 11343.5. Samples: 138401792. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:15,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:00:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45875.0, 300 sec: 44320.1). Total num frames: 553517056. Throughput: 0: 11332.2. Samples: 138438656. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:20,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:00:21,657][1653645] Updated weights for policy 0, policy_version 270288 (0.0012) [2024-06-15 15:00:22,775][1653645] Updated weights for policy 0, policy_version 270336 (0.0012) [2024-06-15 15:00:24,420][1653645] Updated weights for policy 0, policy_version 270391 (0.0012) [2024-06-15 15:00:25,592][1653645] Updated weights for policy 0, policy_version 270419 (0.0019) [2024-06-15 15:00:25,958][1648982] Fps is (10 sec: 45878.1, 60 sec: 46421.3, 300 sec: 44209.0). Total num frames: 553844736. Throughput: 0: 11389.2. Samples: 138504704. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:00:26,649][1653645] Updated weights for policy 0, policy_version 270459 (0.0011) [2024-06-15 15:00:30,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 554041344. Throughput: 0: 11332.3. Samples: 138573824. 
Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:00:33,146][1653645] Updated weights for policy 0, policy_version 270544 (0.0014) [2024-06-15 15:00:34,314][1653645] Updated weights for policy 0, policy_version 270592 (0.0012) [2024-06-15 15:00:35,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45328.9, 300 sec: 44209.2). Total num frames: 554270720. Throughput: 0: 11400.6. Samples: 138613248. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:00:36,063][1653645] Updated weights for policy 0, policy_version 270648 (0.0022) [2024-06-15 15:00:38,603][1653645] Updated weights for policy 0, policy_version 270704 (0.0015) [2024-06-15 15:00:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 554434560. Throughput: 0: 11298.1. Samples: 138671616. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:00:44,836][1653645] Updated weights for policy 0, policy_version 270800 (0.0013) [2024-06-15 15:00:45,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 554696704. Throughput: 0: 11161.6. Samples: 138739200. Policy #0 lag: (min: 15.0, avg: 130.3, max: 271.0) [2024-06-15 15:00:45,960][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:00:47,468][1653645] Updated weights for policy 0, policy_version 270864 (0.0014) [2024-06-15 15:00:48,410][1653645] Updated weights for policy 0, policy_version 270903 (0.0011) [2024-06-15 15:00:50,046][1651596] Signal inference workers to stop experience collection... (14050 times) [2024-06-15 15:00:50,151][1653645] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-06-15 15:00:50,321][1651596] Signal inference workers to resume experience collection... 
(14050 times) [2024-06-15 15:00:50,323][1653645] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-06-15 15:00:50,326][1653645] Updated weights for policy 0, policy_version 270960 (0.0147) [2024-06-15 15:00:50,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45329.3, 300 sec: 44431.2). Total num frames: 554958848. Throughput: 0: 11286.8. Samples: 138772480. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:00:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:00:54,056][1653645] Updated weights for policy 0, policy_version 271024 (0.0012) [2024-06-15 15:00:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 555089920. Throughput: 0: 11013.8. Samples: 138836480. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:00:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:00:55,975][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000271040_555089920.pth... [2024-06-15 15:00:56,049][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000265856_544473088.pth [2024-06-15 15:00:57,631][1653645] Updated weights for policy 0, policy_version 271088 (0.0012) [2024-06-15 15:01:00,142][1653645] Updated weights for policy 0, policy_version 271157 (0.0017) [2024-06-15 15:01:00,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 555352064. Throughput: 0: 11184.5. Samples: 138905088. 
Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:01:01,563][1653645] Updated weights for policy 0, policy_version 271200 (0.0012) [2024-06-15 15:01:02,314][1653645] Updated weights for policy 0, policy_version 271230 (0.0010) [2024-06-15 15:01:05,220][1653645] Updated weights for policy 0, policy_version 271280 (0.0014) [2024-06-15 15:01:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 555614208. Throughput: 0: 11161.7. Samples: 138940928. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:05,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 15:01:08,739][1653645] Updated weights for policy 0, policy_version 271332 (0.0014) [2024-06-15 15:01:10,714][1653645] Updated weights for policy 0, policy_version 271379 (0.0013) [2024-06-15 15:01:10,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.1, 300 sec: 44545.4). Total num frames: 555810816. Throughput: 0: 11264.0. Samples: 139011584. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:01:11,606][1653645] Updated weights for policy 0, policy_version 271418 (0.0011) [2024-06-15 15:01:12,555][1653645] Updated weights for policy 0, policy_version 271442 (0.0011) [2024-06-15 15:01:13,641][1653645] Updated weights for policy 0, policy_version 271484 (0.0033) [2024-06-15 15:01:15,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 44237.2, 300 sec: 44542.2). Total num frames: 556040192. Throughput: 0: 11298.1. Samples: 139082240. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:15,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 15:01:19,569][1653645] Updated weights for policy 0, policy_version 271553 (0.0038) [2024-06-15 15:01:20,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 556269568. 
Throughput: 0: 11161.6. Samples: 139115520. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:20,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:01:22,292][1653645] Updated weights for policy 0, policy_version 271620 (0.0119) [2024-06-15 15:01:23,747][1653645] Updated weights for policy 0, policy_version 271676 (0.0019) [2024-06-15 15:01:25,632][1653645] Updated weights for policy 0, policy_version 271736 (0.0015) [2024-06-15 15:01:25,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 44783.0, 300 sec: 44542.2). Total num frames: 556531712. Throughput: 0: 11275.4. Samples: 139179008. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:01:29,130][1653645] Updated weights for policy 0, policy_version 271804 (0.0017) [2024-06-15 15:01:30,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 556662784. Throughput: 0: 11173.0. Samples: 139241984. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:01:32,427][1653645] Updated weights for policy 0, policy_version 271856 (0.0012) [2024-06-15 15:01:34,835][1653645] Updated weights for policy 0, policy_version 271893 (0.0012) [2024-06-15 15:01:35,876][1653645] Updated weights for policy 0, policy_version 271933 (0.0011) [2024-06-15 15:01:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 556924928. Throughput: 0: 11229.8. Samples: 139277824. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:01:37,668][1653645] Updated weights for policy 0, policy_version 271998 (0.0014) [2024-06-15 15:01:40,328][1651596] Signal inference workers to stop experience collection... 
(14100 times) [2024-06-15 15:01:40,367][1653645] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-06-15 15:01:40,633][1651596] Signal inference workers to resume experience collection... (14100 times) [2024-06-15 15:01:40,635][1653645] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-06-15 15:01:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 557121536. Throughput: 0: 11195.7. Samples: 139340288. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:01:41,240][1653645] Updated weights for policy 0, policy_version 272058 (0.0026) [2024-06-15 15:01:44,501][1653645] Updated weights for policy 0, policy_version 272096 (0.0109) [2024-06-15 15:01:45,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 557318144. Throughput: 0: 11082.0. Samples: 139403776. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:01:47,731][1653645] Updated weights for policy 0, policy_version 272176 (0.0133) [2024-06-15 15:01:48,807][1653645] Updated weights for policy 0, policy_version 272208 (0.0014) [2024-06-15 15:01:50,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 557580288. Throughput: 0: 10956.8. Samples: 139433984. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:01:52,774][1653645] Updated weights for policy 0, policy_version 272276 (0.0014) [2024-06-15 15:01:55,884][1653645] Updated weights for policy 0, policy_version 272323 (0.0014) [2024-06-15 15:01:55,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 557711360. Throughput: 0: 10820.3. Samples: 139498496. 
Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:01:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:01:58,486][1653645] Updated weights for policy 0, policy_version 272385 (0.0032) [2024-06-15 15:01:59,823][1653645] Updated weights for policy 0, policy_version 272448 (0.0012) [2024-06-15 15:02:00,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 557973504. Throughput: 0: 10695.2. Samples: 139563520. Policy #0 lag: (min: 15.0, avg: 141.6, max: 271.0) [2024-06-15 15:02:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:02:04,718][1653645] Updated weights for policy 0, policy_version 272516 (0.0119) [2024-06-15 15:02:05,738][1653645] Updated weights for policy 0, policy_version 272576 (0.0013) [2024-06-15 15:02:05,975][1648982] Fps is (10 sec: 52340.2, 60 sec: 43678.4, 300 sec: 44872.9). Total num frames: 558235648. Throughput: 0: 10645.6. Samples: 139594752. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:05,975][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:02:08,672][1653645] Updated weights for policy 0, policy_version 272629 (0.0013) [2024-06-15 15:02:10,646][1653645] Updated weights for policy 0, policy_version 272688 (0.0013) [2024-06-15 15:02:10,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44782.9, 300 sec: 45097.9). Total num frames: 558497792. Throughput: 0: 10911.3. Samples: 139670016. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:02:13,589][1653645] Updated weights for policy 0, policy_version 272760 (0.0014) [2024-06-15 15:02:15,958][1648982] Fps is (10 sec: 39388.3, 60 sec: 43144.7, 300 sec: 44431.2). Total num frames: 558628864. Throughput: 0: 10968.2. Samples: 139735552. 
Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:02:17,295][1653645] Updated weights for policy 0, policy_version 272816 (0.0024) [2024-06-15 15:02:19,210][1653645] Updated weights for policy 0, policy_version 272852 (0.0020) [2024-06-15 15:02:20,151][1653645] Updated weights for policy 0, policy_version 272889 (0.0015) [2024-06-15 15:02:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 558891008. Throughput: 0: 10990.9. Samples: 139772416. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:02:21,439][1653645] Updated weights for policy 0, policy_version 272915 (0.0012) [2024-06-15 15:02:24,657][1653645] Updated weights for policy 0, policy_version 272994 (0.0014) [2024-06-15 15:02:25,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 559153152. Throughput: 0: 11093.3. Samples: 139839488. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:25,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:02:27,988][1653645] Updated weights for policy 0, policy_version 273042 (0.0012) [2024-06-15 15:02:28,845][1653645] Updated weights for policy 0, policy_version 273088 (0.0012) [2024-06-15 15:02:30,676][1651596] Signal inference workers to stop experience collection... (14150 times) [2024-06-15 15:02:30,720][1653645] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-06-15 15:02:30,959][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 559284224. Throughput: 0: 11241.2. Samples: 139909632. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:30,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:02:31,018][1651596] Signal inference workers to resume experience collection... 
(14150 times) [2024-06-15 15:02:31,019][1653645] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-06-15 15:02:31,760][1653645] Updated weights for policy 0, policy_version 273129 (0.0013) [2024-06-15 15:02:32,652][1653645] Updated weights for policy 0, policy_version 273168 (0.0049) [2024-06-15 15:02:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 559546368. Throughput: 0: 11161.6. Samples: 139936256. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:02:36,454][1653645] Updated weights for policy 0, policy_version 273248 (0.0120) [2024-06-15 15:02:39,611][1653645] Updated weights for policy 0, policy_version 273296 (0.0013) [2024-06-15 15:02:40,823][1653645] Updated weights for policy 0, policy_version 273344 (0.0011) [2024-06-15 15:02:40,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44783.0, 300 sec: 44875.8). Total num frames: 559808512. Throughput: 0: 11343.7. Samples: 140008960. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:40,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:02:43,885][1653645] Updated weights for policy 0, policy_version 273408 (0.0025) [2024-06-15 15:02:45,752][1653645] Updated weights for policy 0, policy_version 273464 (0.0013) [2024-06-15 15:02:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 560070656. Throughput: 0: 11275.4. Samples: 140070912. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:02:48,652][1653645] Updated weights for policy 0, policy_version 273520 (0.0013) [2024-06-15 15:02:50,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 560234496. Throughput: 0: 11347.9. Samples: 140105216. 
Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:02:51,837][1653645] Updated weights for policy 0, policy_version 273589 (0.0012) [2024-06-15 15:02:55,101][1653645] Updated weights for policy 0, policy_version 273648 (0.0014) [2024-06-15 15:02:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 560463872. Throughput: 0: 11229.9. Samples: 140175360. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:02:55,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:02:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000273664_560463872.pth... [2024-06-15 15:02:56,037][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000268416_549715968.pth [2024-06-15 15:02:58,373][1653645] Updated weights for policy 0, policy_version 273727 (0.0017) [2024-06-15 15:03:00,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 560693248. Throughput: 0: 11104.7. Samples: 140235264. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:03:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:03:01,102][1653645] Updated weights for policy 0, policy_version 273792 (0.0013) [2024-06-15 15:03:03,940][1653645] Updated weights for policy 0, policy_version 273853 (0.0013) [2024-06-15 15:03:05,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44249.1, 300 sec: 44986.6). Total num frames: 560889856. Throughput: 0: 11104.7. Samples: 140272128. 
Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:03:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:03:06,852][1653645] Updated weights for policy 0, policy_version 273913 (0.0012) [2024-06-15 15:03:10,392][1653645] Updated weights for policy 0, policy_version 273955 (0.0014) [2024-06-15 15:03:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 561119232. Throughput: 0: 11059.2. Samples: 140337152. Policy #0 lag: (min: 6.0, avg: 127.7, max: 262.0) [2024-06-15 15:03:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:03:11,738][1653645] Updated weights for policy 0, policy_version 274000 (0.0028) [2024-06-15 15:03:14,484][1653645] Updated weights for policy 0, policy_version 274064 (0.0014) [2024-06-15 15:03:15,668][1653645] Updated weights for policy 0, policy_version 274107 (0.0012) [2024-06-15 15:03:15,958][1648982] Fps is (10 sec: 49153.0, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 561381376. Throughput: 0: 10990.9. Samples: 140404224. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:03:17,509][1651596] Signal inference workers to stop experience collection... (14200 times) [2024-06-15 15:03:17,596][1653645] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-06-15 15:03:17,852][1651596] Signal inference workers to resume experience collection... (14200 times) [2024-06-15 15:03:17,853][1653645] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-06-15 15:03:18,466][1653645] Updated weights for policy 0, policy_version 274172 (0.0012) [2024-06-15 15:03:20,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 561512448. Throughput: 0: 11161.6. Samples: 140438528. 
Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:20,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:03:22,612][1653645] Updated weights for policy 0, policy_version 274240 (0.0014) [2024-06-15 15:03:24,426][1653645] Updated weights for policy 0, policy_version 274302 (0.0012) [2024-06-15 15:03:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 561774592. Throughput: 0: 10865.8. Samples: 140497920. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:03:27,615][1653645] Updated weights for policy 0, policy_version 274361 (0.0017) [2024-06-15 15:03:29,575][1653645] Updated weights for policy 0, policy_version 274402 (0.0012) [2024-06-15 15:03:30,096][1653645] Updated weights for policy 0, policy_version 274432 (0.0065) [2024-06-15 15:03:30,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 562036736. Throughput: 0: 11207.1. Samples: 140575232. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:03:34,160][1653645] Updated weights for policy 0, policy_version 274496 (0.0140) [2024-06-15 15:03:35,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 562233344. Throughput: 0: 11173.0. Samples: 140608000. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:03:36,401][1653645] Updated weights for policy 0, policy_version 274552 (0.0013) [2024-06-15 15:03:40,548][1653645] Updated weights for policy 0, policy_version 274627 (0.0011) [2024-06-15 15:03:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 562462720. Throughput: 0: 11002.3. Samples: 140670464. 
Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:03:44,965][1653645] Updated weights for policy 0, policy_version 274708 (0.0013) [2024-06-15 15:03:45,779][1653645] Updated weights for policy 0, policy_version 274751 (0.0013) [2024-06-15 15:03:45,962][1648982] Fps is (10 sec: 45854.7, 60 sec: 43687.4, 300 sec: 44874.8). Total num frames: 562692096. Throughput: 0: 11262.8. Samples: 140742144. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:45,963][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:03:48,009][1653645] Updated weights for policy 0, policy_version 274811 (0.0019) [2024-06-15 15:03:50,462][1653645] Updated weights for policy 0, policy_version 274872 (0.0012) [2024-06-15 15:03:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 562954240. Throughput: 0: 11138.9. Samples: 140773376. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:03:53,063][1653645] Updated weights for policy 0, policy_version 274913 (0.0014) [2024-06-15 15:03:55,958][1648982] Fps is (10 sec: 39339.5, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 563085312. Throughput: 0: 11275.4. Samples: 140844544. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:03:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:03:57,223][1653645] Updated weights for policy 0, policy_version 274992 (0.0012) [2024-06-15 15:03:59,778][1653645] Updated weights for policy 0, policy_version 275041 (0.0017) [2024-06-15 15:04:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 563347456. Throughput: 0: 11002.3. Samples: 140899328. 
Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:04:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:04:01,839][1653645] Updated weights for policy 0, policy_version 275073 (0.0011) [2024-06-15 15:04:02,844][1653645] Updated weights for policy 0, policy_version 275132 (0.0011) [2024-06-15 15:04:05,211][1653645] Updated weights for policy 0, policy_version 275193 (0.0013) [2024-06-15 15:04:05,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 563609600. Throughput: 0: 11082.0. Samples: 140937216. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:04:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:04:08,391][1651596] Signal inference workers to stop experience collection... (14250 times) [2024-06-15 15:04:08,419][1653645] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-06-15 15:04:08,663][1651596] Signal inference workers to resume experience collection... (14250 times) [2024-06-15 15:04:08,664][1653645] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-06-15 15:04:09,703][1653645] Updated weights for policy 0, policy_version 275256 (0.0013) [2024-06-15 15:04:10,792][1653645] Updated weights for policy 0, policy_version 275283 (0.0012) [2024-06-15 15:04:10,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44654.2). Total num frames: 563806208. Throughput: 0: 11298.1. Samples: 141006336. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:04:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:04:14,065][1653645] Updated weights for policy 0, policy_version 275351 (0.0013) [2024-06-15 15:04:15,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 564068352. Throughput: 0: 11138.8. Samples: 141076480. 
Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:04:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:04:16,107][1653645] Updated weights for policy 0, policy_version 275440 (0.0015) [2024-06-15 15:04:20,201][1653645] Updated weights for policy 0, policy_version 275473 (0.0014) [2024-06-15 15:04:20,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45329.2, 300 sec: 44653.3). Total num frames: 564232192. Throughput: 0: 11195.7. Samples: 141111808. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:04:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:04:22,470][1653645] Updated weights for policy 0, policy_version 275542 (0.0012) [2024-06-15 15:04:25,349][1653645] Updated weights for policy 0, policy_version 275600 (0.0013) [2024-06-15 15:04:25,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 564494336. Throughput: 0: 11229.9. Samples: 141175808. Policy #0 lag: (min: 15.0, avg: 137.9, max: 271.0) [2024-06-15 15:04:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:04:26,202][1653645] Updated weights for policy 0, policy_version 275648 (0.0031) [2024-06-15 15:04:27,937][1653645] Updated weights for policy 0, policy_version 275704 (0.0013) [2024-06-15 15:04:30,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 564658176. Throughput: 0: 11287.9. Samples: 141250048. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:04:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:04:32,008][1653645] Updated weights for policy 0, policy_version 275749 (0.0012) [2024-06-15 15:04:32,570][1653645] Updated weights for policy 0, policy_version 275776 (0.0012) [2024-06-15 15:04:34,781][1653645] Updated weights for policy 0, policy_version 275839 (0.0101) [2024-06-15 15:04:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 564920320. 
Throughput: 0: 11332.2. Samples: 141283328. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:04:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:04:37,199][1653645] Updated weights for policy 0, policy_version 275892 (0.0104) [2024-06-15 15:04:39,261][1653645] Updated weights for policy 0, policy_version 275964 (0.0013) [2024-06-15 15:04:40,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 45328.9, 300 sec: 44875.5). Total num frames: 565182464. Throughput: 0: 11218.4. Samples: 141349376. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:04:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:04:43,715][1653645] Updated weights for policy 0, policy_version 276019 (0.0131) [2024-06-15 15:04:44,825][1653645] Updated weights for policy 0, policy_version 276054 (0.0011) [2024-06-15 15:04:45,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45878.5, 300 sec: 44764.4). Total num frames: 565444608. Throughput: 0: 11502.9. Samples: 141416960. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:04:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:04:47,591][1653645] Updated weights for policy 0, policy_version 276112 (0.0013) [2024-06-15 15:04:48,629][1653645] Updated weights for policy 0, policy_version 276160 (0.0023) [2024-06-15 15:04:50,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.8, 300 sec: 45097.6). Total num frames: 565641216. Throughput: 0: 11355.0. Samples: 141448192. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:04:50,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:04:55,077][1653645] Updated weights for policy 0, policy_version 276256 (0.0014) [2024-06-15 15:04:55,179][1651596] Signal inference workers to stop experience collection... 
(14300 times) [2024-06-15 15:04:55,209][1653645] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-06-15 15:04:55,413][1651596] Signal inference workers to resume experience collection... (14300 times) [2024-06-15 15:04:55,414][1653645] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-06-15 15:04:55,959][1648982] Fps is (10 sec: 39318.5, 60 sec: 45874.4, 300 sec: 44542.1). Total num frames: 565837824. Throughput: 0: 11514.1. Samples: 141524480. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:04:55,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:04:56,191][1653645] Updated weights for policy 0, policy_version 276304 (0.0012) [2024-06-15 15:04:56,516][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000276320_565903360.pth... [2024-06-15 15:04:56,607][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000271040_555089920.pth [2024-06-15 15:04:59,454][1653645] Updated weights for policy 0, policy_version 276372 (0.0017) [2024-06-15 15:05:00,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 566099968. Throughput: 0: 11275.4. Samples: 141583872. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:05:02,691][1653645] Updated weights for policy 0, policy_version 276448 (0.0013) [2024-06-15 15:05:05,958][1648982] Fps is (10 sec: 39324.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 566231040. Throughput: 0: 11229.8. Samples: 141617152. 
Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:05:07,421][1653645] Updated weights for policy 0, policy_version 276513 (0.0012) [2024-06-15 15:05:09,060][1653645] Updated weights for policy 0, policy_version 276592 (0.0014) [2024-06-15 15:05:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 44542.4). Total num frames: 566525952. Throughput: 0: 11252.6. Samples: 141682176. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:05:11,457][1653645] Updated weights for policy 0, policy_version 276647 (0.0015) [2024-06-15 15:05:14,200][1653645] Updated weights for policy 0, policy_version 276690 (0.0014) [2024-06-15 15:05:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 566755328. Throughput: 0: 10990.9. Samples: 141744640. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:05:18,757][1653645] Updated weights for policy 0, policy_version 276754 (0.0013) [2024-06-15 15:05:20,781][1653645] Updated weights for policy 0, policy_version 276832 (0.0013) [2024-06-15 15:05:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 566951936. Throughput: 0: 11241.3. Samples: 141789184. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:05:22,994][1653645] Updated weights for policy 0, policy_version 276896 (0.0012) [2024-06-15 15:05:25,926][1653645] Updated weights for policy 0, policy_version 276963 (0.0015) [2024-06-15 15:05:25,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 567214080. Throughput: 0: 11218.5. Samples: 141854208. 
Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:05:30,711][1653645] Updated weights for policy 0, policy_version 277029 (0.0014) [2024-06-15 15:05:30,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 567377920. Throughput: 0: 11195.8. Samples: 141920768. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:05:32,041][1653645] Updated weights for policy 0, policy_version 277072 (0.0011) [2024-06-15 15:05:34,838][1653645] Updated weights for policy 0, policy_version 277136 (0.0012) [2024-06-15 15:05:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 567640064. Throughput: 0: 11252.6. Samples: 141954560. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:05:37,562][1653645] Updated weights for policy 0, policy_version 277188 (0.0016) [2024-06-15 15:05:40,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 567803904. Throughput: 0: 10866.0. Samples: 142013440. Policy #0 lag: (min: 13.0, avg: 146.0, max: 269.0) [2024-06-15 15:05:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:05:41,416][1653645] Updated weights for policy 0, policy_version 277254 (0.0023) [2024-06-15 15:05:42,007][1651596] Signal inference workers to stop experience collection... (14350 times) [2024-06-15 15:05:42,059][1653645] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-06-15 15:05:42,225][1651596] Signal inference workers to resume experience collection... 
(14350 times) [2024-06-15 15:05:42,246][1653645] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-06-15 15:05:42,513][1653645] Updated weights for policy 0, policy_version 277312 (0.0013) [2024-06-15 15:05:44,980][1653645] Updated weights for policy 0, policy_version 277372 (0.0012) [2024-06-15 15:05:45,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 568066048. Throughput: 0: 11127.5. Samples: 142084608. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:05:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:05:47,334][1653645] Updated weights for policy 0, policy_version 277424 (0.0013) [2024-06-15 15:05:49,535][1653645] Updated weights for policy 0, policy_version 277472 (0.0012) [2024-06-15 15:05:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 568328192. Throughput: 0: 11218.5. Samples: 142121984. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:05:50,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:05:53,433][1653645] Updated weights for policy 0, policy_version 277539 (0.0012) [2024-06-15 15:05:55,719][1653645] Updated weights for policy 0, policy_version 277600 (0.0012) [2024-06-15 15:05:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.7, 300 sec: 44653.4). Total num frames: 568524800. Throughput: 0: 11355.0. Samples: 142193152. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:05:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:05:57,526][1653645] Updated weights for policy 0, policy_version 277638 (0.0014) [2024-06-15 15:05:58,672][1653645] Updated weights for policy 0, policy_version 277692 (0.0029) [2024-06-15 15:06:00,942][1653645] Updated weights for policy 0, policy_version 277758 (0.0123) [2024-06-15 15:06:00,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 568819712. 
Throughput: 0: 11343.7. Samples: 142255104. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:00,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:06:05,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 568918016. Throughput: 0: 11184.4. Samples: 142292480. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:06:06,181][1653645] Updated weights for policy 0, policy_version 277815 (0.0012) [2024-06-15 15:06:07,062][1653645] Updated weights for policy 0, policy_version 277845 (0.0012) [2024-06-15 15:06:08,056][1653645] Updated weights for policy 0, policy_version 277888 (0.0012) [2024-06-15 15:06:10,960][1648982] Fps is (10 sec: 42598.6, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 569245696. Throughput: 0: 11320.9. Samples: 142363648. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:10,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:06:11,529][1653645] Updated weights for policy 0, policy_version 277968 (0.0013) [2024-06-15 15:06:15,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 569376768. Throughput: 0: 11275.4. Samples: 142428160. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:06:16,481][1653645] Updated weights for policy 0, policy_version 278017 (0.0024) [2024-06-15 15:06:17,704][1653645] Updated weights for policy 0, policy_version 278073 (0.0012) [2024-06-15 15:06:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 569638912. Throughput: 0: 11252.6. Samples: 142460928. 
Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:06:21,123][1653645] Updated weights for policy 0, policy_version 278145 (0.0017) [2024-06-15 15:06:22,725][1653645] Updated weights for policy 0, policy_version 278208 (0.0083) [2024-06-15 15:06:24,821][1653645] Updated weights for policy 0, policy_version 278266 (0.0141) [2024-06-15 15:06:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 569901056. Throughput: 0: 11252.6. Samples: 142519808. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:25,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:06:29,673][1653645] Updated weights for policy 0, policy_version 278326 (0.0013) [2024-06-15 15:06:30,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 570032128. Throughput: 0: 11263.9. Samples: 142591488. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:06:31,242][1651596] Signal inference workers to stop experience collection... (14400 times) [2024-06-15 15:06:31,264][1653645] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-06-15 15:06:31,598][1651596] Signal inference workers to resume experience collection... (14400 times) [2024-06-15 15:06:31,598][1653645] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-06-15 15:06:32,024][1653645] Updated weights for policy 0, policy_version 278384 (0.0019) [2024-06-15 15:06:33,534][1653645] Updated weights for policy 0, policy_version 278403 (0.0013) [2024-06-15 15:06:35,871][1653645] Updated weights for policy 0, policy_version 278480 (0.0017) [2024-06-15 15:06:35,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 570327040. Throughput: 0: 11127.5. Samples: 142622720. 
Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:06:40,960][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 570425344. Throughput: 0: 11002.3. Samples: 142688256. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:40,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:06:41,332][1653645] Updated weights for policy 0, policy_version 278547 (0.0022) [2024-06-15 15:06:42,357][1653645] Updated weights for policy 0, policy_version 278586 (0.0010) [2024-06-15 15:06:45,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 570687488. Throughput: 0: 11013.7. Samples: 142750720. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:06:46,141][1653645] Updated weights for policy 0, policy_version 278672 (0.0015) [2024-06-15 15:06:47,235][1653645] Updated weights for policy 0, policy_version 278720 (0.0012) [2024-06-15 15:06:48,244][1653645] Updated weights for policy 0, policy_version 278773 (0.0014) [2024-06-15 15:06:50,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 570949632. Throughput: 0: 10831.6. Samples: 142779904. Policy #0 lag: (min: 14.0, avg: 114.9, max: 270.0) [2024-06-15 15:06:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:06:53,695][1653645] Updated weights for policy 0, policy_version 278820 (0.0013) [2024-06-15 15:06:55,831][1653645] Updated weights for policy 0, policy_version 278880 (0.0013) [2024-06-15 15:06:55,961][1648982] Fps is (10 sec: 45861.1, 60 sec: 43688.4, 300 sec: 44652.9). Total num frames: 571146240. Throughput: 0: 10876.4. Samples: 142853120. 
Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:06:55,962][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:06:56,436][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000278912_571211776.pth... [2024-06-15 15:06:56,518][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000273664_560463872.pth [2024-06-15 15:06:58,302][1653645] Updated weights for policy 0, policy_version 278960 (0.0133) [2024-06-15 15:06:59,593][1653645] Updated weights for policy 0, policy_version 279024 (0.0024) [2024-06-15 15:07:00,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 44236.9, 300 sec: 44878.1). Total num frames: 571473920. Throughput: 0: 10808.9. Samples: 142914560. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:07:05,700][1653645] Updated weights for policy 0, policy_version 279088 (0.0012) [2024-06-15 15:07:05,958][1648982] Fps is (10 sec: 42610.3, 60 sec: 44236.5, 300 sec: 44320.1). Total num frames: 571572224. Throughput: 0: 11025.0. Samples: 142957056. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:05,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 15:07:08,053][1653645] Updated weights for policy 0, policy_version 279163 (0.0104) [2024-06-15 15:07:09,508][1653645] Updated weights for policy 0, policy_version 279207 (0.0078) [2024-06-15 15:07:10,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 571932672. Throughput: 0: 11047.8. Samples: 143016960. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:10,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 15:07:11,542][1653645] Updated weights for policy 0, policy_version 279296 (0.0013) [2024-06-15 15:07:15,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 571998208. Throughput: 0: 11013.7. 
Samples: 143087104. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:07:17,734][1653645] Updated weights for policy 0, policy_version 279356 (0.0012) [2024-06-15 15:07:19,366][1651596] Signal inference workers to stop experience collection... (14450 times) [2024-06-15 15:07:19,430][1653645] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-06-15 15:07:19,432][1653645] Updated weights for policy 0, policy_version 279396 (0.0013) [2024-06-15 15:07:19,560][1651596] Signal inference workers to resume experience collection... (14450 times) [2024-06-15 15:07:19,561][1653645] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-06-15 15:07:20,502][1653645] Updated weights for policy 0, policy_version 279430 (0.0012) [2024-06-15 15:07:20,962][1648982] Fps is (10 sec: 39304.7, 60 sec: 44779.7, 300 sec: 44652.7). Total num frames: 572325888. Throughput: 0: 11046.8. Samples: 143119872. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:20,962][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:07:22,112][1653645] Updated weights for policy 0, policy_version 279495 (0.0012) [2024-06-15 15:07:25,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 572522496. Throughput: 0: 11138.8. Samples: 143189504. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:25,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:07:27,435][1653645] Updated weights for policy 0, policy_version 279568 (0.0013) [2024-06-15 15:07:30,603][1653645] Updated weights for policy 0, policy_version 279626 (0.0014) [2024-06-15 15:07:30,958][1648982] Fps is (10 sec: 39337.7, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 572719104. Throughput: 0: 11355.0. Samples: 143261696. 
Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:30,959][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 15:07:31,718][1653645] Updated weights for policy 0, policy_version 279682 (0.0074) [2024-06-15 15:07:32,727][1653645] Updated weights for policy 0, policy_version 279737 (0.0013) [2024-06-15 15:07:34,273][1653645] Updated weights for policy 0, policy_version 279792 (0.0031) [2024-06-15 15:07:35,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 573046784. Throughput: 0: 11457.4. Samples: 143295488. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:35,958][1648982] Avg episode reward: [(0, '37.010')] [2024-06-15 15:07:39,811][1653645] Updated weights for policy 0, policy_version 279870 (0.0013) [2024-06-15 15:07:40,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 573177856. Throughput: 0: 11389.9. Samples: 143365632. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:40,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 15:07:43,255][1653645] Updated weights for policy 0, policy_version 279922 (0.0013) [2024-06-15 15:07:44,879][1653645] Updated weights for policy 0, policy_version 279988 (0.0012) [2024-06-15 15:07:45,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 573472768. Throughput: 0: 11400.5. Samples: 143427584. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 15:07:46,713][1653645] Updated weights for policy 0, policy_version 280060 (0.0014) [2024-06-15 15:07:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 573603840. Throughput: 0: 11184.4. Samples: 143460352. 
Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:50,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:07:51,691][1653645] Updated weights for policy 0, policy_version 280118 (0.0012) [2024-06-15 15:07:54,950][1653645] Updated weights for policy 0, policy_version 280176 (0.0013) [2024-06-15 15:07:55,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45331.4, 300 sec: 44653.3). Total num frames: 573865984. Throughput: 0: 11446.0. Samples: 143532032. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:07:55,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:07:56,999][1653645] Updated weights for policy 0, policy_version 280256 (0.0012) [2024-06-15 15:07:58,254][1653645] Updated weights for policy 0, policy_version 280311 (0.0039) [2024-06-15 15:08:00,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.6, 300 sec: 44764.5). Total num frames: 574095360. Throughput: 0: 11434.7. Samples: 143601664. Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:08:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:08:02,822][1651596] Signal inference workers to stop experience collection... (14500 times) [2024-06-15 15:08:02,860][1653645] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-06-15 15:08:03,039][1651596] Signal inference workers to resume experience collection... (14500 times) [2024-06-15 15:08:03,040][1653645] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-06-15 15:08:03,043][1653645] Updated weights for policy 0, policy_version 280368 (0.0012) [2024-06-15 15:08:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44783.2, 300 sec: 44542.3). Total num frames: 574259200. Throughput: 0: 11424.4. Samples: 143633920. 
Policy #0 lag: (min: 47.0, avg: 131.0, max: 303.0) [2024-06-15 15:08:05,958][1648982] Avg episode reward: [(0, '36.890')] [2024-06-15 15:08:07,041][1653645] Updated weights for policy 0, policy_version 280448 (0.0113) [2024-06-15 15:08:09,812][1653645] Updated weights for policy 0, policy_version 280547 (0.0014) [2024-06-15 15:08:10,323][1653645] Updated weights for policy 0, policy_version 280576 (0.0010) [2024-06-15 15:08:10,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 574619648. Throughput: 0: 11070.6. Samples: 143687680. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:10,959][1648982] Avg episode reward: [(0, '36.860')] [2024-06-15 15:08:15,789][1653645] Updated weights for policy 0, policy_version 280634 (0.0012) [2024-06-15 15:08:15,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 574750720. Throughput: 0: 11116.1. Samples: 143761920. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:08:19,052][1653645] Updated weights for policy 0, policy_version 280704 (0.0074) [2024-06-15 15:08:20,555][1653645] Updated weights for policy 0, policy_version 280768 (0.0084) [2024-06-15 15:08:20,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 45332.3, 300 sec: 44986.6). Total num frames: 575045632. Throughput: 0: 11195.7. Samples: 143799296. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:08:21,805][1653645] Updated weights for policy 0, policy_version 280821 (0.0012) [2024-06-15 15:08:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 575143936. Throughput: 0: 11059.2. Samples: 143863296. 
Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:08:27,677][1653645] Updated weights for policy 0, policy_version 280864 (0.0012) [2024-06-15 15:08:29,311][1653645] Updated weights for policy 0, policy_version 280903 (0.0013) [2024-06-15 15:08:30,890][1653645] Updated weights for policy 0, policy_version 280962 (0.0034) [2024-06-15 15:08:30,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 575406080. Throughput: 0: 11138.8. Samples: 143928832. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:08:32,451][1653645] Updated weights for policy 0, policy_version 281027 (0.0022) [2024-06-15 15:08:33,812][1653645] Updated weights for policy 0, policy_version 281085 (0.0012) [2024-06-15 15:08:35,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.5, 300 sec: 44764.4). Total num frames: 575668224. Throughput: 0: 10979.5. Samples: 143954432. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:35,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:08:40,262][1653645] Updated weights for policy 0, policy_version 281150 (0.0026) [2024-06-15 15:08:40,958][1648982] Fps is (10 sec: 39319.0, 60 sec: 43690.2, 300 sec: 44431.8). Total num frames: 575799296. Throughput: 0: 11013.5. Samples: 144027648. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:40,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:08:42,147][1653645] Updated weights for policy 0, policy_version 281206 (0.0137) [2024-06-15 15:08:44,206][1653645] Updated weights for policy 0, policy_version 281250 (0.0011) [2024-06-15 15:08:45,111][1651596] Signal inference workers to stop experience collection... 
(14550 times) [2024-06-15 15:08:45,181][1653645] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-06-15 15:08:45,406][1651596] Signal inference workers to resume experience collection... (14550 times) [2024-06-15 15:08:45,407][1653645] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-06-15 15:08:45,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 576126976. Throughput: 0: 10649.6. Samples: 144080896. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:08:46,584][1653645] Updated weights for policy 0, policy_version 281340 (0.0013) [2024-06-15 15:08:50,958][1648982] Fps is (10 sec: 39324.5, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 576192512. Throughput: 0: 10672.3. Samples: 144114176. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:50,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 15:08:54,506][1653645] Updated weights for policy 0, policy_version 281428 (0.0017) [2024-06-15 15:08:55,619][1653645] Updated weights for policy 0, policy_version 281471 (0.0012) [2024-06-15 15:08:55,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 576454656. Throughput: 0: 10934.1. Samples: 144179712. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:08:55,960][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:08:56,613][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000281504_576520192.pth... [2024-06-15 15:08:56,731][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000276320_565903360.pth [2024-06-15 15:08:58,098][1653645] Updated weights for policy 0, policy_version 281554 (0.0109) [2024-06-15 15:09:00,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 576716800. Throughput: 0: 10569.9. 
Samples: 144237568. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:09:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:09:04,169][1653645] Updated weights for policy 0, policy_version 281603 (0.0027) [2024-06-15 15:09:05,615][1653645] Updated weights for policy 0, policy_version 281655 (0.0041) [2024-06-15 15:09:05,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 576847872. Throughput: 0: 10615.5. Samples: 144276992. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:09:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:09:06,761][1653645] Updated weights for policy 0, policy_version 281696 (0.0023) [2024-06-15 15:09:09,996][1653645] Updated weights for policy 0, policy_version 281792 (0.0014) [2024-06-15 15:09:10,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 577175552. Throughput: 0: 10535.8. Samples: 144337408. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:09:10,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:09:11,733][1653645] Updated weights for policy 0, policy_version 281853 (0.0095) [2024-06-15 15:09:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 44098.0). Total num frames: 577241088. Throughput: 0: 10479.0. Samples: 144400384. Policy #0 lag: (min: 36.0, avg: 113.6, max: 276.0) [2024-06-15 15:09:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:09:17,777][1653645] Updated weights for policy 0, policy_version 281910 (0.0019) [2024-06-15 15:09:19,739][1653645] Updated weights for policy 0, policy_version 281978 (0.0012) [2024-06-15 15:09:20,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 40959.9, 300 sec: 44097.9). Total num frames: 577503232. Throughput: 0: 10615.5. Samples: 144432128. 
Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:09:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:09:21,899][1653645] Updated weights for policy 0, policy_version 282018 (0.0013) [2024-06-15 15:09:23,847][1653645] Updated weights for policy 0, policy_version 282112 (0.0021) [2024-06-15 15:09:25,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 577765376. Throughput: 0: 10376.7. Samples: 144494592. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:09:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:09:30,317][1653645] Updated weights for policy 0, policy_version 282176 (0.0011) [2024-06-15 15:09:30,960][1648982] Fps is (10 sec: 42591.2, 60 sec: 42051.0, 300 sec: 44097.7). Total num frames: 577929216. Throughput: 0: 10671.9. Samples: 144561152. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:09:30,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:09:32,053][1653645] Updated weights for policy 0, policy_version 282237 (0.0011) [2024-06-15 15:09:33,220][1651596] Signal inference workers to stop experience collection... (14600 times) [2024-06-15 15:09:33,301][1653645] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-06-15 15:09:33,498][1651596] Signal inference workers to resume experience collection... (14600 times) [2024-06-15 15:09:33,499][1653645] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-06-15 15:09:34,405][1653645] Updated weights for policy 0, policy_version 282294 (0.0013) [2024-06-15 15:09:35,468][1653645] Updated weights for policy 0, policy_version 282352 (0.0013) [2024-06-15 15:09:35,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 578289664. Throughput: 0: 10729.2. Samples: 144596992. 
Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:09:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 15:09:40,958][1648982] Fps is (10 sec: 36051.3, 60 sec: 41506.6, 300 sec: 43542.6). Total num frames: 578289664. Throughput: 0: 10729.2. Samples: 144662528. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:09:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:09:41,654][1653645] Updated weights for policy 0, policy_version 282402 (0.0012) [2024-06-15 15:09:43,371][1653645] Updated weights for policy 0, policy_version 282465 (0.0013) [2024-06-15 15:09:45,921][1653645] Updated weights for policy 0, policy_version 282544 (0.0012) [2024-06-15 15:09:45,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 42052.2, 300 sec: 44097.9). Total num frames: 578650112. Throughput: 0: 10911.3. Samples: 144728576. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:09:45,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:09:47,151][1653645] Updated weights for policy 0, policy_version 282608 (0.0012) [2024-06-15 15:09:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.6, 300 sec: 43987.0). Total num frames: 578813952. Throughput: 0: 10763.4. Samples: 144761344. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:09:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:09:53,859][1653645] Updated weights for policy 0, policy_version 282674 (0.0012) [2024-06-15 15:09:55,404][1653645] Updated weights for policy 0, policy_version 282742 (0.0010) [2024-06-15 15:09:55,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 579076096. Throughput: 0: 11002.3. Samples: 144832512. 
Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:09:55,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:09:57,003][1653645] Updated weights for policy 0, policy_version 282788 (0.0014) [2024-06-15 15:09:58,515][1653645] Updated weights for policy 0, policy_version 282877 (0.0013) [2024-06-15 15:10:00,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 579338240. Throughput: 0: 11093.3. Samples: 144899584. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:10:00,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:10:05,189][1653645] Updated weights for policy 0, policy_version 282933 (0.0014) [2024-06-15 15:10:05,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 579502080. Throughput: 0: 11275.4. Samples: 144939520. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:10:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:10:06,822][1653645] Updated weights for policy 0, policy_version 283000 (0.0012) [2024-06-15 15:10:08,362][1653645] Updated weights for policy 0, policy_version 283045 (0.0013) [2024-06-15 15:10:09,583][1653645] Updated weights for policy 0, policy_version 283108 (0.0092) [2024-06-15 15:10:10,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 579862528. Throughput: 0: 11332.2. Samples: 145004544. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:10:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:10:15,379][1651596] Signal inference workers to stop experience collection... (14650 times) [2024-06-15 15:10:15,411][1653645] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-06-15 15:10:15,579][1651596] Signal inference workers to resume experience collection... 
(14650 times) [2024-06-15 15:10:15,581][1653645] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-06-15 15:10:15,760][1653645] Updated weights for policy 0, policy_version 283172 (0.0012) [2024-06-15 15:10:15,958][1648982] Fps is (10 sec: 45873.5, 60 sec: 45328.8, 300 sec: 44097.9). Total num frames: 579960832. Throughput: 0: 11605.7. Samples: 145083392. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:10:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:10:18,039][1653645] Updated weights for policy 0, policy_version 283259 (0.0012) [2024-06-15 15:10:20,621][1653645] Updated weights for policy 0, policy_version 283331 (0.0013) [2024-06-15 15:10:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 46421.5, 300 sec: 44320.1). Total num frames: 580288512. Throughput: 0: 11286.8. Samples: 145104896. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:10:20,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:10:21,692][1653645] Updated weights for policy 0, policy_version 283388 (0.0013) [2024-06-15 15:10:25,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 580386816. Throughput: 0: 11332.3. Samples: 145172480. Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:10:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:10:28,244][1653645] Updated weights for policy 0, policy_version 283444 (0.0014) [2024-06-15 15:10:29,743][1653645] Updated weights for policy 0, policy_version 283520 (0.0114) [2024-06-15 15:10:30,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 45330.5, 300 sec: 44098.0). Total num frames: 580648960. Throughput: 0: 11355.1. Samples: 145239552. 
Policy #0 lag: (min: 30.0, avg: 137.7, max: 292.0) [2024-06-15 15:10:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:10:32,684][1653645] Updated weights for policy 0, policy_version 283606 (0.0015) [2024-06-15 15:10:35,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 580911104. Throughput: 0: 11298.1. Samples: 145269760. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:10:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:10:39,167][1653645] Updated weights for policy 0, policy_version 283680 (0.0015) [2024-06-15 15:10:40,632][1653645] Updated weights for policy 0, policy_version 283744 (0.0013) [2024-06-15 15:10:40,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 47513.7, 300 sec: 44320.1). Total num frames: 581140480. Throughput: 0: 11366.4. Samples: 145344000. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:10:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:10:42,805][1653645] Updated weights for policy 0, policy_version 283795 (0.0019) [2024-06-15 15:10:44,730][1653645] Updated weights for policy 0, policy_version 283878 (0.0041) [2024-06-15 15:10:45,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 46421.4, 300 sec: 44431.2). Total num frames: 581435392. Throughput: 0: 11229.9. Samples: 145404928. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:10:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:10:50,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 44782.7, 300 sec: 43986.8). Total num frames: 581500928. Throughput: 0: 11252.5. Samples: 145445888. 
Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:10:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:10:50,979][1653645] Updated weights for policy 0, policy_version 283937 (0.0014) [2024-06-15 15:10:52,741][1653645] Updated weights for policy 0, policy_version 284032 (0.0017) [2024-06-15 15:10:54,755][1651596] Signal inference workers to stop experience collection... (14700 times) [2024-06-15 15:10:54,870][1653645] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-06-15 15:10:54,959][1651596] Signal inference workers to resume experience collection... (14700 times) [2024-06-15 15:10:54,959][1653645] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-06-15 15:10:54,961][1653645] Updated weights for policy 0, policy_version 284080 (0.0012) [2024-06-15 15:10:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46421.4, 300 sec: 44209.0). Total num frames: 581861376. Throughput: 0: 11275.4. Samples: 145511936. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:10:55,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:10:56,455][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000284128_581894144.pth... [2024-06-15 15:10:56,613][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000278912_571211776.pth [2024-06-15 15:10:56,620][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000284128_581894144.pth [2024-06-15 15:10:57,284][1653645] Updated weights for policy 0, policy_version 284152 (0.0013) [2024-06-15 15:11:00,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 581959680. Throughput: 0: 10922.7. Samples: 145574912. 
Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:00,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:11:03,143][1653645] Updated weights for policy 0, policy_version 284208 (0.0018) [2024-06-15 15:11:04,579][1653645] Updated weights for policy 0, policy_version 284287 (0.0012) [2024-06-15 15:11:05,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45875.1, 300 sec: 44097.9). Total num frames: 582254592. Throughput: 0: 11298.1. Samples: 145613312. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:11:07,085][1653645] Updated weights for policy 0, policy_version 284352 (0.0012) [2024-06-15 15:11:08,688][1653645] Updated weights for policy 0, policy_version 284410 (0.0012) [2024-06-15 15:11:10,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 582483968. Throughput: 0: 11093.4. Samples: 145671680. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:11:15,445][1653645] Updated weights for policy 0, policy_version 284481 (0.0021) [2024-06-15 15:11:15,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44783.2, 300 sec: 44097.9). Total num frames: 582647808. Throughput: 0: 11275.4. Samples: 145746944. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:15,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:11:16,804][1653645] Updated weights for policy 0, policy_version 284542 (0.0013) [2024-06-15 15:11:19,242][1653645] Updated weights for policy 0, policy_version 284608 (0.0093) [2024-06-15 15:11:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 583008256. Throughput: 0: 11173.0. Samples: 145772544. 
Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:11:25,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 583008256. Throughput: 0: 10956.8. Samples: 145837056. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:11:27,039][1653645] Updated weights for policy 0, policy_version 284673 (0.0106) [2024-06-15 15:11:28,507][1653645] Updated weights for policy 0, policy_version 284741 (0.0015) [2024-06-15 15:11:30,223][1653645] Updated weights for policy 0, policy_version 284804 (0.0014) [2024-06-15 15:11:30,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 44783.0, 300 sec: 44097.9). Total num frames: 583335936. Throughput: 0: 11013.7. Samples: 145900544. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 15:11:31,933][1653645] Updated weights for policy 0, policy_version 284868 (0.0013) [2024-06-15 15:11:35,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 583532544. Throughput: 0: 10706.6. Samples: 145927680. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:11:39,504][1653645] Updated weights for policy 0, policy_version 284947 (0.0012) [2024-06-15 15:11:39,773][1651596] Signal inference workers to stop experience collection... (14750 times) [2024-06-15 15:11:39,838][1653645] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-06-15 15:11:39,992][1651596] Signal inference workers to resume experience collection... 
(14750 times) [2024-06-15 15:11:40,000][1653645] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-06-15 15:11:40,898][1653645] Updated weights for policy 0, policy_version 285010 (0.0011) [2024-06-15 15:11:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42598.4, 300 sec: 44097.9). Total num frames: 583696384. Throughput: 0: 10990.9. Samples: 146006528. Policy #0 lag: (min: 47.0, avg: 168.2, max: 303.0) [2024-06-15 15:11:40,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:11:43,404][1653645] Updated weights for policy 0, policy_version 285104 (0.0116) [2024-06-15 15:11:44,787][1653645] Updated weights for policy 0, policy_version 285170 (0.0012) [2024-06-15 15:11:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 584056832. Throughput: 0: 10729.3. Samples: 146057728. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:11:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:11:50,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42598.6, 300 sec: 43765.2). Total num frames: 584056832. Throughput: 0: 10808.9. Samples: 146099712. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:11:50,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 15:11:51,253][1653645] Updated weights for policy 0, policy_version 285205 (0.0011) [2024-06-15 15:11:53,758][1653645] Updated weights for policy 0, policy_version 285296 (0.0099) [2024-06-15 15:11:54,809][1653645] Updated weights for policy 0, policy_version 285333 (0.0011) [2024-06-15 15:11:55,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 584450048. Throughput: 0: 10797.5. Samples: 146157568. 
Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:11:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:11:56,927][1653645] Updated weights for policy 0, policy_version 285432 (0.0013) [2024-06-15 15:12:00,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 584581120. Throughput: 0: 10774.7. Samples: 146231808. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:12:03,906][1653645] Updated weights for policy 0, policy_version 285489 (0.0033) [2024-06-15 15:12:05,359][1653645] Updated weights for policy 0, policy_version 285562 (0.0012) [2024-06-15 15:12:05,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 43764.7). Total num frames: 584843264. Throughput: 0: 10934.0. Samples: 146264576. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:12:07,230][1653645] Updated weights for policy 0, policy_version 285627 (0.0021) [2024-06-15 15:12:08,538][1653645] Updated weights for policy 0, policy_version 285689 (0.0012) [2024-06-15 15:12:10,957][1648982] Fps is (10 sec: 52430.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 585105408. Throughput: 0: 10854.4. Samples: 146325504. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:12:14,995][1653645] Updated weights for policy 0, policy_version 285732 (0.0011) [2024-06-15 15:12:15,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 43690.4, 300 sec: 43876.4). Total num frames: 585269248. Throughput: 0: 11252.5. Samples: 146406912. 
Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:15,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:12:16,435][1653645] Updated weights for policy 0, policy_version 285801 (0.0013) [2024-06-15 15:12:17,425][1651596] Signal inference workers to stop experience collection... (14800 times) [2024-06-15 15:12:17,464][1653645] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-06-15 15:12:17,726][1651596] Signal inference workers to resume experience collection... (14800 times) [2024-06-15 15:12:17,727][1653645] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-06-15 15:12:18,511][1653645] Updated weights for policy 0, policy_version 285888 (0.0014) [2024-06-15 15:12:19,772][1653645] Updated weights for policy 0, policy_version 285940 (0.0012) [2024-06-15 15:12:20,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 585629696. Throughput: 0: 11184.4. Samples: 146430976. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:12:25,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43690.5, 300 sec: 43764.7). Total num frames: 585629696. Throughput: 0: 11059.2. Samples: 146504192. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:12:27,102][1653645] Updated weights for policy 0, policy_version 286000 (0.0017) [2024-06-15 15:12:29,723][1653645] Updated weights for policy 0, policy_version 286096 (0.0161) [2024-06-15 15:12:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 586022912. Throughput: 0: 11173.0. Samples: 146560512. 
Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:12:31,436][1653645] Updated weights for policy 0, policy_version 286176 (0.0014) [2024-06-15 15:12:35,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 586153984. Throughput: 0: 10945.4. Samples: 146592256. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:12:38,781][1653645] Updated weights for policy 0, policy_version 286224 (0.0012) [2024-06-15 15:12:40,270][1653645] Updated weights for policy 0, policy_version 286275 (0.0012) [2024-06-15 15:12:40,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 586350592. Throughput: 0: 11252.6. Samples: 146663936. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:12:42,320][1653645] Updated weights for policy 0, policy_version 286369 (0.0017) [2024-06-15 15:12:43,971][1653645] Updated weights for policy 0, policy_version 286448 (0.0014) [2024-06-15 15:12:45,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 586678272. Throughput: 0: 10831.7. Samples: 146719232. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 15:12:50,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 586678272. Throughput: 0: 10922.7. Samples: 146756096. 
Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:12:51,527][1653645] Updated weights for policy 0, policy_version 286496 (0.0032) [2024-06-15 15:12:53,163][1653645] Updated weights for policy 0, policy_version 286561 (0.0012) [2024-06-15 15:12:55,451][1653645] Updated weights for policy 0, policy_version 286658 (0.0120) [2024-06-15 15:12:55,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 587137024. Throughput: 0: 10808.8. Samples: 146811904. Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:12:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:12:56,041][1651596] Signal inference workers to stop experience collection... (14850 times) [2024-06-15 15:12:56,109][1653645] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-06-15 15:12:56,340][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000286704_587169792.pth... [2024-06-15 15:12:56,343][1651596] Signal inference workers to resume experience collection... (14850 times) [2024-06-15 15:12:56,343][1653645] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-06-15 15:12:56,399][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000281504_576520192.pth [2024-06-15 15:12:56,705][1653645] Updated weights for policy 0, policy_version 286720 (0.0013) [2024-06-15 15:13:00,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 587202560. Throughput: 0: 10444.8. Samples: 146876928. 
Policy #0 lag: (min: 95.0, avg: 227.5, max: 367.0) [2024-06-15 15:13:00,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:13:04,358][1653645] Updated weights for policy 0, policy_version 286775 (0.0012) [2024-06-15 15:13:05,481][1653645] Updated weights for policy 0, policy_version 286816 (0.0012) [2024-06-15 15:13:05,958][1648982] Fps is (10 sec: 29491.3, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 587431936. Throughput: 0: 10877.2. Samples: 146920448. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:13:07,431][1653645] Updated weights for policy 0, policy_version 286896 (0.0090) [2024-06-15 15:13:09,455][1653645] Updated weights for policy 0, policy_version 286976 (0.0013) [2024-06-15 15:13:10,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 587726848. Throughput: 0: 10262.8. Samples: 146966016. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:13:15,958][1648982] Fps is (10 sec: 29490.9, 60 sec: 40960.1, 300 sec: 42987.2). Total num frames: 587726848. Throughput: 0: 10740.6. Samples: 147043840. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:13:17,135][1653645] Updated weights for policy 0, policy_version 287042 (0.0013) [2024-06-15 15:13:18,925][1653645] Updated weights for policy 0, policy_version 287107 (0.0024) [2024-06-15 15:13:20,688][1653645] Updated weights for policy 0, policy_version 287188 (0.0013) [2024-06-15 15:13:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 588185600. Throughput: 0: 10672.4. Samples: 147072512. 
Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:13:25,982][1648982] Fps is (10 sec: 52303.0, 60 sec: 43673.2, 300 sec: 43539.0). Total num frames: 588251136. Throughput: 0: 10450.6. Samples: 147134464. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:25,983][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:13:28,130][1653645] Updated weights for policy 0, policy_version 287264 (0.0013) [2024-06-15 15:13:29,817][1653645] Updated weights for policy 0, policy_version 287328 (0.0011) [2024-06-15 15:13:30,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42052.2, 300 sec: 43653.7). Total num frames: 588546048. Throughput: 0: 10763.4. Samples: 147203584. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:30,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:13:31,320][1653645] Updated weights for policy 0, policy_version 287394 (0.0018) [2024-06-15 15:13:33,421][1653645] Updated weights for policy 0, policy_version 287487 (0.0148) [2024-06-15 15:13:35,958][1648982] Fps is (10 sec: 52555.5, 60 sec: 43690.7, 300 sec: 43987.0). Total num frames: 588775424. Throughput: 0: 10513.1. Samples: 147229184. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:35,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:13:40,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 42052.3, 300 sec: 43209.3). Total num frames: 588873728. Throughput: 0: 11093.3. Samples: 147311104. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:13:41,127][1653645] Updated weights for policy 0, policy_version 287553 (0.0012) [2024-06-15 15:13:42,110][1651596] Signal inference workers to stop experience collection... 
(14900 times) [2024-06-15 15:13:42,219][1653645] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-06-15 15:13:42,221][1653645] Updated weights for policy 0, policy_version 287608 (0.0016) [2024-06-15 15:13:42,380][1651596] Signal inference workers to resume experience collection... (14900 times) [2024-06-15 15:13:42,381][1653645] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-06-15 15:13:43,494][1653645] Updated weights for policy 0, policy_version 287664 (0.0012) [2024-06-15 15:13:45,411][1653645] Updated weights for policy 0, policy_version 287741 (0.0015) [2024-06-15 15:13:45,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 589299712. Throughput: 0: 10774.8. Samples: 147361792. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:45,960][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 15:13:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 589299712. Throughput: 0: 10729.2. Samples: 147403264. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:50,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 15:13:53,069][1653645] Updated weights for policy 0, policy_version 287829 (0.0123) [2024-06-15 15:13:54,839][1653645] Updated weights for policy 0, policy_version 287904 (0.0013) [2024-06-15 15:13:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 589692928. Throughput: 0: 11218.5. Samples: 147470848. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:13:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:13:56,961][1653645] Updated weights for policy 0, policy_version 287968 (0.0012) [2024-06-15 15:14:00,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 589824000. Throughput: 0: 10956.8. Samples: 147536896. 
Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:14:00,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 15:14:03,909][1653645] Updated weights for policy 0, policy_version 288021 (0.0114) [2024-06-15 15:14:05,424][1653645] Updated weights for policy 0, policy_version 288083 (0.0013) [2024-06-15 15:14:05,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 590020608. Throughput: 0: 11173.0. Samples: 147575296. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:14:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:14:07,150][1653645] Updated weights for policy 0, policy_version 288164 (0.0020) [2024-06-15 15:14:09,306][1653645] Updated weights for policy 0, policy_version 288228 (0.0083) [2024-06-15 15:14:09,964][1653645] Updated weights for policy 0, policy_version 288256 (0.0011) [2024-06-15 15:14:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 590348288. Throughput: 0: 10996.8. Samples: 147629056. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:14:10,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:14:15,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 590446592. Throughput: 0: 11104.7. Samples: 147703296. Policy #0 lag: (min: 11.0, avg: 62.4, max: 267.0) [2024-06-15 15:14:15,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:14:16,069][1653645] Updated weights for policy 0, policy_version 288319 (0.0013) [2024-06-15 15:14:18,080][1653645] Updated weights for policy 0, policy_version 288400 (0.0010) [2024-06-15 15:14:20,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 590741504. Throughput: 0: 11093.3. Samples: 147728384. 
Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:14:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:14:21,282][1653645] Updated weights for policy 0, policy_version 288464 (0.0016) [2024-06-15 15:14:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43708.2, 300 sec: 43876.1). Total num frames: 590872576. Throughput: 0: 10820.2. Samples: 147798016. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:14:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:14:26,798][1651596] Signal inference workers to stop experience collection... (14950 times) [2024-06-15 15:14:26,812][1653645] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-06-15 15:14:26,976][1651596] Signal inference workers to resume experience collection... (14950 times) [2024-06-15 15:14:26,977][1653645] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-06-15 15:14:27,339][1653645] Updated weights for policy 0, policy_version 288544 (0.0095) [2024-06-15 15:14:28,853][1653645] Updated weights for policy 0, policy_version 288592 (0.0011) [2024-06-15 15:14:30,626][1653645] Updated weights for policy 0, policy_version 288658 (0.0011) [2024-06-15 15:14:30,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 591200256. Throughput: 0: 11116.1. Samples: 147862016. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:14:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:14:31,630][1653645] Updated weights for policy 0, policy_version 288698 (0.0011) [2024-06-15 15:14:33,704][1653645] Updated weights for policy 0, policy_version 288761 (0.0013) [2024-06-15 15:14:35,960][1648982] Fps is (10 sec: 52417.2, 60 sec: 43689.0, 300 sec: 44430.8). Total num frames: 591396864. Throughput: 0: 10910.7. Samples: 147894272. 
Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:14:35,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:14:39,781][1653645] Updated weights for policy 0, policy_version 288828 (0.0015) [2024-06-15 15:14:40,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 44236.7, 300 sec: 43653.7). Total num frames: 591527936. Throughput: 0: 10934.0. Samples: 147962880. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:14:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:14:42,108][1653645] Updated weights for policy 0, policy_version 288880 (0.0013) [2024-06-15 15:14:43,582][1653645] Updated weights for policy 0, policy_version 288944 (0.0011) [2024-06-15 15:14:45,756][1653645] Updated weights for policy 0, policy_version 289017 (0.0019) [2024-06-15 15:14:45,962][1648982] Fps is (10 sec: 52416.8, 60 sec: 43687.3, 300 sec: 44430.5). Total num frames: 591921152. Throughput: 0: 10739.5. Samples: 148020224. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:14:45,963][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:14:50,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 591986688. Throughput: 0: 10672.3. Samples: 148055552. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:14:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:14:51,665][1653645] Updated weights for policy 0, policy_version 289084 (0.0041) [2024-06-15 15:14:54,608][1653645] Updated weights for policy 0, policy_version 289144 (0.0013) [2024-06-15 15:14:55,958][1648982] Fps is (10 sec: 36060.3, 60 sec: 43144.3, 300 sec: 43875.8). Total num frames: 592281600. Throughput: 0: 10968.1. Samples: 148122624. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:14:55,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:14:56,086][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000289216_592314368.pth... 
[2024-06-15 15:14:56,160][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000284128_581894144.pth [2024-06-15 15:14:57,059][1653645] Updated weights for policy 0, policy_version 289235 (0.0016) [2024-06-15 15:15:00,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 592445440. Throughput: 0: 10763.4. Samples: 148187648. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:15:00,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:15:02,372][1653645] Updated weights for policy 0, policy_version 289296 (0.0015) [2024-06-15 15:15:03,621][1653645] Updated weights for policy 0, policy_version 289344 (0.0026) [2024-06-15 15:15:05,958][1648982] Fps is (10 sec: 36046.0, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 592642048. Throughput: 0: 10968.2. Samples: 148221952. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:15:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:15:06,501][1653645] Updated weights for policy 0, policy_version 289408 (0.0012) [2024-06-15 15:15:09,376][1651596] Signal inference workers to stop experience collection... (15000 times) [2024-06-15 15:15:09,426][1653645] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-06-15 15:15:09,619][1651596] Signal inference workers to resume experience collection... (15000 times) [2024-06-15 15:15:09,620][1653645] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-06-15 15:15:09,836][1653645] Updated weights for policy 0, policy_version 289507 (0.0012) [2024-06-15 15:15:10,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 592969728. Throughput: 0: 10774.7. Samples: 148282880. 
Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:15:10,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:15:14,364][1653645] Updated weights for policy 0, policy_version 289569 (0.0018) [2024-06-15 15:15:15,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44236.8, 300 sec: 43431.5). Total num frames: 593100800. Throughput: 0: 10956.8. Samples: 148355072. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:15:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:15:17,273][1653645] Updated weights for policy 0, policy_version 289622 (0.0014) [2024-06-15 15:15:18,704][1653645] Updated weights for policy 0, policy_version 289685 (0.0012) [2024-06-15 15:15:20,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 593395712. Throughput: 0: 10980.1. Samples: 148388352. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:15:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:15:21,036][1653645] Updated weights for policy 0, policy_version 289760 (0.0013) [2024-06-15 15:15:21,752][1653645] Updated weights for policy 0, policy_version 289792 (0.0012) [2024-06-15 15:15:25,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 593494016. Throughput: 0: 11013.6. Samples: 148458496. Policy #0 lag: (min: 23.0, avg: 130.3, max: 333.0) [2024-06-15 15:15:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:15:26,981][1653645] Updated weights for policy 0, policy_version 289855 (0.0015) [2024-06-15 15:15:28,825][1653645] Updated weights for policy 0, policy_version 289918 (0.0125) [2024-06-15 15:15:30,617][1653645] Updated weights for policy 0, policy_version 289981 (0.0032) [2024-06-15 15:15:30,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 593887232. Throughput: 0: 11196.9. Samples: 148524032. 
Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:15:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:15:32,965][1653645] Updated weights for policy 0, policy_version 290032 (0.0016) [2024-06-15 15:15:35,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43692.2, 300 sec: 43653.6). Total num frames: 594018304. Throughput: 0: 11184.3. Samples: 148558848. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:15:35,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:15:37,411][1653645] Updated weights for policy 0, policy_version 290064 (0.0012) [2024-06-15 15:15:39,167][1653645] Updated weights for policy 0, policy_version 290131 (0.0084) [2024-06-15 15:15:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 45875.1, 300 sec: 43542.5). Total num frames: 594280448. Throughput: 0: 11218.5. Samples: 148627456. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:15:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:15:41,094][1653645] Updated weights for policy 0, policy_version 290192 (0.0010) [2024-06-15 15:15:42,193][1653645] Updated weights for policy 0, policy_version 290240 (0.0012) [2024-06-15 15:15:45,275][1653645] Updated weights for policy 0, policy_version 290304 (0.0015) [2024-06-15 15:15:45,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 43694.1, 300 sec: 44209.1). Total num frames: 594542592. Throughput: 0: 11218.5. Samples: 148692480. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:15:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:15:49,614][1653645] Updated weights for policy 0, policy_version 290363 (0.0013) [2024-06-15 15:15:50,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 45875.0, 300 sec: 43653.6). Total num frames: 594739200. Throughput: 0: 11468.7. Samples: 148738048. 
Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:15:50,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:15:51,303][1653645] Updated weights for policy 0, policy_version 290430 (0.0067) [2024-06-15 15:15:53,275][1653645] Updated weights for policy 0, policy_version 290496 (0.0013) [2024-06-15 15:15:55,902][1651596] Signal inference workers to stop experience collection... (15050 times) [2024-06-15 15:15:55,948][1653645] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-06-15 15:15:55,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 45329.3, 300 sec: 44209.1). Total num frames: 595001344. Throughput: 0: 11514.3. Samples: 148801024. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:15:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:15:56,142][1651596] Signal inference workers to resume experience collection... (15050 times) [2024-06-15 15:15:56,142][1653645] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-06-15 15:15:59,786][1653645] Updated weights for policy 0, policy_version 290578 (0.0015) [2024-06-15 15:16:00,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 45875.2, 300 sec: 43875.8). Total num frames: 595197952. Throughput: 0: 11502.9. Samples: 148872704. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:16:02,499][1653645] Updated weights for policy 0, policy_version 290644 (0.0014) [2024-06-15 15:16:04,085][1653645] Updated weights for policy 0, policy_version 290707 (0.0012) [2024-06-15 15:16:05,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 46967.4, 300 sec: 43986.9). Total num frames: 595460096. Throughput: 0: 11491.5. Samples: 148905472. 
Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:16:07,732][1653645] Updated weights for policy 0, policy_version 290775 (0.0013) [2024-06-15 15:16:08,694][1653645] Updated weights for policy 0, policy_version 290814 (0.0012) [2024-06-15 15:16:10,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 595623936. Throughput: 0: 11377.8. Samples: 148970496. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:16:11,935][1653645] Updated weights for policy 0, policy_version 290875 (0.0012) [2024-06-15 15:16:15,395][1653645] Updated weights for policy 0, policy_version 290944 (0.0015) [2024-06-15 15:16:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 46421.4, 300 sec: 43653.6). Total num frames: 595886080. Throughput: 0: 11434.7. Samples: 149038592. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:16:16,600][1653645] Updated weights for policy 0, policy_version 290998 (0.0012) [2024-06-15 15:16:20,049][1653645] Updated weights for policy 0, policy_version 291065 (0.0011) [2024-06-15 15:16:20,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 596115456. Throughput: 0: 11457.5. Samples: 149074432. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:16:22,919][1653645] Updated weights for policy 0, policy_version 291112 (0.0012) [2024-06-15 15:16:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 46421.6, 300 sec: 43875.8). Total num frames: 596279296. Throughput: 0: 11446.1. Samples: 149142528. 
Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:25,970][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:16:26,596][1653645] Updated weights for policy 0, policy_version 291184 (0.0037) [2024-06-15 15:16:28,569][1653645] Updated weights for policy 0, policy_version 291257 (0.0013) [2024-06-15 15:16:30,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 596508672. Throughput: 0: 11366.3. Samples: 149203968. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:30,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:16:32,268][1653645] Updated weights for policy 0, policy_version 291328 (0.0015) [2024-06-15 15:16:35,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.4, 300 sec: 44320.1). Total num frames: 596770816. Throughput: 0: 11104.8. Samples: 149237760. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:16:37,800][1653645] Updated weights for policy 0, policy_version 291394 (0.0015) [2024-06-15 15:16:39,583][1653645] Updated weights for policy 0, policy_version 291472 (0.0013) [2024-06-15 15:16:40,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 45875.3, 300 sec: 43986.9). Total num frames: 597032960. Throughput: 0: 11093.3. Samples: 149300224. Policy #0 lag: (min: 15.0, avg: 144.0, max: 301.0) [2024-06-15 15:16:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:16:43,591][1651596] Signal inference workers to stop experience collection... (15100 times) [2024-06-15 15:16:43,661][1653645] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-06-15 15:16:43,822][1651596] Signal inference workers to resume experience collection... 
(15100 times) [2024-06-15 15:16:43,823][1653645] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-06-15 15:16:43,825][1653645] Updated weights for policy 0, policy_version 291552 (0.0024) [2024-06-15 15:16:45,676][1653645] Updated weights for policy 0, policy_version 291590 (0.0029) [2024-06-15 15:16:45,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44236.6, 300 sec: 44542.2). Total num frames: 597196800. Throughput: 0: 11070.5. Samples: 149370880. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:16:45,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:16:49,636][1653645] Updated weights for policy 0, policy_version 291649 (0.0014) [2024-06-15 15:16:50,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44237.0, 300 sec: 43875.8). Total num frames: 597393408. Throughput: 0: 11104.7. Samples: 149405184. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:16:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:16:51,132][1653645] Updated weights for policy 0, policy_version 291709 (0.0073) [2024-06-15 15:16:53,020][1653645] Updated weights for policy 0, policy_version 291774 (0.0015) [2024-06-15 15:16:55,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 597622784. Throughput: 0: 10956.8. Samples: 149463552. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:16:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:16:56,228][1653645] Updated weights for policy 0, policy_version 291831 (0.0012) [2024-06-15 15:16:56,423][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000291840_597688320.pth... 
[2024-06-15 15:16:56,481][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000286704_587169792.pth [2024-06-15 15:16:57,865][1653645] Updated weights for policy 0, policy_version 291874 (0.0076) [2024-06-15 15:17:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 597819392. Throughput: 0: 11161.6. Samples: 149540864. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:17:02,139][1653645] Updated weights for policy 0, policy_version 291952 (0.0106) [2024-06-15 15:17:03,593][1653645] Updated weights for policy 0, policy_version 291988 (0.0012) [2024-06-15 15:17:05,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 598081536. Throughput: 0: 11036.4. Samples: 149571072. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:05,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:17:06,875][1653645] Updated weights for policy 0, policy_version 292049 (0.0012) [2024-06-15 15:17:08,653][1653645] Updated weights for policy 0, policy_version 292112 (0.0015) [2024-06-15 15:17:09,583][1653645] Updated weights for policy 0, policy_version 292160 (0.0015) [2024-06-15 15:17:10,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 598343680. Throughput: 0: 11036.4. Samples: 149639168. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:10,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:17:13,859][1653645] Updated weights for policy 0, policy_version 292224 (0.0012) [2024-06-15 15:17:15,874][1653645] Updated weights for policy 0, policy_version 292278 (0.0050) [2024-06-15 15:17:15,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 598573056. Throughput: 0: 11275.4. Samples: 149711360. 
Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:17:19,240][1653645] Updated weights for policy 0, policy_version 292351 (0.0012) [2024-06-15 15:17:20,619][1653645] Updated weights for policy 0, policy_version 292416 (0.0015) [2024-06-15 15:17:20,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 598867968. Throughput: 0: 11229.9. Samples: 149743104. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:20,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 15:17:25,450][1653645] Updated weights for policy 0, policy_version 292472 (0.0013) [2024-06-15 15:17:25,958][1648982] Fps is (10 sec: 42596.4, 60 sec: 45328.7, 300 sec: 43986.8). Total num frames: 598999040. Throughput: 0: 11343.5. Samples: 149810688. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:25,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:17:28,879][1653645] Updated weights for policy 0, policy_version 292537 (0.0013) [2024-06-15 15:17:30,667][1653645] Updated weights for policy 0, policy_version 292576 (0.0011) [2024-06-15 15:17:30,765][1651596] Signal inference workers to stop experience collection... (15150 times) [2024-06-15 15:17:30,803][1653645] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-06-15 15:17:30,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 599195648. Throughput: 0: 11275.4. Samples: 149878272. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:30,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:17:31,022][1651596] Signal inference workers to resume experience collection... 
(15150 times) [2024-06-15 15:17:31,023][1653645] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-06-15 15:17:32,076][1653645] Updated weights for policy 0, policy_version 292640 (0.0080) [2024-06-15 15:17:35,886][1653645] Updated weights for policy 0, policy_version 292688 (0.0015) [2024-06-15 15:17:35,958][1648982] Fps is (10 sec: 42600.4, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 599425024. Throughput: 0: 11207.1. Samples: 149909504. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:35,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:17:37,003][1653645] Updated weights for policy 0, policy_version 292736 (0.0019) [2024-06-15 15:17:40,191][1653645] Updated weights for policy 0, policy_version 292798 (0.0012) [2024-06-15 15:17:40,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 599654400. Throughput: 0: 11412.0. Samples: 149977088. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:17:42,506][1653645] Updated weights for policy 0, policy_version 292852 (0.0017) [2024-06-15 15:17:43,999][1653645] Updated weights for policy 0, policy_version 292919 (0.0015) [2024-06-15 15:17:45,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45329.3, 300 sec: 44875.5). Total num frames: 599916544. Throughput: 0: 11275.4. Samples: 150048256. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:17:47,381][1653645] Updated weights for policy 0, policy_version 292946 (0.0012) [2024-06-15 15:17:50,886][1653645] Updated weights for policy 0, policy_version 293008 (0.0013) [2024-06-15 15:17:50,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 44782.7, 300 sec: 43875.7). Total num frames: 600080384. Throughput: 0: 11366.4. Samples: 150082560. 
Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:50,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:17:53,020][1653645] Updated weights for policy 0, policy_version 293060 (0.0015) [2024-06-15 15:17:54,731][1653645] Updated weights for policy 0, policy_version 293142 (0.0012) [2024-06-15 15:17:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 46967.6, 300 sec: 44875.5). Total num frames: 600440832. Throughput: 0: 11241.3. Samples: 150145024. Policy #0 lag: (min: 15.0, avg: 123.2, max: 271.0) [2024-06-15 15:17:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:17:58,815][1653645] Updated weights for policy 0, policy_version 293200 (0.0021) [2024-06-15 15:17:59,972][1653645] Updated weights for policy 0, policy_version 293244 (0.0012) [2024-06-15 15:18:00,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45875.0, 300 sec: 44542.2). Total num frames: 600571904. Throughput: 0: 11263.9. Samples: 150218240. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:18:02,916][1653645] Updated weights for policy 0, policy_version 293303 (0.0020) [2024-06-15 15:18:05,241][1653645] Updated weights for policy 0, policy_version 293376 (0.0040) [2024-06-15 15:18:05,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 46421.5, 300 sec: 44542.3). Total num frames: 600866816. Throughput: 0: 11309.5. Samples: 150252032. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:18:06,722][1653645] Updated weights for policy 0, policy_version 293439 (0.0113) [2024-06-15 15:18:10,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 601030656. Throughput: 0: 11309.6. Samples: 150319616. 
Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:10,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:18:11,382][1653645] Updated weights for policy 0, policy_version 293490 (0.0011) [2024-06-15 15:18:14,499][1653645] Updated weights for policy 0, policy_version 293552 (0.0011) [2024-06-15 15:18:15,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 44236.6, 300 sec: 44209.0). Total num frames: 601227264. Throughput: 0: 11263.9. Samples: 150385152. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:15,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:18:16,898][1651596] Signal inference workers to stop experience collection... (15200 times) [2024-06-15 15:18:17,029][1653645] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-06-15 15:18:17,101][1651596] Signal inference workers to resume experience collection... (15200 times) [2024-06-15 15:18:17,102][1653645] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-06-15 15:18:17,872][1653645] Updated weights for policy 0, policy_version 293632 (0.0147) [2024-06-15 15:18:20,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.5, 300 sec: 44879.1). Total num frames: 601489408. Throughput: 0: 11207.1. Samples: 150413824. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:18:22,034][1653645] Updated weights for policy 0, policy_version 293698 (0.0032) [2024-06-15 15:18:23,319][1653645] Updated weights for policy 0, policy_version 293753 (0.0016) [2024-06-15 15:18:25,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43691.0, 300 sec: 44320.1). Total num frames: 601620480. Throughput: 0: 11343.7. Samples: 150487552. 
Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:18:26,335][1653645] Updated weights for policy 0, policy_version 293792 (0.0012) [2024-06-15 15:18:28,176][1653645] Updated weights for policy 0, policy_version 293872 (0.0089) [2024-06-15 15:18:29,333][1653645] Updated weights for policy 0, policy_version 293920 (0.0016) [2024-06-15 15:18:30,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 46967.3, 300 sec: 44875.5). Total num frames: 602013696. Throughput: 0: 11229.8. Samples: 150553600. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:30,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 15:18:33,349][1653645] Updated weights for policy 0, policy_version 293972 (0.0016) [2024-06-15 15:18:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 602144768. Throughput: 0: 11298.2. Samples: 150590976. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:18:37,635][1653645] Updated weights for policy 0, policy_version 294018 (0.0012) [2024-06-15 15:18:39,044][1653645] Updated weights for policy 0, policy_version 294096 (0.0014) [2024-06-15 15:18:40,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 602406912. Throughput: 0: 11400.5. Samples: 150658048. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:18:41,527][1653645] Updated weights for policy 0, policy_version 294182 (0.0013) [2024-06-15 15:18:44,916][1653645] Updated weights for policy 0, policy_version 294242 (0.0077) [2024-06-15 15:18:45,555][1653645] Updated weights for policy 0, policy_version 294272 (0.0011) [2024-06-15 15:18:45,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 602669056. 
Throughput: 0: 11173.0. Samples: 150721024. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:45,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:18:50,644][1653645] Updated weights for policy 0, policy_version 294336 (0.0199) [2024-06-15 15:18:50,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.3, 300 sec: 44431.2). Total num frames: 602800128. Throughput: 0: 11332.2. Samples: 150761984. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:18:51,934][1653645] Updated weights for policy 0, policy_version 294396 (0.0013) [2024-06-15 15:18:53,607][1653645] Updated weights for policy 0, policy_version 294461 (0.0014) [2024-06-15 15:18:55,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 44236.6, 300 sec: 44986.5). Total num frames: 603095040. Throughput: 0: 11173.0. Samples: 150822400. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:18:55,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:18:56,612][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000294512_603160576.pth... [2024-06-15 15:18:56,664][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000289216_592314368.pth [2024-06-15 15:18:56,748][1653645] Updated weights for policy 0, policy_version 294514 (0.0011) [2024-06-15 15:18:56,978][1653645] Updated weights for policy 0, policy_version 294525 (0.0008) [2024-06-15 15:19:00,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44237.0, 300 sec: 44764.4). Total num frames: 603226112. Throughput: 0: 11548.5. Samples: 150904832. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:19:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:19:01,389][1651596] Signal inference workers to stop experience collection... 
(15250 times) [2024-06-15 15:19:01,483][1653645] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-06-15 15:19:01,603][1651596] Signal inference workers to resume experience collection... (15250 times) [2024-06-15 15:19:01,603][1653645] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-06-15 15:19:02,211][1653645] Updated weights for policy 0, policy_version 294608 (0.0012) [2024-06-15 15:19:03,820][1653645] Updated weights for policy 0, policy_version 294672 (0.0015) [2024-06-15 15:19:05,958][1648982] Fps is (10 sec: 49153.9, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 603586560. Throughput: 0: 11537.1. Samples: 150932992. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:19:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:19:07,164][1653645] Updated weights for policy 0, policy_version 294723 (0.0013) [2024-06-15 15:19:08,591][1653645] Updated weights for policy 0, policy_version 294783 (0.0029) [2024-06-15 15:19:10,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 44783.1, 300 sec: 44986.6). Total num frames: 603717632. Throughput: 0: 11309.5. Samples: 150996480. Policy #0 lag: (min: 14.0, avg: 124.5, max: 270.0) [2024-06-15 15:19:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:19:12,904][1653645] Updated weights for policy 0, policy_version 294848 (0.0079) [2024-06-15 15:19:13,835][1653645] Updated weights for policy 0, policy_version 294887 (0.0053) [2024-06-15 15:19:15,835][1653645] Updated weights for policy 0, policy_version 294970 (0.0013) [2024-06-15 15:19:15,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 604110848. Throughput: 0: 11434.7. Samples: 151068160. 
Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:19:19,500][1653645] Updated weights for policy 0, policy_version 295024 (0.0017) [2024-06-15 15:19:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.4, 300 sec: 45319.8). Total num frames: 604241920. Throughput: 0: 11423.3. Samples: 151105024. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:19:23,578][1653645] Updated weights for policy 0, policy_version 295081 (0.0014) [2024-06-15 15:19:24,936][1653645] Updated weights for policy 0, policy_version 295120 (0.0013) [2024-06-15 15:19:25,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 47513.5, 300 sec: 44986.6). Total num frames: 604471296. Throughput: 0: 11537.1. Samples: 151177216. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:19:25,993][1653645] Updated weights for policy 0, policy_version 295168 (0.0012) [2024-06-15 15:19:29,590][1653645] Updated weights for policy 0, policy_version 295248 (0.0128) [2024-06-15 15:19:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.4, 300 sec: 45320.2). Total num frames: 604766208. Throughput: 0: 11548.5. Samples: 151240704. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:19:34,113][1653645] Updated weights for policy 0, policy_version 295297 (0.0013) [2024-06-15 15:19:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 604930048. Throughput: 0: 11685.0. Samples: 151287808. 
Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:19:36,197][1653645] Updated weights for policy 0, policy_version 295394 (0.0074) [2024-06-15 15:19:37,610][1653645] Updated weights for policy 0, policy_version 295446 (0.0018) [2024-06-15 15:19:40,476][1653645] Updated weights for policy 0, policy_version 295491 (0.0011) [2024-06-15 15:19:40,958][1648982] Fps is (10 sec: 42596.8, 60 sec: 46421.1, 300 sec: 44987.2). Total num frames: 605192192. Throughput: 0: 11764.6. Samples: 151351808. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:40,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 15:19:41,832][1653645] Updated weights for policy 0, policy_version 295548 (0.0013) [2024-06-15 15:19:45,884][1651596] Signal inference workers to stop experience collection... (15300 times) [2024-06-15 15:19:45,929][1653645] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-06-15 15:19:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.8, 300 sec: 45097.7). Total num frames: 605290496. Throughput: 0: 11537.1. Samples: 151424000. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:19:46,096][1651596] Signal inference workers to resume experience collection... (15300 times) [2024-06-15 15:19:46,097][1653645] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-06-15 15:19:47,046][1653645] Updated weights for policy 0, policy_version 295602 (0.0012) [2024-06-15 15:19:48,219][1653645] Updated weights for policy 0, policy_version 295650 (0.0011) [2024-06-15 15:19:50,078][1653645] Updated weights for policy 0, policy_version 295728 (0.0097) [2024-06-15 15:19:50,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 48059.5, 300 sec: 45430.9). Total num frames: 605683712. Throughput: 0: 11593.8. Samples: 151454720. 
Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:50,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:19:52,713][1653645] Updated weights for policy 0, policy_version 295761 (0.0012) [2024-06-15 15:19:53,755][1653645] Updated weights for policy 0, policy_version 295804 (0.0012) [2024-06-15 15:19:55,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45329.3, 300 sec: 45319.8). Total num frames: 605814784. Throughput: 0: 11685.0. Samples: 151522304. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:19:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:19:58,378][1653645] Updated weights for policy 0, policy_version 295875 (0.0081) [2024-06-15 15:19:59,715][1653645] Updated weights for policy 0, policy_version 295929 (0.0013) [2024-06-15 15:20:00,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 48059.6, 300 sec: 45653.0). Total num frames: 606109696. Throughput: 0: 11594.0. Samples: 151589888. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:20:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:20:01,645][1653645] Updated weights for policy 0, policy_version 295984 (0.0036) [2024-06-15 15:20:04,885][1653645] Updated weights for policy 0, policy_version 296033 (0.0014) [2024-06-15 15:20:05,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 606339072. Throughput: 0: 11525.7. Samples: 151623680. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:20:05,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 15:20:08,552][1653645] Updated weights for policy 0, policy_version 296065 (0.0015) [2024-06-15 15:20:10,094][1653645] Updated weights for policy 0, policy_version 296122 (0.0012) [2024-06-15 15:20:10,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 606502912. Throughput: 0: 11411.9. Samples: 151690752. 
Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:20:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:20:12,850][1653645] Updated weights for policy 0, policy_version 296194 (0.0014) [2024-06-15 15:20:14,090][1653645] Updated weights for policy 0, policy_version 296247 (0.0015) [2024-06-15 15:20:15,832][1653645] Updated weights for policy 0, policy_version 296265 (0.0012) [2024-06-15 15:20:15,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44237.0, 300 sec: 45319.8). Total num frames: 606765056. Throughput: 0: 11491.5. Samples: 151757824. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:20:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:20:20,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43690.7, 300 sec: 45319.9). Total num frames: 606863360. Throughput: 0: 11207.1. Samples: 151792128. Policy #0 lag: (min: 13.0, avg: 78.9, max: 269.0) [2024-06-15 15:20:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:20:20,969][1653645] Updated weights for policy 0, policy_version 296336 (0.0014) [2024-06-15 15:20:22,875][1653645] Updated weights for policy 0, policy_version 296402 (0.0045) [2024-06-15 15:20:25,205][1653645] Updated weights for policy 0, policy_version 296480 (0.0012) [2024-06-15 15:20:25,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45875.3, 300 sec: 45208.7). Total num frames: 607223808. Throughput: 0: 11173.1. Samples: 151854592. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:20:25,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:20:28,066][1653645] Updated weights for policy 0, policy_version 296514 (0.0013) [2024-06-15 15:20:28,720][1651596] Signal inference workers to stop experience collection... (15350 times) [2024-06-15 15:20:28,753][1653645] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-06-15 15:20:28,948][1651596] Signal inference workers to resume experience collection... 
(15350 times) [2024-06-15 15:20:28,949][1653645] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-06-15 15:20:30,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 45319.9). Total num frames: 607387648. Throughput: 0: 10934.1. Samples: 151916032. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:20:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:20:33,982][1653645] Updated weights for policy 0, policy_version 296608 (0.0176) [2024-06-15 15:20:35,310][1653645] Updated weights for policy 0, policy_version 296672 (0.0012) [2024-06-15 15:20:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 45208.8). Total num frames: 607617024. Throughput: 0: 11127.6. Samples: 151955456. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:20:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:20:36,943][1653645] Updated weights for policy 0, policy_version 296707 (0.0013) [2024-06-15 15:20:39,680][1653645] Updated weights for policy 0, policy_version 296769 (0.0012) [2024-06-15 15:20:40,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 607911936. Throughput: 0: 10990.8. Samples: 152016896. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:20:40,959][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 15:20:45,141][1653645] Updated weights for policy 0, policy_version 296834 (0.0123) [2024-06-15 15:20:45,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 607977472. Throughput: 0: 11059.2. Samples: 152087552. 
Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:20:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:20:47,172][1653645] Updated weights for policy 0, policy_version 296912 (0.0083) [2024-06-15 15:20:48,428][1653645] Updated weights for policy 0, policy_version 296960 (0.0014) [2024-06-15 15:20:50,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 43691.0, 300 sec: 45097.7). Total num frames: 608305152. Throughput: 0: 10717.9. Samples: 152105984. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:20:50,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:20:52,921][1653645] Updated weights for policy 0, policy_version 297056 (0.0028) [2024-06-15 15:20:55,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 608436224. Throughput: 0: 10763.4. Samples: 152175104. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:20:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:20:55,996][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000297088_608436224.pth... [2024-06-15 15:20:56,045][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000291840_597688320.pth [2024-06-15 15:20:57,990][1653645] Updated weights for policy 0, policy_version 297104 (0.0103) [2024-06-15 15:20:59,577][1653645] Updated weights for policy 0, policy_version 297168 (0.0012) [2024-06-15 15:21:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 44875.5). Total num frames: 608698368. Throughput: 0: 10740.6. Samples: 152241152. 
Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:21:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:21:01,379][1653645] Updated weights for policy 0, policy_version 297217 (0.0105) [2024-06-15 15:21:03,704][1653645] Updated weights for policy 0, policy_version 297283 (0.0013) [2024-06-15 15:21:05,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.4, 300 sec: 45208.7). Total num frames: 608960512. Throughput: 0: 10740.5. Samples: 152275456. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:21:05,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:21:09,487][1653645] Updated weights for policy 0, policy_version 297352 (0.0013) [2024-06-15 15:21:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 44764.4). Total num frames: 609091584. Throughput: 0: 11036.4. Samples: 152351232. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:21:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:21:11,059][1653645] Updated weights for policy 0, policy_version 297409 (0.0013) [2024-06-15 15:21:12,044][1653645] Updated weights for policy 0, policy_version 297465 (0.0010) [2024-06-15 15:21:13,665][1653645] Updated weights for policy 0, policy_version 297534 (0.0018) [2024-06-15 15:21:14,287][1651596] Signal inference workers to stop experience collection... (15400 times) [2024-06-15 15:21:14,346][1653645] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-06-15 15:21:14,479][1651596] Signal inference workers to resume experience collection... (15400 times) [2024-06-15 15:21:14,480][1653645] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-06-15 15:21:15,491][1653645] Updated weights for policy 0, policy_version 297592 (0.0011) [2024-06-15 15:21:15,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 609484800. Throughput: 0: 11081.9. Samples: 152414720. 
Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:21:15,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:21:20,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 609517568. Throughput: 0: 11070.6. Samples: 152453632. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:21:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:21:21,567][1653645] Updated weights for policy 0, policy_version 297651 (0.0024) [2024-06-15 15:21:22,902][1653645] Updated weights for policy 0, policy_version 297722 (0.0012) [2024-06-15 15:21:25,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 43690.5, 300 sec: 45208.7). Total num frames: 609845248. Throughput: 0: 11207.1. Samples: 152521216. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:21:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:21:26,206][1653645] Updated weights for policy 0, policy_version 297789 (0.0102) [2024-06-15 15:21:27,500][1653645] Updated weights for policy 0, policy_version 297847 (0.0028) [2024-06-15 15:21:30,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 610009088. Throughput: 0: 11047.8. Samples: 152584704. Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:21:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:21:33,461][1653645] Updated weights for policy 0, policy_version 297908 (0.0011) [2024-06-15 15:21:34,950][1653645] Updated weights for policy 0, policy_version 297984 (0.0015) [2024-06-15 15:21:35,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.5, 300 sec: 44875.5). Total num frames: 610271232. Throughput: 0: 11411.8. Samples: 152619520. 
Policy #0 lag: (min: 43.0, avg: 179.3, max: 299.0) [2024-06-15 15:21:35,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 15:21:37,731][1653645] Updated weights for policy 0, policy_version 298035 (0.0039) [2024-06-15 15:21:39,221][1653645] Updated weights for policy 0, policy_version 298103 (0.0011) [2024-06-15 15:21:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.8, 300 sec: 45208.8). Total num frames: 610533376. Throughput: 0: 11275.4. Samples: 152682496. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:21:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:21:44,924][1653645] Updated weights for policy 0, policy_version 298163 (0.0012) [2024-06-15 15:21:45,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45875.1, 300 sec: 45208.7). Total num frames: 610729984. Throughput: 0: 11298.1. Samples: 152749568. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:21:45,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 15:21:46,392][1653645] Updated weights for policy 0, policy_version 298237 (0.0015) [2024-06-15 15:21:49,842][1653645] Updated weights for policy 0, policy_version 298303 (0.0126) [2024-06-15 15:21:50,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 610992128. Throughput: 0: 11332.4. Samples: 152785408. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:21:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:21:51,317][1653645] Updated weights for policy 0, policy_version 298365 (0.0012) [2024-06-15 15:21:55,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 611123200. Throughput: 0: 11161.6. Samples: 152853504. 
Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:21:55,960][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 15:21:56,136][1653645] Updated weights for policy 0, policy_version 298424 (0.0015) [2024-06-15 15:21:56,880][1651596] Signal inference workers to stop experience collection... (15450 times) [2024-06-15 15:21:56,928][1653645] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-06-15 15:21:57,171][1651596] Signal inference workers to resume experience collection... (15450 times) [2024-06-15 15:21:57,172][1653645] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-06-15 15:21:57,429][1653645] Updated weights for policy 0, policy_version 298468 (0.0014) [2024-06-15 15:22:00,344][1653645] Updated weights for policy 0, policy_version 298512 (0.0014) [2024-06-15 15:22:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 611385344. Throughput: 0: 11286.8. Samples: 152922624. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:22:02,627][1653645] Updated weights for policy 0, policy_version 298593 (0.0012) [2024-06-15 15:22:05,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 611581952. Throughput: 0: 10979.6. Samples: 152947712. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:22:07,007][1653645] Updated weights for policy 0, policy_version 298642 (0.0014) [2024-06-15 15:22:07,852][1653645] Updated weights for policy 0, policy_version 298686 (0.0013) [2024-06-15 15:22:09,403][1653645] Updated weights for policy 0, policy_version 298736 (0.0010) [2024-06-15 15:22:10,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 611844096. Throughput: 0: 11059.2. Samples: 153018880. 
Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:10,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:22:13,965][1653645] Updated weights for policy 0, policy_version 298818 (0.0106) [2024-06-15 15:22:15,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 612106240. Throughput: 0: 11138.8. Samples: 153085952. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:22:17,808][1653645] Updated weights for policy 0, policy_version 298882 (0.0013) [2024-06-15 15:22:20,120][1653645] Updated weights for policy 0, policy_version 298945 (0.0019) [2024-06-15 15:22:20,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 46421.1, 300 sec: 45097.7). Total num frames: 612302848. Throughput: 0: 11195.7. Samples: 153123328. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:22:24,517][1653645] Updated weights for policy 0, policy_version 299030 (0.0270) [2024-06-15 15:22:25,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44783.1, 300 sec: 45208.7). Total num frames: 612532224. Throughput: 0: 11275.4. Samples: 153189888. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:22:26,164][1653645] Updated weights for policy 0, policy_version 299104 (0.0013) [2024-06-15 15:22:30,592][1653645] Updated weights for policy 0, policy_version 299171 (0.0013) [2024-06-15 15:22:30,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 45328.9, 300 sec: 45097.6). Total num frames: 612728832. Throughput: 0: 11195.7. Samples: 153253376. 
Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:30,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 15:22:32,618][1653645] Updated weights for policy 0, policy_version 299201 (0.0012) [2024-06-15 15:22:33,776][1653645] Updated weights for policy 0, policy_version 299258 (0.0014) [2024-06-15 15:22:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44237.0, 300 sec: 44986.6). Total num frames: 612925440. Throughput: 0: 11218.5. Samples: 153290240. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:22:36,673][1653645] Updated weights for policy 0, policy_version 299319 (0.0196) [2024-06-15 15:22:37,962][1653645] Updated weights for policy 0, policy_version 299380 (0.0031) [2024-06-15 15:22:40,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 613187584. Throughput: 0: 11241.2. Samples: 153359360. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:22:41,173][1651596] Signal inference workers to stop experience collection... (15500 times) [2024-06-15 15:22:41,218][1653645] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-06-15 15:22:41,540][1651596] Signal inference workers to resume experience collection... (15500 times) [2024-06-15 15:22:41,541][1653645] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-06-15 15:22:41,543][1653645] Updated weights for policy 0, policy_version 299440 (0.0013) [2024-06-15 15:22:45,191][1653645] Updated weights for policy 0, policy_version 299488 (0.0011) [2024-06-15 15:22:45,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44783.1, 300 sec: 45208.8). Total num frames: 613416960. Throughput: 0: 11229.9. Samples: 153427968. 
Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:45,960][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 15:22:46,749][1653645] Updated weights for policy 0, policy_version 299524 (0.0013) [2024-06-15 15:22:47,988][1653645] Updated weights for policy 0, policy_version 299580 (0.0012) [2024-06-15 15:22:49,456][1653645] Updated weights for policy 0, policy_version 299643 (0.0027) [2024-06-15 15:22:50,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 613679104. Throughput: 0: 11332.3. Samples: 153457664. Policy #0 lag: (min: 79.0, avg: 205.4, max: 311.0) [2024-06-15 15:22:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:22:52,608][1653645] Updated weights for policy 0, policy_version 299684 (0.0017) [2024-06-15 15:22:55,959][1648982] Fps is (10 sec: 39318.1, 60 sec: 44782.3, 300 sec: 44875.4). Total num frames: 613810176. Throughput: 0: 11263.8. Samples: 153525760. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:22:55,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:22:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000299712_613810176.pth... [2024-06-15 15:22:56,004][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000294512_603160576.pth [2024-06-15 15:22:57,382][1653645] Updated weights for policy 0, policy_version 299744 (0.0013) [2024-06-15 15:22:58,616][1653645] Updated weights for policy 0, policy_version 299795 (0.0011) [2024-06-15 15:22:59,383][1653645] Updated weights for policy 0, policy_version 299839 (0.0013) [2024-06-15 15:23:00,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 46421.4, 300 sec: 45097.6). Total num frames: 614170624. Throughput: 0: 11355.0. Samples: 153596928. 
Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:23:01,051][1653645] Updated weights for policy 0, policy_version 299896 (0.0013) [2024-06-15 15:23:03,360][1653645] Updated weights for policy 0, policy_version 299938 (0.0017) [2024-06-15 15:23:05,958][1648982] Fps is (10 sec: 52433.3, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 614334464. Throughput: 0: 11332.3. Samples: 153633280. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:23:08,243][1653645] Updated weights for policy 0, policy_version 299984 (0.0014) [2024-06-15 15:23:10,302][1653645] Updated weights for policy 0, policy_version 300068 (0.0100) [2024-06-15 15:23:10,958][1648982] Fps is (10 sec: 42596.8, 60 sec: 45875.0, 300 sec: 45319.8). Total num frames: 614596608. Throughput: 0: 11400.5. Samples: 153702912. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:10,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:23:11,586][1653645] Updated weights for policy 0, policy_version 300128 (0.0085) [2024-06-15 15:23:12,246][1653645] Updated weights for policy 0, policy_version 300156 (0.0011) [2024-06-15 15:23:15,234][1653645] Updated weights for policy 0, policy_version 300210 (0.0015) [2024-06-15 15:23:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 614858752. Throughput: 0: 11514.3. Samples: 153771520. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:23:20,176][1653645] Updated weights for policy 0, policy_version 300261 (0.0115) [2024-06-15 15:23:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 614989824. Throughput: 0: 11525.6. Samples: 153808896. 
Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:20,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:23:21,843][1653645] Updated weights for policy 0, policy_version 300321 (0.0012) [2024-06-15 15:23:23,770][1653645] Updated weights for policy 0, policy_version 300384 (0.0115) [2024-06-15 15:23:23,815][1651596] Signal inference workers to stop experience collection... (15550 times) [2024-06-15 15:23:23,893][1653645] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-06-15 15:23:24,059][1651596] Signal inference workers to resume experience collection... (15550 times) [2024-06-15 15:23:24,059][1653645] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-06-15 15:23:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 615251968. Throughput: 0: 11138.8. Samples: 153860608. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:23:28,216][1653645] Updated weights for policy 0, policy_version 300479 (0.0014) [2024-06-15 15:23:30,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 615383040. Throughput: 0: 11229.9. Samples: 153933312. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:23:33,568][1653645] Updated weights for policy 0, policy_version 300544 (0.0012) [2024-06-15 15:23:35,028][1653645] Updated weights for policy 0, policy_version 300606 (0.0011) [2024-06-15 15:23:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 615645184. Throughput: 0: 11184.4. Samples: 153960960. 
Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:23:36,931][1653645] Updated weights for policy 0, policy_version 300672 (0.0012) [2024-06-15 15:23:39,703][1653645] Updated weights for policy 0, policy_version 300732 (0.0033) [2024-06-15 15:23:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 615907328. Throughput: 0: 11104.9. Samples: 154025472. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:23:44,403][1653645] Updated weights for policy 0, policy_version 300795 (0.0015) [2024-06-15 15:23:45,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 44782.7, 300 sec: 45097.6). Total num frames: 616103936. Throughput: 0: 11172.9. Samples: 154099712. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:23:46,561][1653645] Updated weights for policy 0, policy_version 300862 (0.0047) [2024-06-15 15:23:50,443][1653645] Updated weights for policy 0, policy_version 300930 (0.0014) [2024-06-15 15:23:50,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 616333312. Throughput: 0: 10934.0. Samples: 154125312. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:23:55,162][1653645] Updated weights for policy 0, policy_version 301008 (0.0013) [2024-06-15 15:23:55,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44783.5, 300 sec: 44986.6). Total num frames: 616497152. Throughput: 0: 11104.8. Samples: 154202624. 
Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:23:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:23:56,374][1653645] Updated weights for policy 0, policy_version 301054 (0.0011) [2024-06-15 15:23:58,082][1653645] Updated weights for policy 0, policy_version 301113 (0.0176) [2024-06-15 15:23:59,827][1653645] Updated weights for policy 0, policy_version 301172 (0.0012) [2024-06-15 15:24:00,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 616824832. Throughput: 0: 10899.9. Samples: 154262016. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:24:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:24:03,149][1653645] Updated weights for policy 0, policy_version 301240 (0.0012) [2024-06-15 15:24:05,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 616955904. Throughput: 0: 10843.1. Samples: 154296832. Policy #0 lag: (min: 38.0, avg: 158.8, max: 294.0) [2024-06-15 15:24:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:24:07,037][1653645] Updated weights for policy 0, policy_version 301284 (0.0011) [2024-06-15 15:24:07,664][1653645] Updated weights for policy 0, policy_version 301309 (0.0012) [2024-06-15 15:24:10,203][1653645] Updated weights for policy 0, policy_version 301381 (0.0019) [2024-06-15 15:24:10,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 44783.1, 300 sec: 44653.4). Total num frames: 617283584. Throughput: 0: 11366.4. Samples: 154372096. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:10,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:24:12,824][1651596] Signal inference workers to stop experience collection... (15600 times) [2024-06-15 15:24:12,845][1653645] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-06-15 15:24:13,048][1651596] Signal inference workers to resume experience collection... 
(15600 times) [2024-06-15 15:24:13,049][1653645] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-06-15 15:24:13,227][1653645] Updated weights for policy 0, policy_version 301457 (0.0012) [2024-06-15 15:24:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 617480192. Throughput: 0: 11264.0. Samples: 154440192. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:24:17,636][1653645] Updated weights for policy 0, policy_version 301520 (0.0012) [2024-06-15 15:24:18,852][1653645] Updated weights for policy 0, policy_version 301566 (0.0012) [2024-06-15 15:24:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44783.1, 300 sec: 44764.4). Total num frames: 617676800. Throughput: 0: 11446.0. Samples: 154476032. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:24:21,566][1653645] Updated weights for policy 0, policy_version 301634 (0.0013) [2024-06-15 15:24:24,538][1653645] Updated weights for policy 0, policy_version 301697 (0.0013) [2024-06-15 15:24:25,931][1653645] Updated weights for policy 0, policy_version 301757 (0.0015) [2024-06-15 15:24:25,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45329.2, 300 sec: 44764.4). Total num frames: 617971712. Throughput: 0: 11503.0. Samples: 154543104. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:24:30,272][1653645] Updated weights for policy 0, policy_version 301797 (0.0030) [2024-06-15 15:24:30,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 45875.3, 300 sec: 44764.4). Total num frames: 618135552. Throughput: 0: 11332.3. Samples: 154609664. 
Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:24:32,993][1653645] Updated weights for policy 0, policy_version 301872 (0.0013) [2024-06-15 15:24:34,679][1653645] Updated weights for policy 0, policy_version 301952 (0.0024) [2024-06-15 15:24:35,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45875.2, 300 sec: 44764.5). Total num frames: 618397696. Throughput: 0: 11389.1. Samples: 154637824. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:24:38,291][1653645] Updated weights for policy 0, policy_version 302006 (0.0013) [2024-06-15 15:24:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 618528768. Throughput: 0: 11229.9. Samples: 154707968. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:24:42,009][1653645] Updated weights for policy 0, policy_version 302064 (0.0013) [2024-06-15 15:24:44,708][1653645] Updated weights for policy 0, policy_version 302130 (0.0013) [2024-06-15 15:24:45,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 45875.3, 300 sec: 44653.4). Total num frames: 618856448. Throughput: 0: 11264.0. Samples: 154768896. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:45,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:24:46,104][1653645] Updated weights for policy 0, policy_version 302197 (0.0112) [2024-06-15 15:24:49,689][1653645] Updated weights for policy 0, policy_version 302241 (0.0095) [2024-06-15 15:24:50,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 619053056. Throughput: 0: 11377.8. Samples: 154808832. 
Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:24:53,052][1653645] Updated weights for policy 0, policy_version 302288 (0.0012) [2024-06-15 15:24:55,826][1653645] Updated weights for policy 0, policy_version 302368 (0.0015) [2024-06-15 15:24:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.1, 300 sec: 44542.2). Total num frames: 619249664. Throughput: 0: 11161.6. Samples: 154874368. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:24:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:24:56,164][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000302384_619282432.pth... [2024-06-15 15:24:56,233][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000297088_608436224.pth [2024-06-15 15:24:56,840][1653645] Updated weights for policy 0, policy_version 302417 (0.0013) [2024-06-15 15:24:57,243][1651596] Signal inference workers to stop experience collection... (15650 times) [2024-06-15 15:24:57,363][1653645] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-06-15 15:24:57,507][1651596] Signal inference workers to resume experience collection... (15650 times) [2024-06-15 15:24:57,508][1653645] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-06-15 15:24:57,807][1653645] Updated weights for policy 0, policy_version 302458 (0.0020) [2024-06-15 15:25:00,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 619479040. Throughput: 0: 11286.8. Samples: 154948096. 
Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:25:00,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:25:01,665][1653645] Updated weights for policy 0, policy_version 302523 (0.0037) [2024-06-15 15:25:05,068][1653645] Updated weights for policy 0, policy_version 302586 (0.0013) [2024-06-15 15:25:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45875.0, 300 sec: 44764.4). Total num frames: 619708416. Throughput: 0: 11161.6. Samples: 154978304. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:25:05,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:25:08,410][1653645] Updated weights for policy 0, policy_version 302640 (0.0014) [2024-06-15 15:25:10,061][1653645] Updated weights for policy 0, policy_version 302704 (0.0020) [2024-06-15 15:25:10,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 619970560. Throughput: 0: 11047.8. Samples: 155040256. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:25:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:25:13,551][1653645] Updated weights for policy 0, policy_version 302777 (0.0013) [2024-06-15 15:25:15,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 620101632. Throughput: 0: 10945.4. Samples: 155102208. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 15:25:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:25:17,193][1653645] Updated weights for policy 0, policy_version 302842 (0.0013) [2024-06-15 15:25:20,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 620298240. Throughput: 0: 11104.7. Samples: 155137536. 
Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:25:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:25:21,004][1653645] Updated weights for policy 0, policy_version 302882 (0.0053) [2024-06-15 15:25:23,280][1653645] Updated weights for policy 0, policy_version 302973 (0.0106) [2024-06-15 15:25:25,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43144.3, 300 sec: 44653.3). Total num frames: 620560384. Throughput: 0: 10831.6. Samples: 155195392. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:25:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:25:26,479][1653645] Updated weights for policy 0, policy_version 303035 (0.0013) [2024-06-15 15:25:30,047][1653645] Updated weights for policy 0, policy_version 303100 (0.0011) [2024-06-15 15:25:30,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 620756992. Throughput: 0: 10945.5. Samples: 155261440. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:25:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:25:33,654][1653645] Updated weights for policy 0, policy_version 303168 (0.0070) [2024-06-15 15:25:35,298][1653645] Updated weights for policy 0, policy_version 303232 (0.0010) [2024-06-15 15:25:35,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 621019136. Throughput: 0: 10752.0. Samples: 155292672. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:25:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:25:40,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 621150208. Throughput: 0: 10683.8. Samples: 155355136. 
Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:25:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:25:41,170][1653645] Updated weights for policy 0, policy_version 303297 (0.0075) [2024-06-15 15:25:44,302][1653645] Updated weights for policy 0, policy_version 303376 (0.0013) [2024-06-15 15:25:45,797][1651596] Signal inference workers to stop experience collection... (15700 times) [2024-06-15 15:25:45,826][1653645] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-06-15 15:25:45,974][1648982] Fps is (10 sec: 39290.1, 60 sec: 42592.8, 300 sec: 44430.0). Total num frames: 621412352. Throughput: 0: 10477.0. Samples: 155419648. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:25:45,988][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:25:46,132][1651596] Signal inference workers to resume experience collection... (15700 times) [2024-06-15 15:25:46,133][1653645] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-06-15 15:25:46,341][1653645] Updated weights for policy 0, policy_version 303444 (0.0014) [2024-06-15 15:25:47,176][1653645] Updated weights for policy 0, policy_version 303487 (0.0011) [2024-06-15 15:25:50,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 621674496. Throughput: 0: 10615.5. Samples: 155456000. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:25:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:25:53,307][1653645] Updated weights for policy 0, policy_version 303569 (0.0016) [2024-06-15 15:25:54,305][1653645] Updated weights for policy 0, policy_version 303614 (0.0028) [2024-06-15 15:25:55,958][1648982] Fps is (10 sec: 42632.6, 60 sec: 43144.7, 300 sec: 44542.3). Total num frames: 621838336. Throughput: 0: 10820.3. Samples: 155527168. 
Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:25:55,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:25:56,649][1653645] Updated weights for policy 0, policy_version 303670 (0.0025) [2024-06-15 15:25:58,164][1653645] Updated weights for policy 0, policy_version 303728 (0.0014) [2024-06-15 15:26:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 622067712. Throughput: 0: 10934.0. Samples: 155594240. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:26:00,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:26:01,136][1653645] Updated weights for policy 0, policy_version 303760 (0.0013) [2024-06-15 15:26:02,237][1653645] Updated weights for policy 0, policy_version 303807 (0.0012) [2024-06-15 15:26:05,452][1653645] Updated weights for policy 0, policy_version 303871 (0.0012) [2024-06-15 15:26:05,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 622329856. Throughput: 0: 10945.4. Samples: 155630080. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:26:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:26:08,601][1653645] Updated weights for policy 0, policy_version 303941 (0.0036) [2024-06-15 15:26:09,797][1653645] Updated weights for policy 0, policy_version 303995 (0.0012) [2024-06-15 15:26:10,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 622592000. Throughput: 0: 11127.5. Samples: 155696128. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:26:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:26:13,294][1653645] Updated weights for policy 0, policy_version 304032 (0.0131) [2024-06-15 15:26:15,781][1653645] Updated weights for policy 0, policy_version 304096 (0.0012) [2024-06-15 15:26:15,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 622788608. 
Throughput: 0: 11377.7. Samples: 155773440. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:26:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:26:18,083][1653645] Updated weights for policy 0, policy_version 304145 (0.0012) [2024-06-15 15:26:18,970][1653645] Updated weights for policy 0, policy_version 304191 (0.0013) [2024-06-15 15:26:20,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 45875.3, 300 sec: 44764.5). Total num frames: 623050752. Throughput: 0: 11343.7. Samples: 155803136. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:26:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:26:21,145][1653645] Updated weights for policy 0, policy_version 304240 (0.0013) [2024-06-15 15:26:25,165][1653645] Updated weights for policy 0, policy_version 304304 (0.0014) [2024-06-15 15:26:25,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 623247360. Throughput: 0: 11457.4. Samples: 155870720. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:26:25,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:26:27,957][1653645] Updated weights for policy 0, policy_version 304354 (0.0012) [2024-06-15 15:26:30,890][1653645] Updated weights for policy 0, policy_version 304418 (0.0011) [2024-06-15 15:26:30,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 623443968. Throughput: 0: 11448.1. Samples: 155934720. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:26:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:26:32,606][1651596] Signal inference workers to stop experience collection... (15750 times) [2024-06-15 15:26:32,732][1653645] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-06-15 15:26:32,763][1651596] Signal inference workers to resume experience collection... 
(15750 times) [2024-06-15 15:26:32,765][1653645] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-06-15 15:26:32,863][1653645] Updated weights for policy 0, policy_version 304480 (0.0095) [2024-06-15 15:26:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 623640576. Throughput: 0: 11332.2. Samples: 155965952. Policy #0 lag: (min: 12.0, avg: 97.9, max: 268.0) [2024-06-15 15:26:35,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:26:36,405][1653645] Updated weights for policy 0, policy_version 304544 (0.0011) [2024-06-15 15:26:37,037][1653645] Updated weights for policy 0, policy_version 304576 (0.0011) [2024-06-15 15:26:40,120][1653645] Updated weights for policy 0, policy_version 304634 (0.0015) [2024-06-15 15:26:40,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 44653.4). Total num frames: 623902720. Throughput: 0: 11434.7. Samples: 156041728. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:26:40,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:26:42,813][1653645] Updated weights for policy 0, policy_version 304676 (0.0017) [2024-06-15 15:26:44,623][1653645] Updated weights for policy 0, policy_version 304768 (0.0014) [2024-06-15 15:26:45,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 45881.4, 300 sec: 44653.3). Total num frames: 624164864. Throughput: 0: 11286.8. Samples: 156102144. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:26:45,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 15:26:48,670][1653645] Updated weights for policy 0, policy_version 304830 (0.0012) [2024-06-15 15:26:50,532][1653645] Updated weights for policy 0, policy_version 304867 (0.0011) [2024-06-15 15:26:50,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 624394240. Throughput: 0: 11343.6. Samples: 156140544. 
Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:26:50,958][1648982] Avg episode reward: [(0, '36.810')] [2024-06-15 15:26:54,167][1653645] Updated weights for policy 0, policy_version 304928 (0.0124) [2024-06-15 15:26:55,633][1653645] Updated weights for policy 0, policy_version 304983 (0.0012) [2024-06-15 15:26:55,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 624623616. Throughput: 0: 11355.0. Samples: 156207104. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:26:55,958][1648982] Avg episode reward: [(0, '36.490')] [2024-06-15 15:26:56,252][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000305024_624689152.pth... [2024-06-15 15:26:56,366][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000299712_613810176.pth [2024-06-15 15:26:58,528][1653645] Updated weights for policy 0, policy_version 305030 (0.0022) [2024-06-15 15:27:00,901][1653645] Updated weights for policy 0, policy_version 305094 (0.0014) [2024-06-15 15:27:00,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 45875.0, 300 sec: 44875.4). Total num frames: 624820224. Throughput: 0: 11320.8. Samples: 156282880. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:00,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:27:02,245][1653645] Updated weights for policy 0, policy_version 305152 (0.0013) [2024-06-15 15:27:05,615][1653645] Updated weights for policy 0, policy_version 305212 (0.0014) [2024-06-15 15:27:05,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 625082368. Throughput: 0: 11468.8. Samples: 156319232. 
Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:27:07,222][1653645] Updated weights for policy 0, policy_version 305269 (0.0014) [2024-06-15 15:27:10,306][1653645] Updated weights for policy 0, policy_version 305312 (0.0013) [2024-06-15 15:27:10,958][1648982] Fps is (10 sec: 49153.9, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 625311744. Throughput: 0: 11400.6. Samples: 156383744. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:27:13,992][1653645] Updated weights for policy 0, policy_version 305392 (0.0013) [2024-06-15 15:27:15,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 625475584. Throughput: 0: 11423.3. Samples: 156448768. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:27:17,903][1653645] Updated weights for policy 0, policy_version 305456 (0.0014) [2024-06-15 15:27:18,680][1651596] Signal inference workers to stop experience collection... (15800 times) [2024-06-15 15:27:18,836][1653645] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-06-15 15:27:18,954][1651596] Signal inference workers to resume experience collection... (15800 times) [2024-06-15 15:27:18,955][1653645] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-06-15 15:27:20,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 625737728. Throughput: 0: 11411.9. Samples: 156479488. 
Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:20,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:27:21,903][1653645] Updated weights for policy 0, policy_version 305552 (0.0115) [2024-06-15 15:27:22,906][1653645] Updated weights for policy 0, policy_version 305595 (0.0011) [2024-06-15 15:27:25,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 625967104. Throughput: 0: 11218.4. Samples: 156546560. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:25,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:27:26,073][1653645] Updated weights for policy 0, policy_version 305656 (0.0012) [2024-06-15 15:27:29,374][1653645] Updated weights for policy 0, policy_version 305721 (0.0011) [2024-06-15 15:27:30,581][1653645] Updated weights for policy 0, policy_version 305764 (0.0012) [2024-06-15 15:27:30,959][1648982] Fps is (10 sec: 49148.1, 60 sec: 46420.7, 300 sec: 45097.5). Total num frames: 626229248. Throughput: 0: 11423.1. Samples: 156616192. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:30,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:27:32,314][1653645] Updated weights for policy 0, policy_version 305809 (0.0013) [2024-06-15 15:27:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 626393088. Throughput: 0: 11229.8. Samples: 156645888. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:35,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:27:36,544][1653645] Updated weights for policy 0, policy_version 305888 (0.0034) [2024-06-15 15:27:40,718][1653645] Updated weights for policy 0, policy_version 305936 (0.0016) [2024-06-15 15:27:40,958][1648982] Fps is (10 sec: 32770.9, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 626556928. Throughput: 0: 11571.2. Samples: 156727808. 
Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:27:43,147][1653645] Updated weights for policy 0, policy_version 306037 (0.0135) [2024-06-15 15:27:44,023][1653645] Updated weights for policy 0, policy_version 306071 (0.0014) [2024-06-15 15:27:45,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 626917376. Throughput: 0: 11173.1. Samples: 156785664. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:45,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 15:27:47,098][1653645] Updated weights for policy 0, policy_version 306113 (0.0011) [2024-06-15 15:27:50,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44236.8, 300 sec: 44875.6). Total num frames: 627048448. Throughput: 0: 11116.0. Samples: 156819456. Policy #0 lag: (min: 15.0, avg: 126.2, max: 271.0) [2024-06-15 15:27:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:27:52,466][1653645] Updated weights for policy 0, policy_version 306192 (0.0015) [2024-06-15 15:27:54,239][1653645] Updated weights for policy 0, policy_version 306258 (0.0018) [2024-06-15 15:27:55,679][1653645] Updated weights for policy 0, policy_version 306323 (0.0012) [2024-06-15 15:27:55,958][1648982] Fps is (10 sec: 45873.9, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 627376128. Throughput: 0: 11298.0. Samples: 156892160. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:27:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:27:59,106][1653645] Updated weights for policy 0, policy_version 306385 (0.0013) [2024-06-15 15:28:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 627572736. Throughput: 0: 11298.2. Samples: 156957184. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:28:03,876][1653645] Updated weights for policy 0, policy_version 306434 (0.0024) [2024-06-15 15:28:04,611][1651596] Signal inference workers to stop experience collection... (15850 times) [2024-06-15 15:28:04,680][1653645] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-06-15 15:28:04,986][1651596] Signal inference workers to resume experience collection... (15850 times) [2024-06-15 15:28:04,987][1653645] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-06-15 15:28:05,774][1653645] Updated weights for policy 0, policy_version 306515 (0.0013) [2024-06-15 15:28:05,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 627769344. Throughput: 0: 11503.0. Samples: 156997120. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:28:07,213][1653645] Updated weights for policy 0, policy_version 306576 (0.0013) [2024-06-15 15:28:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 627965952. Throughput: 0: 11184.4. Samples: 157049856. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:28:11,799][1653645] Updated weights for policy 0, policy_version 306641 (0.0012) [2024-06-15 15:28:12,773][1653645] Updated weights for policy 0, policy_version 306685 (0.0011) [2024-06-15 15:28:15,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 628097024. Throughput: 0: 11207.4. Samples: 157120512. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:15,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:28:18,036][1653645] Updated weights for policy 0, policy_version 306753 (0.0014) [2024-06-15 15:28:19,678][1653645] Updated weights for policy 0, policy_version 306817 (0.0015) [2024-06-15 15:28:20,645][1653645] Updated weights for policy 0, policy_version 306867 (0.0013) [2024-06-15 15:28:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 628490240. Throughput: 0: 11138.9. Samples: 157147136. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:28:24,346][1653645] Updated weights for policy 0, policy_version 306928 (0.0064) [2024-06-15 15:28:25,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 628621312. Throughput: 0: 10729.2. Samples: 157210624. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:28:30,249][1653645] Updated weights for policy 0, policy_version 307008 (0.0012) [2024-06-15 15:28:30,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43145.1, 300 sec: 44653.3). Total num frames: 628817920. Throughput: 0: 10956.8. Samples: 157278720. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:30,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:28:31,619][1653645] Updated weights for policy 0, policy_version 307072 (0.0013) [2024-06-15 15:28:33,065][1653645] Updated weights for policy 0, policy_version 307130 (0.0016) [2024-06-15 15:28:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 629047296. Throughput: 0: 10843.0. Samples: 157307392. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:28:36,321][1653645] Updated weights for policy 0, policy_version 307184 (0.0028) [2024-06-15 15:28:40,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43144.4, 300 sec: 44209.0). Total num frames: 629145600. Throughput: 0: 10843.0. Samples: 157380096. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:40,960][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:28:41,934][1653645] Updated weights for policy 0, policy_version 307252 (0.0013) [2024-06-15 15:28:43,055][1653645] Updated weights for policy 0, policy_version 307301 (0.0013) [2024-06-15 15:28:44,494][1651596] Signal inference workers to stop experience collection... (15900 times) [2024-06-15 15:28:44,560][1653645] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-06-15 15:28:44,830][1651596] Signal inference workers to resume experience collection... (15900 times) [2024-06-15 15:28:44,831][1653645] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-06-15 15:28:44,832][1653645] Updated weights for policy 0, policy_version 307376 (0.0068) [2024-06-15 15:28:45,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 629538816. Throughput: 0: 10752.0. Samples: 157441024. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:28:46,928][1653645] Updated weights for policy 0, policy_version 307411 (0.0011) [2024-06-15 15:28:50,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.5, 300 sec: 44653.3). Total num frames: 629669888. Throughput: 0: 10604.0. Samples: 157474304. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:50,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:28:52,840][1653645] Updated weights for policy 0, policy_version 307457 (0.0014) [2024-06-15 15:28:54,834][1653645] Updated weights for policy 0, policy_version 307552 (0.0011) [2024-06-15 15:28:55,958][1648982] Fps is (10 sec: 42596.7, 60 sec: 43144.5, 300 sec: 44542.2). Total num frames: 629964800. Throughput: 0: 11036.4. Samples: 157546496. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:28:55,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 15:28:56,038][1653645] Updated weights for policy 0, policy_version 307604 (0.0012) [2024-06-15 15:28:56,282][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000307616_629997568.pth... [2024-06-15 15:28:56,384][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000302384_619282432.pth [2024-06-15 15:28:58,670][1653645] Updated weights for policy 0, policy_version 307651 (0.0013) [2024-06-15 15:28:59,631][1653645] Updated weights for policy 0, policy_version 307706 (0.0014) [2024-06-15 15:29:00,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.5, 300 sec: 44875.4). Total num frames: 630194176. Throughput: 0: 10854.3. Samples: 157608960. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:29:00,960][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 15:29:05,634][1653645] Updated weights for policy 0, policy_version 307760 (0.0011) [2024-06-15 15:29:05,958][1648982] Fps is (10 sec: 32769.2, 60 sec: 42052.2, 300 sec: 44098.0). Total num frames: 630292480. Throughput: 0: 11161.6. Samples: 157649408. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 15:29:05,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 15:29:07,682][1653645] Updated weights for policy 0, policy_version 307843 (0.0010) [2024-06-15 15:29:08,885][1653645] Updated weights for policy 0, policy_version 307904 (0.0141) [2024-06-15 15:29:10,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 630587392. Throughput: 0: 11036.4. Samples: 157707264. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:10,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 15:29:12,080][1653645] Updated weights for policy 0, policy_version 307962 (0.0124) [2024-06-15 15:29:15,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 630718464. Throughput: 0: 11116.1. Samples: 157778944. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:15,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:29:17,181][1653645] Updated weights for policy 0, policy_version 308016 (0.0014) [2024-06-15 15:29:19,576][1653645] Updated weights for policy 0, policy_version 308112 (0.0132) [2024-06-15 15:29:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.6, 300 sec: 44542.2). Total num frames: 631111680. Throughput: 0: 11093.3. Samples: 157806592. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:20,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 15:29:23,660][1653645] Updated weights for policy 0, policy_version 308192 (0.0041) [2024-06-15 15:29:25,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 631242752. Throughput: 0: 10706.5. Samples: 157861888. 
Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:25,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:29:28,816][1653645] Updated weights for policy 0, policy_version 308225 (0.0012) [2024-06-15 15:29:30,449][1651596] Signal inference workers to stop experience collection... (15950 times) [2024-06-15 15:29:30,500][1653645] Updated weights for policy 0, policy_version 308292 (0.0013) [2024-06-15 15:29:30,528][1653645] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-06-15 15:29:30,666][1651596] Signal inference workers to resume experience collection... (15950 times) [2024-06-15 15:29:30,667][1653645] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-06-15 15:29:30,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 43144.6, 300 sec: 44098.0). Total num frames: 631406592. Throughput: 0: 11013.7. Samples: 157936640. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:30,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 15:29:32,482][1653645] Updated weights for policy 0, policy_version 308384 (0.0012) [2024-06-15 15:29:34,846][1653645] Updated weights for policy 0, policy_version 308433 (0.0013) [2024-06-15 15:29:35,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 631767040. Throughput: 0: 10888.6. Samples: 157964288. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:29:39,954][1653645] Updated weights for policy 0, policy_version 308482 (0.0012) [2024-06-15 15:29:40,946][1653645] Updated weights for policy 0, policy_version 308543 (0.0012) [2024-06-15 15:29:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45329.2, 300 sec: 44098.0). Total num frames: 631865344. Throughput: 0: 11002.4. Samples: 158041600. 
Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:40,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:29:43,397][1653645] Updated weights for policy 0, policy_version 308624 (0.0012) [2024-06-15 15:29:45,958][1648982] Fps is (10 sec: 39320.0, 60 sec: 43690.3, 300 sec: 44431.1). Total num frames: 632160256. Throughput: 0: 10945.4. Samples: 158101504. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:29:46,934][1653645] Updated weights for policy 0, policy_version 308704 (0.0015) [2024-06-15 15:29:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 632291328. Throughput: 0: 10831.6. Samples: 158136832. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:29:51,772][1653645] Updated weights for policy 0, policy_version 308755 (0.0012) [2024-06-15 15:29:53,701][1653645] Updated weights for policy 0, policy_version 308817 (0.0024) [2024-06-15 15:29:55,880][1653645] Updated weights for policy 0, policy_version 308912 (0.0153) [2024-06-15 15:29:55,960][1648982] Fps is (10 sec: 49153.9, 60 sec: 44783.2, 300 sec: 44653.3). Total num frames: 632651776. Throughput: 0: 11138.9. Samples: 158208512. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:29:55,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:29:58,021][1653645] Updated weights for policy 0, policy_version 308968 (0.0017) [2024-06-15 15:30:00,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 632815616. Throughput: 0: 10945.4. Samples: 158271488. 
Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:30:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:30:03,592][1653645] Updated weights for policy 0, policy_version 309010 (0.0013) [2024-06-15 15:30:05,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 44783.0, 300 sec: 44098.0). Total num frames: 632979456. Throughput: 0: 11355.0. Samples: 158317568. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:30:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:30:05,967][1653645] Updated weights for policy 0, policy_version 309088 (0.0013) [2024-06-15 15:30:07,871][1653645] Updated weights for policy 0, policy_version 309156 (0.0026) [2024-06-15 15:30:08,741][1653645] Updated weights for policy 0, policy_version 309200 (0.0014) [2024-06-15 15:30:09,536][1653645] Updated weights for policy 0, policy_version 309240 (0.0013) [2024-06-15 15:30:10,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 45875.0, 300 sec: 44875.4). Total num frames: 633339904. Throughput: 0: 11252.5. Samples: 158368256. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:30:10,959][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 15:30:14,682][1651596] Signal inference workers to stop experience collection... (16000 times) [2024-06-15 15:30:14,720][1653645] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-06-15 15:30:14,863][1651596] Signal inference workers to resume experience collection... (16000 times) [2024-06-15 15:30:14,864][1653645] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-06-15 15:30:15,821][1653645] Updated weights for policy 0, policy_version 309309 (0.0014) [2024-06-15 15:30:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 633470976. Throughput: 0: 11411.9. Samples: 158450176. 
Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:30:15,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:30:19,221][1653645] Updated weights for policy 0, policy_version 309392 (0.0013) [2024-06-15 15:30:20,958][1648982] Fps is (10 sec: 42599.9, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 633765888. Throughput: 0: 11423.3. Samples: 158478336. Policy #0 lag: (min: 95.0, avg: 147.6, max: 335.0) [2024-06-15 15:30:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 15:30:21,109][1653645] Updated weights for policy 0, policy_version 309474 (0.0095) [2024-06-15 15:30:25,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 633864192. Throughput: 0: 11093.3. Samples: 158540800. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:30:25,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 15:30:26,924][1653645] Updated weights for policy 0, policy_version 309506 (0.0015) [2024-06-15 15:30:28,459][1653645] Updated weights for policy 0, policy_version 309565 (0.0013) [2024-06-15 15:30:30,653][1653645] Updated weights for policy 0, policy_version 309616 (0.0014) [2024-06-15 15:30:30,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 634093568. Throughput: 0: 11264.1. Samples: 158608384. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:30:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:30:32,633][1653645] Updated weights for policy 0, policy_version 309697 (0.0013) [2024-06-15 15:30:34,023][1653645] Updated weights for policy 0, policy_version 309756 (0.0012) [2024-06-15 15:30:35,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 634388480. Throughput: 0: 10968.2. Samples: 158630400. 
Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:30:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:30:40,354][1653645] Updated weights for policy 0, policy_version 309814 (0.0013) [2024-06-15 15:30:40,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.8, 300 sec: 44432.4). Total num frames: 634519552. Throughput: 0: 11059.2. Samples: 158706176. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:30:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:30:42,741][1653645] Updated weights for policy 0, policy_version 309857 (0.0015) [2024-06-15 15:30:44,235][1653645] Updated weights for policy 0, policy_version 309920 (0.0011) [2024-06-15 15:30:45,533][1653645] Updated weights for policy 0, policy_version 309984 (0.0012) [2024-06-15 15:30:45,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45329.4, 300 sec: 44764.4). Total num frames: 634880000. Throughput: 0: 10922.7. Samples: 158763008. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:30:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:30:50,693][1653645] Updated weights for policy 0, policy_version 310032 (0.0013) [2024-06-15 15:30:50,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 634978304. Throughput: 0: 10865.8. Samples: 158806528. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:30:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:30:51,593][1653645] Updated weights for policy 0, policy_version 310080 (0.0014) [2024-06-15 15:30:55,246][1651596] Signal inference workers to stop experience collection... (16050 times) [2024-06-15 15:30:55,276][1653645] Updated weights for policy 0, policy_version 310163 (0.0014) [2024-06-15 15:30:55,292][1653645] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-06-15 15:30:55,439][1651596] Signal inference workers to resume experience collection... 
(16050 times) [2024-06-15 15:30:55,457][1653645] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-06-15 15:30:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 635273216. Throughput: 0: 11218.6. Samples: 158873088. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:30:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:30:56,534][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000310224_635338752.pth... [2024-06-15 15:30:56,543][1653645] Updated weights for policy 0, policy_version 310224 (0.0014) [2024-06-15 15:30:56,662][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000305024_624689152.pth [2024-06-15 15:30:56,668][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000310224_635338752.pth [2024-06-15 15:30:57,565][1653645] Updated weights for policy 0, policy_version 310272 (0.0012) [2024-06-15 15:31:00,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 635437056. Throughput: 0: 10899.9. Samples: 158940672. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:31:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:31:03,441][1653645] Updated weights for policy 0, policy_version 310322 (0.0022) [2024-06-15 15:31:05,958][1648982] Fps is (10 sec: 32768.5, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 635600896. Throughput: 0: 11002.3. Samples: 158973440. 
Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:31:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:31:05,980][1653645] Updated weights for policy 0, policy_version 310368 (0.0015) [2024-06-15 15:31:07,785][1653645] Updated weights for policy 0, policy_version 310451 (0.0013) [2024-06-15 15:31:09,123][1653645] Updated weights for policy 0, policy_version 310512 (0.0058) [2024-06-15 15:31:10,960][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.9, 300 sec: 44653.3). Total num frames: 635961344. Throughput: 0: 10956.9. Samples: 159033856. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:31:10,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:31:13,889][1653645] Updated weights for policy 0, policy_version 310560 (0.0012) [2024-06-15 15:31:15,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 636092416. Throughput: 0: 11104.7. Samples: 159108096. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:31:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:31:17,900][1653645] Updated weights for policy 0, policy_version 310640 (0.0013) [2024-06-15 15:31:18,789][1653645] Updated weights for policy 0, policy_version 310679 (0.0017) [2024-06-15 15:31:20,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 44783.0, 300 sec: 44764.5). Total num frames: 636452864. Throughput: 0: 11332.3. Samples: 159140352. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:31:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:31:21,045][1653645] Updated weights for policy 0, policy_version 310773 (0.0012) [2024-06-15 15:31:25,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 636518400. Throughput: 0: 11093.3. Samples: 159205376. 
Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:31:25,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:31:26,049][1653645] Updated weights for policy 0, policy_version 310816 (0.0011) [2024-06-15 15:31:29,984][1653645] Updated weights for policy 0, policy_version 310881 (0.0015) [2024-06-15 15:31:30,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 636747776. Throughput: 0: 11229.9. Samples: 159268352. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:31:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:31:31,976][1653645] Updated weights for policy 0, policy_version 310960 (0.0012) [2024-06-15 15:31:33,780][1653645] Updated weights for policy 0, policy_version 311030 (0.0013) [2024-06-15 15:31:35,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 637009920. Throughput: 0: 10717.8. Samples: 159288832. Policy #0 lag: (min: 64.0, avg: 189.0, max: 275.0) [2024-06-15 15:31:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:31:39,538][1651596] Signal inference workers to stop experience collection... (16100 times) [2024-06-15 15:31:39,584][1653645] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-06-15 15:31:39,727][1651596] Signal inference workers to resume experience collection... (16100 times) [2024-06-15 15:31:39,728][1653645] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-06-15 15:31:39,838][1653645] Updated weights for policy 0, policy_version 311095 (0.0012) [2024-06-15 15:31:40,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 637140992. Throughput: 0: 10706.5. Samples: 159354880. 
Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:31:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:31:43,841][1653645] Updated weights for policy 0, policy_version 311158 (0.0019) [2024-06-15 15:31:45,218][1653645] Updated weights for policy 0, policy_version 311216 (0.0011) [2024-06-15 15:31:45,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 637435904. Throughput: 0: 10513.1. Samples: 159413760. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:31:45,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:31:46,715][1653645] Updated weights for policy 0, policy_version 311280 (0.0012) [2024-06-15 15:31:50,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 637599744. Throughput: 0: 10570.0. Samples: 159449088. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:31:50,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:31:51,220][1653645] Updated weights for policy 0, policy_version 311352 (0.0013) [2024-06-15 15:31:55,652][1653645] Updated weights for policy 0, policy_version 311397 (0.0012) [2024-06-15 15:31:55,958][1648982] Fps is (10 sec: 32767.0, 60 sec: 41506.0, 300 sec: 43875.8). Total num frames: 637763584. Throughput: 0: 10820.2. Samples: 159520768. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:31:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:31:57,356][1653645] Updated weights for policy 0, policy_version 311474 (0.0015) [2024-06-15 15:32:00,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 638058496. Throughput: 0: 10535.8. Samples: 159582208. 
Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:32:02,461][1653645] Updated weights for policy 0, policy_version 311572 (0.0022) [2024-06-15 15:32:05,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 638189568. Throughput: 0: 10501.7. Samples: 159612928. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:32:08,037][1653645] Updated weights for policy 0, policy_version 311670 (0.0042) [2024-06-15 15:32:09,351][1653645] Updated weights for policy 0, policy_version 311744 (0.0018) [2024-06-15 15:32:10,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43144.4, 300 sec: 44320.1). Total num frames: 638550016. Throughput: 0: 10558.6. Samples: 159680512. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:10,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 15:32:14,206][1653645] Updated weights for policy 0, policy_version 311810 (0.0019) [2024-06-15 15:32:15,485][1653645] Updated weights for policy 0, policy_version 311859 (0.0014) [2024-06-15 15:32:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 638713856. Throughput: 0: 10615.5. Samples: 159746048. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:15,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 15:32:18,718][1653645] Updated weights for policy 0, policy_version 311888 (0.0024) [2024-06-15 15:32:20,444][1653645] Updated weights for policy 0, policy_version 311971 (0.0013) [2024-06-15 15:32:20,762][1651596] Signal inference workers to stop experience collection... (16150 times) [2024-06-15 15:32:20,809][1653645] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-06-15 15:32:20,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 638943232. 
Throughput: 0: 11082.0. Samples: 159787520. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:32:20,987][1651596] Signal inference workers to resume experience collection... (16150 times) [2024-06-15 15:32:20,988][1653645] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-06-15 15:32:21,651][1653645] Updated weights for policy 0, policy_version 312032 (0.0012) [2024-06-15 15:32:25,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.9, 300 sec: 43764.8). Total num frames: 639139840. Throughput: 0: 11093.3. Samples: 159854080. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:32:26,342][1653645] Updated weights for policy 0, policy_version 312097 (0.0026) [2024-06-15 15:32:27,029][1653645] Updated weights for policy 0, policy_version 312128 (0.0010) [2024-06-15 15:32:30,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 639369216. Throughput: 0: 11400.5. Samples: 159926784. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:32:31,143][1653645] Updated weights for policy 0, policy_version 312197 (0.0020) [2024-06-15 15:32:33,406][1653645] Updated weights for policy 0, policy_version 312304 (0.0125) [2024-06-15 15:32:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 639631360. Throughput: 0: 11081.9. Samples: 159947776. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:32:38,488][1653645] Updated weights for policy 0, policy_version 312371 (0.0014) [2024-06-15 15:32:40,960][1648982] Fps is (10 sec: 39313.7, 60 sec: 43689.1, 300 sec: 43542.2). Total num frames: 639762432. Throughput: 0: 11149.8. Samples: 160022528. 
Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:40,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:32:42,354][1653645] Updated weights for policy 0, policy_version 312416 (0.0014) [2024-06-15 15:32:44,463][1653645] Updated weights for policy 0, policy_version 312512 (0.0215) [2024-06-15 15:32:45,679][1653645] Updated weights for policy 0, policy_version 312570 (0.0016) [2024-06-15 15:32:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 640155648. Throughput: 0: 11036.5. Samples: 160078848. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:32:50,450][1653645] Updated weights for policy 0, policy_version 312634 (0.0075) [2024-06-15 15:32:50,958][1648982] Fps is (10 sec: 52439.7, 60 sec: 44782.8, 300 sec: 43764.8). Total num frames: 640286720. Throughput: 0: 11252.6. Samples: 160119296. Policy #0 lag: (min: 6.0, avg: 96.2, max: 262.0) [2024-06-15 15:32:50,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:32:54,709][1653645] Updated weights for policy 0, policy_version 312690 (0.0015) [2024-06-15 15:32:55,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 45329.0, 300 sec: 43764.7). Total num frames: 640483328. Throughput: 0: 11309.5. Samples: 160189440. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:32:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:32:56,504][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000312768_640548864.pth... 
[2024-06-15 15:32:56,505][1653645] Updated weights for policy 0, policy_version 312768 (0.0013) [2024-06-15 15:32:56,636][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000307616_629997568.pth [2024-06-15 15:32:58,330][1653645] Updated weights for policy 0, policy_version 312830 (0.0046) [2024-06-15 15:33:00,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44237.0, 300 sec: 43875.8). Total num frames: 640712704. Throughput: 0: 11195.7. Samples: 160249856. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:33:01,905][1653645] Updated weights for policy 0, policy_version 312895 (0.0079) [2024-06-15 15:33:05,960][1648982] Fps is (10 sec: 32760.0, 60 sec: 43688.7, 300 sec: 43542.2). Total num frames: 640811008. Throughput: 0: 10956.1. Samples: 160280576. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:05,961][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:33:06,870][1651596] Signal inference workers to stop experience collection... (16200 times) [2024-06-15 15:33:06,902][1653645] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-06-15 15:33:07,085][1651596] Signal inference workers to resume experience collection... (16200 times) [2024-06-15 15:33:07,086][1653645] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-06-15 15:33:08,549][1653645] Updated weights for policy 0, policy_version 312979 (0.0014) [2024-06-15 15:33:10,680][1653645] Updated weights for policy 0, policy_version 313059 (0.0011) [2024-06-15 15:33:10,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 641171456. Throughput: 0: 10843.0. Samples: 160342016. 
Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:33:12,927][1653645] Updated weights for policy 0, policy_version 313104 (0.0011) [2024-06-15 15:33:13,904][1653645] Updated weights for policy 0, policy_version 313150 (0.0014) [2024-06-15 15:33:15,958][1648982] Fps is (10 sec: 52442.6, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 641335296. Throughput: 0: 10740.6. Samples: 160410112. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:33:20,265][1653645] Updated weights for policy 0, policy_version 313217 (0.0012) [2024-06-15 15:33:20,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 42598.4, 300 sec: 43653.7). Total num frames: 641499136. Throughput: 0: 11093.4. Samples: 160446976. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:33:21,529][1653645] Updated weights for policy 0, policy_version 313265 (0.0058) [2024-06-15 15:33:23,308][1653645] Updated weights for policy 0, policy_version 313335 (0.0012) [2024-06-15 15:33:25,826][1653645] Updated weights for policy 0, policy_version 313401 (0.0012) [2024-06-15 15:33:25,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45329.1, 300 sec: 44209.1). Total num frames: 641859584. Throughput: 0: 10695.6. Samples: 160503808. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:33:30,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 43542.6). Total num frames: 641892352. Throughput: 0: 10979.6. Samples: 160572928. 
Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:30,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 15:33:32,372][1653645] Updated weights for policy 0, policy_version 313477 (0.0014) [2024-06-15 15:33:33,924][1653645] Updated weights for policy 0, policy_version 313536 (0.0012) [2024-06-15 15:33:35,240][1653645] Updated weights for policy 0, policy_version 313591 (0.0012) [2024-06-15 15:33:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 642252800. Throughput: 0: 10638.2. Samples: 160598016. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:33:38,237][1653645] Updated weights for policy 0, policy_version 313664 (0.0015) [2024-06-15 15:33:40,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43692.2, 300 sec: 43542.6). Total num frames: 642383872. Throughput: 0: 10558.6. Samples: 160664576. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:33:45,322][1653645] Updated weights for policy 0, policy_version 313760 (0.0098) [2024-06-15 15:33:45,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 40959.9, 300 sec: 43875.8). Total num frames: 642613248. Throughput: 0: 10524.4. Samples: 160723456. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:45,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:33:46,765][1653645] Updated weights for policy 0, policy_version 313808 (0.0041) [2024-06-15 15:33:47,198][1651596] Signal inference workers to stop experience collection... (16250 times) [2024-06-15 15:33:47,260][1653645] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-06-15 15:33:47,498][1651596] Signal inference workers to resume experience collection... 
(16250 times) [2024-06-15 15:33:47,499][1653645] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-06-15 15:33:49,923][1653645] Updated weights for policy 0, policy_version 313872 (0.0018) [2024-06-15 15:33:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43875.9). Total num frames: 642908160. Throughput: 0: 10502.3. Samples: 160753152. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:50,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:33:55,425][1653645] Updated weights for policy 0, policy_version 313939 (0.0013) [2024-06-15 15:33:55,958][1648982] Fps is (10 sec: 36043.5, 60 sec: 41505.9, 300 sec: 43320.3). Total num frames: 642973696. Throughput: 0: 10729.1. Samples: 160824832. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:33:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:33:57,713][1653645] Updated weights for policy 0, policy_version 314018 (0.0012) [2024-06-15 15:33:59,263][1653645] Updated weights for policy 0, policy_version 314049 (0.0011) [2024-06-15 15:34:00,623][1653645] Updated weights for policy 0, policy_version 314104 (0.0012) [2024-06-15 15:34:00,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 643301376. Throughput: 0: 10456.2. Samples: 160880640. Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:34:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:34:05,958][1648982] Fps is (10 sec: 45878.1, 60 sec: 43692.6, 300 sec: 43542.6). Total num frames: 643432448. Throughput: 0: 10285.5. Samples: 160909824. 
Policy #0 lag: (min: 13.0, avg: 75.1, max: 269.0) [2024-06-15 15:34:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:34:07,980][1653645] Updated weights for policy 0, policy_version 314177 (0.0022) [2024-06-15 15:34:09,875][1653645] Updated weights for policy 0, policy_version 314256 (0.0182) [2024-06-15 15:34:10,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 41506.1, 300 sec: 43875.8). Total num frames: 643661824. Throughput: 0: 10490.3. Samples: 160975872. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:34:11,143][1653645] Updated weights for policy 0, policy_version 314300 (0.0012) [2024-06-15 15:34:12,631][1653645] Updated weights for policy 0, policy_version 314339 (0.0012) [2024-06-15 15:34:14,939][1653645] Updated weights for policy 0, policy_version 314384 (0.0052) [2024-06-15 15:34:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 643956736. Throughput: 0: 10319.7. Samples: 161037312. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:34:20,311][1653645] Updated weights for policy 0, policy_version 314452 (0.0015) [2024-06-15 15:34:20,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 42598.2, 300 sec: 43431.4). Total num frames: 644055040. Throughput: 0: 10626.8. Samples: 161076224. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:34:21,796][1653645] Updated weights for policy 0, policy_version 314514 (0.0092) [2024-06-15 15:34:22,894][1653645] Updated weights for policy 0, policy_version 314559 (0.0012) [2024-06-15 15:34:24,012][1653645] Updated weights for policy 0, policy_version 314594 (0.0013) [2024-06-15 15:34:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 43875.8). Total num frames: 644349952. 
Throughput: 0: 10626.8. Samples: 161142784. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:34:26,054][1653645] Updated weights for policy 0, policy_version 314627 (0.0013) [2024-06-15 15:34:30,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 644481024. Throughput: 0: 10809.0. Samples: 161209856. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:34:32,128][1653645] Updated weights for policy 0, policy_version 314704 (0.0186) [2024-06-15 15:34:34,404][1651596] Signal inference workers to stop experience collection... (16300 times) [2024-06-15 15:34:34,450][1653645] Updated weights for policy 0, policy_version 314786 (0.0014) [2024-06-15 15:34:34,476][1653645] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-06-15 15:34:34,691][1651596] Signal inference workers to resume experience collection... (16300 times) [2024-06-15 15:34:34,692][1653645] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-06-15 15:34:35,630][1653645] Updated weights for policy 0, policy_version 314832 (0.0012) [2024-06-15 15:34:35,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 43764.7). Total num frames: 644775936. Throughput: 0: 10888.5. Samples: 161243136. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:34:39,152][1653645] Updated weights for policy 0, policy_version 314912 (0.0236) [2024-06-15 15:34:40,958][1648982] Fps is (10 sec: 52426.0, 60 sec: 43690.3, 300 sec: 43542.6). Total num frames: 645005312. Throughput: 0: 10513.1. Samples: 161297920. 
Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:40,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:34:44,724][1653645] Updated weights for policy 0, policy_version 314976 (0.0013) [2024-06-15 15:34:45,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42052.4, 300 sec: 43542.6). Total num frames: 645136384. Throughput: 0: 10763.4. Samples: 161364992. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:34:45,981][1653645] Updated weights for policy 0, policy_version 315024 (0.0051) [2024-06-15 15:34:48,631][1653645] Updated weights for policy 0, policy_version 315093 (0.0074) [2024-06-15 15:34:50,958][1648982] Fps is (10 sec: 39323.7, 60 sec: 41506.1, 300 sec: 43209.3). Total num frames: 645398528. Throughput: 0: 10774.8. Samples: 161394688. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:34:51,783][1653645] Updated weights for policy 0, policy_version 315158 (0.0012) [2024-06-15 15:34:55,961][1648982] Fps is (10 sec: 39308.8, 60 sec: 42596.5, 300 sec: 43097.8). Total num frames: 645529600. Throughput: 0: 10808.1. Samples: 161462272. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:34:55,962][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:34:56,529][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000315232_645595136.pth... 
[2024-06-15 15:34:56,529][1653645] Updated weights for policy 0, policy_version 315232 (0.0011) [2024-06-15 15:34:56,629][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000310224_635338752.pth [2024-06-15 15:34:57,738][1653645] Updated weights for policy 0, policy_version 315269 (0.0013) [2024-06-15 15:34:59,213][1653645] Updated weights for policy 0, policy_version 315321 (0.0013) [2024-06-15 15:35:00,833][1653645] Updated weights for policy 0, policy_version 315361 (0.0110) [2024-06-15 15:35:00,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 43653.6). Total num frames: 645857280. Throughput: 0: 10854.4. Samples: 161525760. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:35:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:35:03,719][1653645] Updated weights for policy 0, policy_version 315424 (0.0013) [2024-06-15 15:35:05,958][1648982] Fps is (10 sec: 52446.0, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 646053888. Throughput: 0: 10717.9. Samples: 161558528. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:35:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:35:08,333][1653645] Updated weights for policy 0, policy_version 315457 (0.0012) [2024-06-15 15:35:09,531][1653645] Updated weights for policy 0, policy_version 315509 (0.0011) [2024-06-15 15:35:10,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 42598.2, 300 sec: 43209.3). Total num frames: 646217728. Throughput: 0: 10729.2. Samples: 161625600. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:35:10,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:35:11,613][1653645] Updated weights for policy 0, policy_version 315579 (0.0016) [2024-06-15 15:35:13,136][1653645] Updated weights for policy 0, policy_version 315648 (0.0015) [2024-06-15 15:35:15,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 43209.3). Total num frames: 646512640. 
Throughput: 0: 10649.6. Samples: 161689088. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:35:15,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 15:35:16,354][1653645] Updated weights for policy 0, policy_version 315707 (0.0011) [2024-06-15 15:35:20,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 43690.8, 300 sec: 43431.5). Total num frames: 646676480. Throughput: 0: 10740.6. Samples: 161726464. Policy #0 lag: (min: 31.0, avg: 108.5, max: 287.0) [2024-06-15 15:35:20,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:35:21,156][1653645] Updated weights for policy 0, policy_version 315773 (0.0013) [2024-06-15 15:35:22,863][1651596] Signal inference workers to stop experience collection... (16350 times) [2024-06-15 15:35:22,906][1653645] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-06-15 15:35:22,910][1653645] Updated weights for policy 0, policy_version 315811 (0.0013) [2024-06-15 15:35:23,188][1651596] Signal inference workers to resume experience collection... (16350 times) [2024-06-15 15:35:23,189][1653645] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-06-15 15:35:24,943][1653645] Updated weights for policy 0, policy_version 315873 (0.0025) [2024-06-15 15:35:25,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 646971392. Throughput: 0: 10945.5. Samples: 161790464. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:35:25,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 15:35:27,103][1653645] Updated weights for policy 0, policy_version 315936 (0.0014) [2024-06-15 15:35:27,942][1653645] Updated weights for policy 0, policy_version 315968 (0.0011) [2024-06-15 15:35:30,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 647102464. Throughput: 0: 10979.6. Samples: 161859072. 
Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:35:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:35:33,076][1653645] Updated weights for policy 0, policy_version 316028 (0.0011) [2024-06-15 15:35:34,392][1653645] Updated weights for policy 0, policy_version 316083 (0.0024) [2024-06-15 15:35:35,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 647364608. Throughput: 0: 11104.7. Samples: 161894400. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:35:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:35:38,062][1653645] Updated weights for policy 0, policy_version 316176 (0.0013) [2024-06-15 15:35:40,962][1648982] Fps is (10 sec: 52404.2, 60 sec: 43687.6, 300 sec: 43208.6). Total num frames: 647626752. Throughput: 0: 10899.6. Samples: 161952768. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:35:40,963][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:35:43,947][1653645] Updated weights for policy 0, policy_version 316240 (0.0013) [2024-06-15 15:35:45,461][1653645] Updated weights for policy 0, policy_version 316304 (0.0053) [2024-06-15 15:35:45,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 647823360. Throughput: 0: 11104.7. Samples: 162025472. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:35:45,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:35:48,910][1653645] Updated weights for policy 0, policy_version 316368 (0.0014) [2024-06-15 15:35:50,736][1653645] Updated weights for policy 0, policy_version 316435 (0.0025) [2024-06-15 15:35:50,958][1648982] Fps is (10 sec: 42618.1, 60 sec: 44236.7, 300 sec: 43320.4). Total num frames: 648052736. Throughput: 0: 11138.8. Samples: 162059776. 
Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:35:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:35:55,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 43692.9, 300 sec: 43098.2). Total num frames: 648151040. Throughput: 0: 10979.6. Samples: 162119680. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:35:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:35:56,296][1653645] Updated weights for policy 0, policy_version 316512 (0.0016) [2024-06-15 15:35:57,789][1653645] Updated weights for policy 0, policy_version 316576 (0.0015) [2024-06-15 15:36:00,809][1653645] Updated weights for policy 0, policy_version 316610 (0.0013) [2024-06-15 15:36:00,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 42598.4, 300 sec: 43431.5). Total num frames: 648413184. Throughput: 0: 11161.6. Samples: 162191360. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:36:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:36:02,673][1653645] Updated weights for policy 0, policy_version 316688 (0.0014) [2024-06-15 15:36:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 648675328. Throughput: 0: 10843.0. Samples: 162214400. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:36:05,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:36:08,330][1653645] Updated weights for policy 0, policy_version 316752 (0.0013) [2024-06-15 15:36:10,086][1651596] Signal inference workers to stop experience collection... (16400 times) [2024-06-15 15:36:10,150][1653645] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-06-15 15:36:10,294][1651596] Signal inference workers to resume experience collection... 
(16400 times) [2024-06-15 15:36:10,295][1653645] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-06-15 15:36:10,297][1653645] Updated weights for policy 0, policy_version 316832 (0.0013) [2024-06-15 15:36:10,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 44783.1, 300 sec: 43431.5). Total num frames: 648904704. Throughput: 0: 10956.8. Samples: 162283520. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:36:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:36:13,733][1653645] Updated weights for policy 0, policy_version 316928 (0.0166) [2024-06-15 15:36:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44782.8, 300 sec: 43209.3). Total num frames: 649199616. Throughput: 0: 10763.4. Samples: 162343424. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:36:15,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:36:19,729][1653645] Updated weights for policy 0, policy_version 316996 (0.0018) [2024-06-15 15:36:20,842][1653645] Updated weights for policy 0, policy_version 317053 (0.0115) [2024-06-15 15:36:20,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.9, 300 sec: 43431.5). Total num frames: 649330688. Throughput: 0: 10797.5. Samples: 162380288. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:36:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:36:23,008][1653645] Updated weights for policy 0, policy_version 317109 (0.0082) [2024-06-15 15:36:24,961][1653645] Updated weights for policy 0, policy_version 317152 (0.0011) [2024-06-15 15:36:25,961][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 649592832. Throughput: 0: 11162.8. Samples: 162455040. 
Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:36:25,962][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:36:26,102][1653645] Updated weights for policy 0, policy_version 317200 (0.0012) [2024-06-15 15:36:30,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 43098.3). Total num frames: 649723904. Throughput: 0: 10911.3. Samples: 162516480. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:36:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:36:31,320][1653645] Updated weights for policy 0, policy_version 317264 (0.0015) [2024-06-15 15:36:34,079][1653645] Updated weights for policy 0, policy_version 317330 (0.0012) [2024-06-15 15:36:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 649986048. Throughput: 0: 10945.4. Samples: 162552320. Policy #0 lag: (min: 63.0, avg: 169.2, max: 319.0) [2024-06-15 15:36:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:36:36,262][1653645] Updated weights for policy 0, policy_version 317392 (0.0011) [2024-06-15 15:36:38,348][1653645] Updated weights for policy 0, policy_version 317472 (0.0011) [2024-06-15 15:36:40,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43694.1, 300 sec: 43431.5). Total num frames: 650248192. Throughput: 0: 10934.1. Samples: 162611712. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:36:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:36:42,831][1653645] Updated weights for policy 0, policy_version 317520 (0.0013) [2024-06-15 15:36:45,827][1653645] Updated weights for policy 0, policy_version 317586 (0.0014) [2024-06-15 15:36:45,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 43431.5). Total num frames: 650412032. Throughput: 0: 11002.3. Samples: 162686464. 
Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:36:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:36:48,593][1653645] Updated weights for policy 0, policy_version 317648 (0.0012) [2024-06-15 15:36:50,333][1653645] Updated weights for policy 0, policy_version 317712 (0.0011) [2024-06-15 15:36:50,958][1648982] Fps is (10 sec: 45873.4, 60 sec: 44236.6, 300 sec: 43875.8). Total num frames: 650706944. Throughput: 0: 11355.0. Samples: 162725376. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:36:50,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:36:51,457][1653645] Updated weights for policy 0, policy_version 317760 (0.0020) [2024-06-15 15:36:54,032][1651596] Signal inference workers to stop experience collection... (16450 times) [2024-06-15 15:36:54,069][1653645] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-06-15 15:36:54,266][1651596] Signal inference workers to resume experience collection... (16450 times) [2024-06-15 15:36:54,268][1653645] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-06-15 15:36:54,509][1653645] Updated weights for policy 0, policy_version 317824 (0.0015) [2024-06-15 15:36:55,957][1648982] Fps is (10 sec: 49152.4, 60 sec: 45875.5, 300 sec: 43542.6). Total num frames: 650903552. Throughput: 0: 11104.8. Samples: 162783232. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:36:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:36:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000317824_650903552.pth... 
[2024-06-15 15:36:56,065][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000312768_640548864.pth [2024-06-15 15:36:58,115][1653645] Updated weights for policy 0, policy_version 317880 (0.0012) [2024-06-15 15:37:00,953][1653645] Updated weights for policy 0, policy_version 317940 (0.0037) [2024-06-15 15:37:00,958][1648982] Fps is (10 sec: 42600.1, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 651132928. Throughput: 0: 11491.6. Samples: 162860544. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:37:02,647][1653645] Updated weights for policy 0, policy_version 318011 (0.0012) [2024-06-15 15:37:05,865][1653645] Updated weights for policy 0, policy_version 318078 (0.0012) [2024-06-15 15:37:05,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45875.4, 300 sec: 43653.7). Total num frames: 651427840. Throughput: 0: 11286.8. Samples: 162888192. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:05,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:37:10,175][1653645] Updated weights for policy 0, policy_version 318130 (0.0012) [2024-06-15 15:37:10,958][1648982] Fps is (10 sec: 42595.1, 60 sec: 44236.3, 300 sec: 43542.4). Total num frames: 651558912. Throughput: 0: 11127.3. Samples: 162955776. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:10,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:37:13,254][1653645] Updated weights for policy 0, policy_version 318192 (0.0015) [2024-06-15 15:37:14,850][1653645] Updated weights for policy 0, policy_version 318257 (0.0012) [2024-06-15 15:37:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 651821056. Throughput: 0: 11070.6. Samples: 163014656. 
Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:15,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:37:17,523][1653645] Updated weights for policy 0, policy_version 318320 (0.0019) [2024-06-15 15:37:20,958][1648982] Fps is (10 sec: 39324.7, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 651952128. Throughput: 0: 10990.9. Samples: 163046912. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:20,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:37:21,871][1653645] Updated weights for policy 0, policy_version 318384 (0.0051) [2024-06-15 15:37:24,643][1653645] Updated weights for policy 0, policy_version 318450 (0.0015) [2024-06-15 15:37:25,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 652312576. Throughput: 0: 11423.3. Samples: 163125760. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:37:28,024][1653645] Updated weights for policy 0, policy_version 318544 (0.0013) [2024-06-15 15:37:28,908][1653645] Updated weights for policy 0, policy_version 318590 (0.0012) [2024-06-15 15:37:30,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.2, 300 sec: 43542.6). Total num frames: 652476416. Throughput: 0: 11173.0. Samples: 163189248. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:37:33,595][1653645] Updated weights for policy 0, policy_version 318649 (0.0014) [2024-06-15 15:37:35,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 44236.7, 300 sec: 43653.9). Total num frames: 652640256. Throughput: 0: 11070.6. Samples: 163223552. 
Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:35,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:37:35,993][1653645] Updated weights for policy 0, policy_version 318688 (0.0013) [2024-06-15 15:37:37,627][1653645] Updated weights for policy 0, policy_version 318755 (0.0031) [2024-06-15 15:37:38,942][1651596] Signal inference workers to stop experience collection... (16500 times) [2024-06-15 15:37:39,030][1653645] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-06-15 15:37:39,033][1653645] Updated weights for policy 0, policy_version 318790 (0.0014) [2024-06-15 15:37:39,171][1651596] Signal inference workers to resume experience collection... (16500 times) [2024-06-15 15:37:39,178][1653645] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-06-15 15:37:40,171][1653645] Updated weights for policy 0, policy_version 318848 (0.0015) [2024-06-15 15:37:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 43542.6). Total num frames: 653000704. Throughput: 0: 11332.2. Samples: 163293184. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:37:45,809][1653645] Updated weights for policy 0, policy_version 318909 (0.0015) [2024-06-15 15:37:45,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45329.0, 300 sec: 43542.6). Total num frames: 653131776. Throughput: 0: 11161.6. Samples: 163362816. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:37:47,987][1653645] Updated weights for policy 0, policy_version 318961 (0.0011) [2024-06-15 15:37:49,307][1653645] Updated weights for policy 0, policy_version 319032 (0.0012) [2024-06-15 15:37:50,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45875.4, 300 sec: 43986.9). Total num frames: 653459456. Throughput: 0: 11184.3. Samples: 163391488. 
Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:50,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:37:51,522][1653645] Updated weights for policy 0, policy_version 319097 (0.0120) [2024-06-15 15:37:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.5, 300 sec: 43431.5). Total num frames: 653524992. Throughput: 0: 11184.5. Samples: 163459072. Policy #0 lag: (min: 63.0, avg: 188.0, max: 319.0) [2024-06-15 15:37:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:37:57,856][1653645] Updated weights for policy 0, policy_version 319159 (0.0013) [2024-06-15 15:38:00,229][1653645] Updated weights for policy 0, policy_version 319221 (0.0012) [2024-06-15 15:38:00,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44782.8, 300 sec: 44098.3). Total num frames: 653819904. Throughput: 0: 11264.0. Samples: 163521536. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:00,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:38:01,493][1653645] Updated weights for policy 0, policy_version 319283 (0.0014) [2024-06-15 15:38:04,189][1653645] Updated weights for policy 0, policy_version 319329 (0.0072) [2024-06-15 15:38:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 654049280. Throughput: 0: 11377.8. Samples: 163558912. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:05,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 15:38:09,438][1653645] Updated weights for policy 0, policy_version 319392 (0.0011) [2024-06-15 15:38:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44237.2, 300 sec: 43653.6). Total num frames: 654213120. Throughput: 0: 11059.2. Samples: 163623424. 
Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:10,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:38:11,184][1653645] Updated weights for policy 0, policy_version 319456 (0.0016) [2024-06-15 15:38:12,479][1653645] Updated weights for policy 0, policy_version 319520 (0.0021) [2024-06-15 15:38:15,550][1653645] Updated weights for policy 0, policy_version 319572 (0.0012) [2024-06-15 15:38:15,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 654508032. Throughput: 0: 11229.9. Samples: 163694592. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:38:20,422][1653645] Updated weights for policy 0, policy_version 319648 (0.0132) [2024-06-15 15:38:20,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45875.1, 300 sec: 43542.5). Total num frames: 654704640. Throughput: 0: 11241.3. Samples: 163729408. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:38:22,909][1653645] Updated weights for policy 0, policy_version 319730 (0.0013) [2024-06-15 15:38:23,689][1651596] Signal inference workers to stop experience collection... (16550 times) [2024-06-15 15:38:23,746][1653645] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-06-15 15:38:23,942][1651596] Signal inference workers to resume experience collection... (16550 times) [2024-06-15 15:38:23,943][1653645] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-06-15 15:38:24,657][1653645] Updated weights for policy 0, policy_version 319801 (0.0014) [2024-06-15 15:38:25,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 44236.6, 300 sec: 44320.1). Total num frames: 654966784. Throughput: 0: 10968.1. Samples: 163786752. 
Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:25,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:38:27,136][1653645] Updated weights for policy 0, policy_version 319828 (0.0011) [2024-06-15 15:38:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 655097856. Throughput: 0: 11070.6. Samples: 163860992. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:30,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 15:38:32,457][1653645] Updated weights for policy 0, policy_version 319888 (0.0031) [2024-06-15 15:38:34,119][1653645] Updated weights for policy 0, policy_version 319968 (0.0017) [2024-06-15 15:38:35,192][1653645] Updated weights for policy 0, policy_version 320016 (0.0012) [2024-06-15 15:38:35,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 46967.4, 300 sec: 44320.1). Total num frames: 655458304. Throughput: 0: 11218.5. Samples: 163896320. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:38:36,051][1653645] Updated weights for policy 0, policy_version 320055 (0.0012) [2024-06-15 15:38:37,499][1653645] Updated weights for policy 0, policy_version 320080 (0.0011) [2024-06-15 15:38:38,530][1653645] Updated weights for policy 0, policy_version 320128 (0.0011) [2024-06-15 15:38:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 655622144. Throughput: 0: 11309.5. Samples: 163968000. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:38:44,944][1653645] Updated weights for policy 0, policy_version 320208 (0.0013) [2024-06-15 15:38:45,960][1648982] Fps is (10 sec: 42598.9, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 655884288. Throughput: 0: 11377.8. Samples: 164033536. 
Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:45,961][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:38:46,386][1653645] Updated weights for policy 0, policy_version 320263 (0.0043) [2024-06-15 15:38:49,040][1653645] Updated weights for policy 0, policy_version 320321 (0.0014) [2024-06-15 15:38:50,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 656146432. Throughput: 0: 11320.9. Samples: 164068352. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:50,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:38:54,439][1653645] Updated weights for policy 0, policy_version 320385 (0.0024) [2024-06-15 15:38:55,878][1653645] Updated weights for policy 0, policy_version 320464 (0.0083) [2024-06-15 15:38:55,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 46421.2, 300 sec: 44097.9). Total num frames: 656310272. Throughput: 0: 11548.4. Samples: 164143104. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:38:55,959][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 15:38:56,199][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000320480_656343040.pth... [2024-06-15 15:38:56,306][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000315232_645595136.pth [2024-06-15 15:38:57,989][1653645] Updated weights for policy 0, policy_version 320528 (0.0014) [2024-06-15 15:39:00,700][1653645] Updated weights for policy 0, policy_version 320578 (0.0014) [2024-06-15 15:39:00,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 45875.1, 300 sec: 44542.2). Total num frames: 656572416. Throughput: 0: 11298.1. Samples: 164203008. 
Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:39:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:39:02,185][1653645] Updated weights for policy 0, policy_version 320636 (0.0014) [2024-06-15 15:39:05,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 656703488. Throughput: 0: 11377.8. Samples: 164241408. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:39:05,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 15:39:06,982][1653645] Updated weights for policy 0, policy_version 320704 (0.0014) [2024-06-15 15:39:07,375][1651596] Signal inference workers to stop experience collection... (16600 times) [2024-06-15 15:39:07,467][1653645] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-06-15 15:39:07,558][1651596] Signal inference workers to resume experience collection... (16600 times) [2024-06-15 15:39:07,559][1653645] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-06-15 15:39:07,972][1653645] Updated weights for policy 0, policy_version 320759 (0.0014) [2024-06-15 15:39:10,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 46967.5, 300 sec: 44320.1). Total num frames: 657031168. Throughput: 0: 11639.5. Samples: 164310528. Policy #0 lag: (min: 15.0, avg: 84.9, max: 271.0) [2024-06-15 15:39:10,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 15:39:11,159][1653645] Updated weights for policy 0, policy_version 320832 (0.0012) [2024-06-15 15:39:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 44782.8, 300 sec: 44542.3). Total num frames: 657195008. Throughput: 0: 11514.2. Samples: 164379136. 
Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:15,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:39:17,148][1653645] Updated weights for policy 0, policy_version 320903 (0.0014) [2024-06-15 15:39:18,178][1653645] Updated weights for policy 0, policy_version 320953 (0.0023) [2024-06-15 15:39:19,811][1653645] Updated weights for policy 0, policy_version 321017 (0.0011) [2024-06-15 15:39:20,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 657457152. Throughput: 0: 11503.0. Samples: 164413952. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:20,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 15:39:22,268][1653645] Updated weights for policy 0, policy_version 321056 (0.0012) [2024-06-15 15:39:24,832][1653645] Updated weights for policy 0, policy_version 321136 (0.0014) [2024-06-15 15:39:25,962][1648982] Fps is (10 sec: 52405.5, 60 sec: 45871.8, 300 sec: 44874.8). Total num frames: 657719296. Throughput: 0: 11206.0. Samples: 164472320. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:25,963][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:39:29,510][1653645] Updated weights for policy 0, policy_version 321159 (0.0017) [2024-06-15 15:39:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45875.3, 300 sec: 44320.1). Total num frames: 657850368. Throughput: 0: 11411.9. Samples: 164547072. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:30,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 15:39:31,672][1653645] Updated weights for policy 0, policy_version 321248 (0.0013) [2024-06-15 15:39:33,907][1653645] Updated weights for policy 0, policy_version 321312 (0.0027) [2024-06-15 15:39:35,958][1648982] Fps is (10 sec: 39340.0, 60 sec: 44236.9, 300 sec: 44431.3). Total num frames: 658112512. Throughput: 0: 11207.1. Samples: 164572672. 
Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:35,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 15:39:36,193][1653645] Updated weights for policy 0, policy_version 321360 (0.0073) [2024-06-15 15:39:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 658243584. Throughput: 0: 10922.7. Samples: 164634624. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:39:43,481][1653645] Updated weights for policy 0, policy_version 321457 (0.0013) [2024-06-15 15:39:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 658505728. Throughput: 0: 10899.9. Samples: 164693504. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:45,960][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 15:39:46,882][1653645] Updated weights for policy 0, policy_version 321552 (0.0014) [2024-06-15 15:39:49,387][1653645] Updated weights for policy 0, policy_version 321619 (0.0022) [2024-06-15 15:39:50,403][1653645] Updated weights for policy 0, policy_version 321662 (0.0011) [2024-06-15 15:39:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44876.0). Total num frames: 658767872. Throughput: 0: 10763.4. Samples: 164725760. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:50,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 15:39:54,923][1653645] Updated weights for policy 0, policy_version 321715 (0.0020) [2024-06-15 15:39:55,328][1651596] Signal inference workers to stop experience collection... (16650 times) [2024-06-15 15:39:55,374][1653645] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-06-15 15:39:55,624][1651596] Signal inference workers to resume experience collection... 
(16650 times) [2024-06-15 15:39:55,625][1653645] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-06-15 15:39:55,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 658931712. Throughput: 0: 10911.3. Samples: 164801536. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:39:55,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 15:39:59,078][1653645] Updated weights for policy 0, policy_version 321808 (0.0012) [2024-06-15 15:40:00,958][1648982] Fps is (10 sec: 39319.2, 60 sec: 43144.2, 300 sec: 44431.1). Total num frames: 659161088. Throughput: 0: 10604.0. Samples: 164856320. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:40:00,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 15:40:01,261][1653645] Updated weights for policy 0, policy_version 321872 (0.0011) [2024-06-15 15:40:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 659324928. Throughput: 0: 10558.6. Samples: 164889088. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:40:05,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 15:40:06,031][1653645] Updated weights for policy 0, policy_version 321952 (0.0021) [2024-06-15 15:40:08,144][1653645] Updated weights for policy 0, policy_version 322018 (0.0013) [2024-06-15 15:40:08,876][1653645] Updated weights for policy 0, policy_version 322048 (0.0027) [2024-06-15 15:40:10,958][1648982] Fps is (10 sec: 42600.4, 60 sec: 42598.4, 300 sec: 44320.1). Total num frames: 659587072. Throughput: 0: 10821.4. Samples: 164959232. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:40:10,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:40:13,126][1653645] Updated weights for policy 0, policy_version 322128 (0.0011) [2024-06-15 15:40:15,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 659816448. 
Throughput: 0: 10672.3. Samples: 165027328. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:40:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:40:16,593][1653645] Updated weights for policy 0, policy_version 322178 (0.0011) [2024-06-15 15:40:17,872][1653645] Updated weights for policy 0, policy_version 322232 (0.0080) [2024-06-15 15:40:20,301][1653645] Updated weights for policy 0, policy_version 322288 (0.0014) [2024-06-15 15:40:20,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 660078592. Throughput: 0: 10865.8. Samples: 165061632. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:40:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:40:23,330][1653645] Updated weights for policy 0, policy_version 322357 (0.0026) [2024-06-15 15:40:24,378][1653645] Updated weights for policy 0, policy_version 322389 (0.0012) [2024-06-15 15:40:25,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43694.1, 300 sec: 44875.5). Total num frames: 660340736. Throughput: 0: 11002.3. Samples: 165129728. Policy #0 lag: (min: 7.0, avg: 143.5, max: 263.0) [2024-06-15 15:40:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:40:27,860][1653645] Updated weights for policy 0, policy_version 322436 (0.0113) [2024-06-15 15:40:28,878][1653645] Updated weights for policy 0, policy_version 322491 (0.0011) [2024-06-15 15:40:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 660504576. Throughput: 0: 11411.9. Samples: 165207040. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:40:30,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:40:31,713][1653645] Updated weights for policy 0, policy_version 322552 (0.0012) [2024-06-15 15:40:35,066][1653645] Updated weights for policy 0, policy_version 322619 (0.0014) [2024-06-15 15:40:35,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.0, 300 sec: 44654.1). 
Total num frames: 660799488. Throughput: 0: 11411.9. Samples: 165239296. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:40:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:40:36,248][1653645] Updated weights for policy 0, policy_version 322681 (0.0013) [2024-06-15 15:40:40,151][1653645] Updated weights for policy 0, policy_version 322736 (0.0013) [2024-06-15 15:40:40,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 45875.0, 300 sec: 44653.3). Total num frames: 660996096. Throughput: 0: 11150.1. Samples: 165303296. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:40:40,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:40:43,058][1651596] Signal inference workers to stop experience collection... (16700 times) [2024-06-15 15:40:43,117][1653645] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-06-15 15:40:43,404][1651596] Signal inference workers to resume experience collection... (16700 times) [2024-06-15 15:40:43,405][1653645] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-06-15 15:40:43,805][1653645] Updated weights for policy 0, policy_version 322800 (0.0015) [2024-06-15 15:40:45,970][1648982] Fps is (10 sec: 39321.1, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 661192704. Throughput: 0: 11480.3. Samples: 165372928. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:40:45,970][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:40:46,236][1653645] Updated weights for policy 0, policy_version 322870 (0.0170) [2024-06-15 15:40:47,847][1653645] Updated weights for policy 0, policy_version 322928 (0.0019) [2024-06-15 15:40:50,758][1653645] Updated weights for policy 0, policy_version 322981 (0.0013) [2024-06-15 15:40:50,958][1648982] Fps is (10 sec: 49153.4, 60 sec: 45329.1, 300 sec: 45208.8). Total num frames: 661487616. Throughput: 0: 11411.9. Samples: 165402624. 
Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:40:50,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 15:40:54,907][1653645] Updated weights for policy 0, policy_version 323024 (0.0014) [2024-06-15 15:40:55,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 661618688. Throughput: 0: 11514.3. Samples: 165477376. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:40:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:40:56,032][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000323072_661651456.pth... [2024-06-15 15:40:56,079][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000317824_650903552.pth [2024-06-15 15:40:57,324][1653645] Updated weights for policy 0, policy_version 323088 (0.0012) [2024-06-15 15:40:58,350][1653645] Updated weights for policy 0, policy_version 323136 (0.0013) [2024-06-15 15:40:59,907][1653645] Updated weights for policy 0, policy_version 323198 (0.0050) [2024-06-15 15:41:00,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.7, 300 sec: 44875.5). Total num frames: 661913600. Throughput: 0: 11309.5. Samples: 165536256. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:00,963][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 15:41:02,909][1653645] Updated weights for policy 0, policy_version 323264 (0.0014) [2024-06-15 15:41:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 44542.3). Total num frames: 662044672. Throughput: 0: 11286.7. Samples: 165569536. 
Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:41:08,299][1653645] Updated weights for policy 0, policy_version 323325 (0.0015) [2024-06-15 15:41:10,429][1653645] Updated weights for policy 0, policy_version 323392 (0.0013) [2024-06-15 15:41:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.3, 300 sec: 44542.3). Total num frames: 662339584. Throughput: 0: 11355.0. Samples: 165640704. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:10,960][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:41:11,413][1653645] Updated weights for policy 0, policy_version 323441 (0.0014) [2024-06-15 15:41:14,189][1653645] Updated weights for policy 0, policy_version 323511 (0.0013) [2024-06-15 15:41:15,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 662568960. Throughput: 0: 11082.0. Samples: 165705728. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 15:41:19,973][1653645] Updated weights for policy 0, policy_version 323582 (0.0012) [2024-06-15 15:41:20,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 662700032. Throughput: 0: 11241.2. Samples: 165745152. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:20,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:41:22,320][1653645] Updated weights for policy 0, policy_version 323664 (0.0012) [2024-06-15 15:41:24,747][1653645] Updated weights for policy 0, policy_version 323729 (0.0013) [2024-06-15 15:41:25,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 663093248. Throughput: 0: 11207.1. Samples: 165807616. 
Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:25,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 15:41:30,500][1651596] Signal inference workers to stop experience collection... (16750 times) [2024-06-15 15:41:30,593][1653645] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-06-15 15:41:30,596][1653645] Updated weights for policy 0, policy_version 323797 (0.0016) [2024-06-15 15:41:30,769][1651596] Signal inference workers to resume experience collection... (16750 times) [2024-06-15 15:41:30,770][1653645] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-06-15 15:41:30,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 663158784. Throughput: 0: 11343.7. Samples: 165883392. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:41:31,982][1653645] Updated weights for policy 0, policy_version 323856 (0.0015) [2024-06-15 15:41:32,935][1653645] Updated weights for policy 0, policy_version 323901 (0.0014) [2024-06-15 15:41:34,385][1653645] Updated weights for policy 0, policy_version 323966 (0.0096) [2024-06-15 15:41:35,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 663486464. Throughput: 0: 11377.8. Samples: 165914624. Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:35,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 15:41:37,161][1653645] Updated weights for policy 0, policy_version 324027 (0.0014) [2024-06-15 15:41:40,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43690.8, 300 sec: 44764.4). Total num frames: 663617536. Throughput: 0: 11138.8. Samples: 165978624. 
Policy #0 lag: (min: 4.0, avg: 107.7, max: 260.0) [2024-06-15 15:41:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:41:44,452][1653645] Updated weights for policy 0, policy_version 324128 (0.0012) [2024-06-15 15:41:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 663879680. Throughput: 0: 11195.7. Samples: 166040064. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:41:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:41:46,230][1653645] Updated weights for policy 0, policy_version 324182 (0.0020) [2024-06-15 15:41:47,709][1653645] Updated weights for policy 0, policy_version 324225 (0.0014) [2024-06-15 15:41:49,059][1653645] Updated weights for policy 0, policy_version 324288 (0.0098) [2024-06-15 15:41:50,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 664141824. Throughput: 0: 11116.1. Samples: 166069760. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:41:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:41:55,959][1648982] Fps is (10 sec: 36042.2, 60 sec: 43690.2, 300 sec: 44431.1). Total num frames: 664240128. Throughput: 0: 11161.4. Samples: 166142976. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:41:55,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:41:56,438][1653645] Updated weights for policy 0, policy_version 324368 (0.0011) [2024-06-15 15:41:59,315][1653645] Updated weights for policy 0, policy_version 324466 (0.0056) [2024-06-15 15:42:00,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 664600576. Throughput: 0: 10934.0. Samples: 166197760. 
Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:42:01,474][1653645] Updated weights for policy 0, policy_version 324541 (0.0014) [2024-06-15 15:42:05,966][1648982] Fps is (10 sec: 42564.8, 60 sec: 43684.4, 300 sec: 44430.0). Total num frames: 664666112. Throughput: 0: 10749.9. Samples: 166228992. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:05,967][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:42:08,788][1653645] Updated weights for policy 0, policy_version 324608 (0.0013) [2024-06-15 15:42:10,468][1653645] Updated weights for policy 0, policy_version 324668 (0.0010) [2024-06-15 15:42:10,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 664961024. Throughput: 0: 10911.3. Samples: 166298624. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:42:11,816][1653645] Updated weights for policy 0, policy_version 324720 (0.0020) [2024-06-15 15:42:12,534][1651596] Signal inference workers to stop experience collection... (16800 times) [2024-06-15 15:42:12,620][1653645] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-06-15 15:42:12,831][1651596] Signal inference workers to resume experience collection... (16800 times) [2024-06-15 15:42:12,832][1653645] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-06-15 15:42:12,834][1653645] Updated weights for policy 0, policy_version 324752 (0.0013) [2024-06-15 15:42:14,168][1653645] Updated weights for policy 0, policy_version 324800 (0.0015) [2024-06-15 15:42:15,958][1648982] Fps is (10 sec: 52473.8, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 665190400. Throughput: 0: 10547.2. Samples: 166358016. 
Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:42:20,958][1648982] Fps is (10 sec: 32767.1, 60 sec: 43144.3, 300 sec: 43986.8). Total num frames: 665288704. Throughput: 0: 10672.3. Samples: 166394880. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:42:21,572][1653645] Updated weights for policy 0, policy_version 324881 (0.0013) [2024-06-15 15:42:23,302][1653645] Updated weights for policy 0, policy_version 324944 (0.0013) [2024-06-15 15:42:24,474][1653645] Updated weights for policy 0, policy_version 324986 (0.0013) [2024-06-15 15:42:25,847][1653645] Updated weights for policy 0, policy_version 325045 (0.0015) [2024-06-15 15:42:25,960][1648982] Fps is (10 sec: 49151.9, 60 sec: 43144.6, 300 sec: 44764.4). Total num frames: 665681920. Throughput: 0: 10490.3. Samples: 166450688. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:25,960][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:42:30,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 42598.4, 300 sec: 44320.1). Total num frames: 665714688. Throughput: 0: 10820.3. Samples: 166526976. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:42:32,910][1653645] Updated weights for policy 0, policy_version 325104 (0.0012) [2024-06-15 15:42:34,751][1653645] Updated weights for policy 0, policy_version 325170 (0.0013) [2024-06-15 15:42:35,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 42052.1, 300 sec: 44097.9). Total num frames: 666009600. Throughput: 0: 10808.8. Samples: 166556160. 
Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:35,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:42:36,664][1653645] Updated weights for policy 0, policy_version 325248 (0.0053) [2024-06-15 15:42:38,171][1653645] Updated weights for policy 0, policy_version 325305 (0.0160) [2024-06-15 15:42:40,958][1648982] Fps is (10 sec: 52425.1, 60 sec: 43690.2, 300 sec: 44431.1). Total num frames: 666238976. Throughput: 0: 10444.8. Samples: 166612992. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:42:45,958][1648982] Fps is (10 sec: 32768.5, 60 sec: 40960.0, 300 sec: 43653.6). Total num frames: 666337280. Throughput: 0: 10843.0. Samples: 166685696. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:42:46,323][1653645] Updated weights for policy 0, policy_version 325379 (0.0014) [2024-06-15 15:42:48,172][1653645] Updated weights for policy 0, policy_version 325456 (0.0013) [2024-06-15 15:42:49,773][1653645] Updated weights for policy 0, policy_version 325508 (0.0039) [2024-06-15 15:42:50,958][1648982] Fps is (10 sec: 52431.3, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 666763264. Throughput: 0: 10674.4. Samples: 166709248. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:50,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:42:55,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 42052.7, 300 sec: 43875.8). Total num frames: 666763264. Throughput: 0: 10649.6. Samples: 166777856. Policy #0 lag: (min: 47.0, avg: 110.6, max: 303.0) [2024-06-15 15:42:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:42:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000325568_666763264.pth... 
[2024-06-15 15:42:56,052][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000320480_656343040.pth [2024-06-15 15:42:57,441][1653645] Updated weights for policy 0, policy_version 325600 (0.0108) [2024-06-15 15:42:59,021][1653645] Updated weights for policy 0, policy_version 325664 (0.0013) [2024-06-15 15:42:59,161][1651596] Signal inference workers to stop experience collection... (16850 times) [2024-06-15 15:42:59,225][1653645] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-06-15 15:42:59,433][1651596] Signal inference workers to resume experience collection... (16850 times) [2024-06-15 15:42:59,434][1653645] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-06-15 15:43:00,410][1653645] Updated weights for policy 0, policy_version 325728 (0.0012) [2024-06-15 15:43:00,957][1648982] Fps is (10 sec: 39323.1, 60 sec: 42598.5, 300 sec: 44431.2). Total num frames: 667156480. Throughput: 0: 10638.3. Samples: 166836736. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:43:02,747][1653645] Updated weights for policy 0, policy_version 325808 (0.0014) [2024-06-15 15:43:05,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43696.8, 300 sec: 44320.1). Total num frames: 667287552. Throughput: 0: 10467.6. Samples: 166865920. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:05,961][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:43:10,852][1653645] Updated weights for policy 0, policy_version 325891 (0.0013) [2024-06-15 15:43:10,958][1648982] Fps is (10 sec: 26213.8, 60 sec: 40960.0, 300 sec: 43764.7). Total num frames: 667418624. Throughput: 0: 10831.6. Samples: 166938112. 
Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:43:12,430][1653645] Updated weights for policy 0, policy_version 325971 (0.0013) [2024-06-15 15:43:14,884][1653645] Updated weights for policy 0, policy_version 326032 (0.0013) [2024-06-15 15:43:15,946][1653645] Updated weights for policy 0, policy_version 326080 (0.0013) [2024-06-15 15:43:15,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 667811840. Throughput: 0: 10387.9. Samples: 166994432. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:43:20,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 42052.5, 300 sec: 43542.6). Total num frames: 667811840. Throughput: 0: 10513.1. Samples: 167029248. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:20,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 15:43:22,640][1653645] Updated weights for policy 0, policy_version 326146 (0.0013) [2024-06-15 15:43:24,076][1653645] Updated weights for policy 0, policy_version 326208 (0.0012) [2024-06-15 15:43:25,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 44431.2). Total num frames: 668205056. Throughput: 0: 10786.3. Samples: 167098368. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:25,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:43:26,440][1653645] Updated weights for policy 0, policy_version 326304 (0.0014) [2024-06-15 15:43:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 668336128. Throughput: 0: 10558.6. Samples: 167160832. 
Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:43:34,765][1653645] Updated weights for policy 0, policy_version 326388 (0.0014) [2024-06-15 15:43:35,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 42052.2, 300 sec: 43764.7). Total num frames: 668532736. Throughput: 0: 10922.6. Samples: 167200768. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:35,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:43:36,702][1653645] Updated weights for policy 0, policy_version 326466 (0.0046) [2024-06-15 15:43:37,832][1653645] Updated weights for policy 0, policy_version 326526 (0.0015) [2024-06-15 15:43:39,603][1653645] Updated weights for policy 0, policy_version 326576 (0.0020) [2024-06-15 15:43:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43691.2, 300 sec: 43986.9). Total num frames: 668860416. Throughput: 0: 10547.2. Samples: 167252480. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:43:45,268][1651596] Signal inference workers to stop experience collection... (16900 times) [2024-06-15 15:43:45,365][1653645] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-06-15 15:43:45,511][1651596] Signal inference workers to resume experience collection... (16900 times) [2024-06-15 15:43:45,512][1653645] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-06-15 15:43:45,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43144.5, 300 sec: 43320.4). Total num frames: 668925952. Throughput: 0: 10979.5. Samples: 167330816. 
Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:45,961][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:43:46,082][1653645] Updated weights for policy 0, policy_version 326627 (0.0013) [2024-06-15 15:43:47,480][1653645] Updated weights for policy 0, policy_version 326691 (0.0013) [2024-06-15 15:43:49,632][1653645] Updated weights for policy 0, policy_version 326768 (0.0029) [2024-06-15 15:43:50,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 44098.0). Total num frames: 669319168. Throughput: 0: 10991.0. Samples: 167360512. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:43:51,292][1653645] Updated weights for policy 0, policy_version 326844 (0.0015) [2024-06-15 15:43:55,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 669384704. Throughput: 0: 10774.7. Samples: 167422976. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:43:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:43:57,911][1653645] Updated weights for policy 0, policy_version 326899 (0.0039) [2024-06-15 15:43:59,777][1653645] Updated weights for policy 0, policy_version 326963 (0.0018) [2024-06-15 15:44:00,553][1653645] Updated weights for policy 0, policy_version 326980 (0.0011) [2024-06-15 15:44:00,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42052.1, 300 sec: 43986.9). Total num frames: 669679616. Throughput: 0: 11047.8. Samples: 167491584. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:44:00,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:44:02,613][1653645] Updated weights for policy 0, policy_version 327072 (0.0013) [2024-06-15 15:44:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 43653.7). Total num frames: 669908992. Throughput: 0: 10763.4. Samples: 167513600. 
Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:44:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:44:09,463][1653645] Updated weights for policy 0, policy_version 327120 (0.0012) [2024-06-15 15:44:10,647][1653645] Updated weights for policy 0, policy_version 327163 (0.0011) [2024-06-15 15:44:10,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 43690.5, 300 sec: 43542.6). Total num frames: 670040064. Throughput: 0: 10911.3. Samples: 167589376. Policy #0 lag: (min: 82.0, avg: 129.2, max: 337.0) [2024-06-15 15:44:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:44:12,679][1653645] Updated weights for policy 0, policy_version 327232 (0.0012) [2024-06-15 15:44:14,058][1653645] Updated weights for policy 0, policy_version 327283 (0.0014) [2024-06-15 15:44:15,556][1653645] Updated weights for policy 0, policy_version 327352 (0.0013) [2024-06-15 15:44:15,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 670433280. Throughput: 0: 10729.2. Samples: 167643648. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:44:20,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.6, 300 sec: 43098.9). Total num frames: 670433280. Throughput: 0: 10717.9. Samples: 167683072. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:44:22,743][1653645] Updated weights for policy 0, policy_version 327408 (0.0011) [2024-06-15 15:44:24,997][1651596] Signal inference workers to stop experience collection... (16950 times) [2024-06-15 15:44:25,043][1653645] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-06-15 15:44:25,357][1651596] Signal inference workers to resume experience collection... 
(16950 times) [2024-06-15 15:44:25,358][1653645] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-06-15 15:44:25,360][1653645] Updated weights for policy 0, policy_version 327520 (0.0013) [2024-06-15 15:44:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43144.6, 300 sec: 43875.8). Total num frames: 670793728. Throughput: 0: 11070.6. Samples: 167750656. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:44:27,555][1653645] Updated weights for policy 0, policy_version 327611 (0.0083) [2024-06-15 15:44:30,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.4, 300 sec: 43542.5). Total num frames: 670957568. Throughput: 0: 10660.9. Samples: 167810560. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:30,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:44:35,006][1653645] Updated weights for policy 0, policy_version 327665 (0.0013) [2024-06-15 15:44:35,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43144.7, 300 sec: 43653.6). Total num frames: 671121408. Throughput: 0: 10899.9. Samples: 167851008. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:44:36,692][1653645] Updated weights for policy 0, policy_version 327735 (0.0023) [2024-06-15 15:44:38,548][1653645] Updated weights for policy 0, policy_version 327794 (0.0110) [2024-06-15 15:44:40,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 671481856. Throughput: 0: 10581.3. Samples: 167899136. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:44:45,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 42598.2, 300 sec: 43098.2). Total num frames: 671481856. Throughput: 0: 10660.9. Samples: 167971328. 
Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:44:47,154][1653645] Updated weights for policy 0, policy_version 327873 (0.0015) [2024-06-15 15:44:48,703][1653645] Updated weights for policy 0, policy_version 327952 (0.0121) [2024-06-15 15:44:50,772][1653645] Updated weights for policy 0, policy_version 328032 (0.0013) [2024-06-15 15:44:50,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 41506.1, 300 sec: 43653.6). Total num frames: 671809536. Throughput: 0: 10843.0. Samples: 168001536. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:44:52,730][1653645] Updated weights for policy 0, policy_version 328118 (0.0013) [2024-06-15 15:44:55,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 43690.8, 300 sec: 43542.7). Total num frames: 672006144. Throughput: 0: 10501.8. Samples: 168061952. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:44:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:44:55,962][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000328128_672006144.pth... [2024-06-15 15:44:56,020][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000323072_661651456.pth [2024-06-15 15:45:00,057][1653645] Updated weights for policy 0, policy_version 328176 (0.0154) [2024-06-15 15:45:00,983][1648982] Fps is (10 sec: 35955.6, 60 sec: 41489.0, 300 sec: 43538.9). Total num frames: 672169984. Throughput: 0: 10780.2. Samples: 168129024. 
Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:45:00,983][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:45:02,583][1653645] Updated weights for policy 0, policy_version 328288 (0.0107) [2024-06-15 15:45:03,933][1653645] Updated weights for policy 0, policy_version 328342 (0.0014) [2024-06-15 15:45:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 672530432. Throughput: 0: 10490.3. Samples: 168155136. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:45:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:45:10,958][1648982] Fps is (10 sec: 36133.7, 60 sec: 41506.2, 300 sec: 43098.2). Total num frames: 672530432. Throughput: 0: 10501.6. Samples: 168223232. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:45:10,959][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 15:45:11,984][1651596] Signal inference workers to stop experience collection... (17000 times) [2024-06-15 15:45:12,013][1653645] Updated weights for policy 0, policy_version 328401 (0.0084) [2024-06-15 15:45:12,053][1653645] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-06-15 15:45:12,202][1651596] Signal inference workers to resume experience collection... (17000 times) [2024-06-15 15:45:12,203][1653645] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-06-15 15:45:13,662][1653645] Updated weights for policy 0, policy_version 328490 (0.0019) [2024-06-15 15:45:14,952][1653645] Updated weights for policy 0, policy_version 328530 (0.0059) [2024-06-15 15:45:15,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 40960.0, 300 sec: 43431.5). Total num frames: 672890880. Throughput: 0: 10615.6. Samples: 168288256. 
Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:45:15,958][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 15:45:17,144][1653645] Updated weights for policy 0, policy_version 328617 (0.0183) [2024-06-15 15:45:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 673054720. Throughput: 0: 10308.2. Samples: 168314880. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:45:20,961][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:45:23,564][1653645] Updated weights for policy 0, policy_version 328646 (0.0012) [2024-06-15 15:45:25,431][1653645] Updated weights for policy 0, policy_version 328736 (0.0013) [2024-06-15 15:45:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 43320.4). Total num frames: 673284096. Throughput: 0: 10820.3. Samples: 168386048. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:45:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:45:26,472][1653645] Updated weights for policy 0, policy_version 328770 (0.0012) [2024-06-15 15:45:28,207][1653645] Updated weights for policy 0, policy_version 328835 (0.0012) [2024-06-15 15:45:29,514][1653645] Updated weights for policy 0, policy_version 328896 (0.0010) [2024-06-15 15:45:30,958][1648982] Fps is (10 sec: 52430.3, 60 sec: 43690.9, 300 sec: 43320.4). Total num frames: 673579008. Throughput: 0: 10513.1. Samples: 168444416. Policy #0 lag: (min: 51.0, avg: 175.8, max: 297.0) [2024-06-15 15:45:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:45:35,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 41506.1, 300 sec: 42765.0). Total num frames: 673611776. Throughput: 0: 10695.1. Samples: 168482816. 
Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:45:35,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:45:37,076][1653645] Updated weights for policy 0, policy_version 328955 (0.0012) [2024-06-15 15:45:38,475][1653645] Updated weights for policy 0, policy_version 329008 (0.0013) [2024-06-15 15:45:40,028][1653645] Updated weights for policy 0, policy_version 329072 (0.0012) [2024-06-15 15:45:40,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 674004992. Throughput: 0: 10763.4. Samples: 168546304. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:45:40,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:45:41,447][1653645] Updated weights for policy 0, policy_version 329124 (0.0011) [2024-06-15 15:45:45,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.8, 300 sec: 42765.0). Total num frames: 674103296. Throughput: 0: 10826.2. Samples: 168615936. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:45:45,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:45:48,510][1653645] Updated weights for policy 0, policy_version 329184 (0.0092) [2024-06-15 15:45:49,873][1653645] Updated weights for policy 0, policy_version 329232 (0.0012) [2024-06-15 15:45:50,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 42052.3, 300 sec: 43098.3). Total num frames: 674332672. Throughput: 0: 10945.4. Samples: 168647680. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:45:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:45:51,078][1651596] Signal inference workers to stop experience collection... (17050 times) [2024-06-15 15:45:51,148][1653645] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-06-15 15:45:51,372][1651596] Signal inference workers to resume experience collection... 
(17050 times) [2024-06-15 15:45:51,373][1653645] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-06-15 15:45:51,513][1653645] Updated weights for policy 0, policy_version 329299 (0.0091) [2024-06-15 15:45:53,296][1653645] Updated weights for policy 0, policy_version 329379 (0.0013) [2024-06-15 15:45:55,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.4, 300 sec: 43098.2). Total num frames: 674627584. Throughput: 0: 10581.3. Samples: 168699392. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:45:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:46:00,747][1653645] Updated weights for policy 0, policy_version 329430 (0.0012) [2024-06-15 15:46:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42069.7, 300 sec: 42876.1). Total num frames: 674693120. Throughput: 0: 10865.8. Samples: 168777216. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:46:02,550][1653645] Updated weights for policy 0, policy_version 329506 (0.0011) [2024-06-15 15:46:04,636][1653645] Updated weights for policy 0, policy_version 329587 (0.0024) [2024-06-15 15:46:05,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 42598.3, 300 sec: 43209.3). Total num frames: 675086336. Throughput: 0: 10808.9. Samples: 168801280. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:05,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:46:06,242][1653645] Updated weights for policy 0, policy_version 329653 (0.0047) [2024-06-15 15:46:10,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.8, 300 sec: 42653.9). Total num frames: 675151872. Throughput: 0: 10729.2. Samples: 168868864. 
Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:46:12,901][1653645] Updated weights for policy 0, policy_version 329712 (0.0013) [2024-06-15 15:46:14,843][1653645] Updated weights for policy 0, policy_version 329787 (0.0098) [2024-06-15 15:46:15,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43144.5, 300 sec: 43320.4). Total num frames: 675479552. Throughput: 0: 10786.1. Samples: 168929792. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:46:16,223][1653645] Updated weights for policy 0, policy_version 329840 (0.0018) [2024-06-15 15:46:17,442][1653645] Updated weights for policy 0, policy_version 329888 (0.0015) [2024-06-15 15:46:20,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.6, 300 sec: 42653.9). Total num frames: 675676160. Throughput: 0: 10649.5. Samples: 168962048. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:46:23,973][1653645] Updated weights for policy 0, policy_version 329952 (0.0027) [2024-06-15 15:46:25,314][1653645] Updated weights for policy 0, policy_version 329986 (0.0032) [2024-06-15 15:46:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 675840000. Throughput: 0: 10820.3. Samples: 169033216. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:46:26,768][1653645] Updated weights for policy 0, policy_version 330049 (0.0012) [2024-06-15 15:46:28,189][1653645] Updated weights for policy 0, policy_version 330108 (0.0013) [2024-06-15 15:46:29,888][1653645] Updated weights for policy 0, policy_version 330160 (0.0013) [2024-06-15 15:46:30,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 676200448. 
Throughput: 0: 10706.5. Samples: 169097728. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:46:35,398][1653645] Updated weights for policy 0, policy_version 330194 (0.0065) [2024-06-15 15:46:35,803][1651596] Signal inference workers to stop experience collection... (17100 times) [2024-06-15 15:46:35,856][1653645] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-06-15 15:46:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.9, 300 sec: 42876.1). Total num frames: 676265984. Throughput: 0: 10831.6. Samples: 169135104. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:46:36,083][1651596] Signal inference workers to resume experience collection... (17100 times) [2024-06-15 15:46:36,084][1653645] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-06-15 15:46:36,460][1653645] Updated weights for policy 0, policy_version 330240 (0.0010) [2024-06-15 15:46:38,984][1653645] Updated weights for policy 0, policy_version 330320 (0.0013) [2024-06-15 15:46:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 676593664. Throughput: 0: 11013.7. Samples: 169195008. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:46:42,163][1653645] Updated weights for policy 0, policy_version 330400 (0.0023) [2024-06-15 15:46:45,958][1648982] Fps is (10 sec: 45873.5, 60 sec: 43690.4, 300 sec: 42653.9). Total num frames: 676724736. Throughput: 0: 10729.1. Samples: 169260032. 
Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:45,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:46:48,095][1653645] Updated weights for policy 0, policy_version 330464 (0.0012) [2024-06-15 15:46:50,067][1653645] Updated weights for policy 0, policy_version 330529 (0.0012) [2024-06-15 15:46:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.8, 300 sec: 43209.4). Total num frames: 676986880. Throughput: 0: 11013.7. Samples: 169296896. Policy #0 lag: (min: 15.0, avg: 70.1, max: 271.0) [2024-06-15 15:46:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:46:51,909][1653645] Updated weights for policy 0, policy_version 330592 (0.0013) [2024-06-15 15:46:53,664][1653645] Updated weights for policy 0, policy_version 330628 (0.0013) [2024-06-15 15:46:55,960][1648982] Fps is (10 sec: 52417.8, 60 sec: 43689.1, 300 sec: 42875.7). Total num frames: 677249024. Throughput: 0: 10751.4. Samples: 169352704. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:46:55,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:46:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000330688_677249024.pth... [2024-06-15 15:46:56,025][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000325568_666763264.pth [2024-06-15 15:47:00,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43690.6, 300 sec: 42877.3). Total num frames: 677314560. Throughput: 0: 10990.9. Samples: 169424384. 
Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:47:01,088][1653645] Updated weights for policy 0, policy_version 330736 (0.0089) [2024-06-15 15:47:02,701][1653645] Updated weights for policy 0, policy_version 330807 (0.0033) [2024-06-15 15:47:05,184][1653645] Updated weights for policy 0, policy_version 330877 (0.0014) [2024-06-15 15:47:05,958][1648982] Fps is (10 sec: 39331.2, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 677642240. Throughput: 0: 10809.0. Samples: 169448448. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:47:06,966][1653645] Updated weights for policy 0, policy_version 330912 (0.0012) [2024-06-15 15:47:10,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 43690.5, 300 sec: 42653.9). Total num frames: 677773312. Throughput: 0: 10729.2. Samples: 169516032. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:47:12,436][1653645] Updated weights for policy 0, policy_version 330947 (0.0029) [2024-06-15 15:47:14,295][1653645] Updated weights for policy 0, policy_version 331024 (0.0033) [2024-06-15 15:47:15,511][1653645] Updated weights for policy 0, policy_version 331066 (0.0019) [2024-06-15 15:47:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 43209.4). Total num frames: 678035456. Throughput: 0: 10638.2. Samples: 169576448. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:47:16,948][1653645] Updated weights for policy 0, policy_version 331128 (0.0014) [2024-06-15 15:47:19,045][1653645] Updated weights for policy 0, policy_version 331184 (0.0013) [2024-06-15 15:47:20,967][1648982] Fps is (10 sec: 52380.5, 60 sec: 43684.0, 300 sec: 42763.6). Total num frames: 678297600. 
Throughput: 0: 10658.7. Samples: 169614848. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:20,968][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:47:23,307][1651596] Signal inference workers to stop experience collection... (17150 times) [2024-06-15 15:47:23,380][1653645] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-06-15 15:47:23,790][1651596] Signal inference workers to resume experience collection... (17150 times) [2024-06-15 15:47:23,791][1653645] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-06-15 15:47:23,993][1653645] Updated weights for policy 0, policy_version 331220 (0.0020) [2024-06-15 15:47:25,504][1653645] Updated weights for policy 0, policy_version 331280 (0.0016) [2024-06-15 15:47:25,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 678461440. Throughput: 0: 10888.5. Samples: 169684992. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:25,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 15:47:28,088][1653645] Updated weights for policy 0, policy_version 331330 (0.0015) [2024-06-15 15:47:29,213][1653645] Updated weights for policy 0, policy_version 331386 (0.0013) [2024-06-15 15:47:30,958][1648982] Fps is (10 sec: 49197.7, 60 sec: 43144.4, 300 sec: 43320.4). Total num frames: 678789120. Throughput: 0: 10820.3. Samples: 169746944. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:30,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:47:31,155][1653645] Updated weights for policy 0, policy_version 331447 (0.0031) [2024-06-15 15:47:35,550][1653645] Updated weights for policy 0, policy_version 331488 (0.0020) [2024-06-15 15:47:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 42876.2). Total num frames: 678887424. Throughput: 0: 10797.5. Samples: 169782784. 
Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:35,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 15:47:36,244][1653645] Updated weights for policy 0, policy_version 331515 (0.0021) [2024-06-15 15:47:38,210][1653645] Updated weights for policy 0, policy_version 331580 (0.0013) [2024-06-15 15:47:40,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 42598.4, 300 sec: 43431.5). Total num frames: 679149568. Throughput: 0: 11093.9. Samples: 169851904. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:47:41,259][1653645] Updated weights for policy 0, policy_version 331636 (0.0012) [2024-06-15 15:47:42,620][1653645] Updated weights for policy 0, policy_version 331680 (0.0016) [2024-06-15 15:47:45,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43691.0, 300 sec: 42654.0). Total num frames: 679346176. Throughput: 0: 11025.1. Samples: 169920512. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:47:46,857][1653645] Updated weights for policy 0, policy_version 331744 (0.0014) [2024-06-15 15:47:48,587][1653645] Updated weights for policy 0, policy_version 331793 (0.0016) [2024-06-15 15:47:50,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 679608320. Throughput: 0: 11150.2. Samples: 169950208. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:50,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:47:51,968][1653645] Updated weights for policy 0, policy_version 331860 (0.0012) [2024-06-15 15:47:54,930][1653645] Updated weights for policy 0, policy_version 331959 (0.0014) [2024-06-15 15:47:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43692.4, 300 sec: 43098.2). Total num frames: 679870464. Throughput: 0: 11173.0. Samples: 170018816. 
Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:47:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:47:58,917][1653645] Updated weights for policy 0, policy_version 332000 (0.0014) [2024-06-15 15:48:00,551][1653645] Updated weights for policy 0, policy_version 332084 (0.0015) [2024-06-15 15:48:00,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 46967.5, 300 sec: 43542.6). Total num frames: 680132608. Throughput: 0: 11309.5. Samples: 170085376. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:48:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:48:03,648][1653645] Updated weights for policy 0, policy_version 332128 (0.0014) [2024-06-15 15:48:05,212][1653645] Updated weights for policy 0, policy_version 332176 (0.0012) [2024-06-15 15:48:05,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 45328.9, 300 sec: 43875.8). Total num frames: 680361984. Throughput: 0: 11255.0. Samples: 170121216. Policy #0 lag: (min: 15.0, avg: 149.5, max: 271.0) [2024-06-15 15:48:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:48:06,095][1653645] Updated weights for policy 0, policy_version 332223 (0.0013) [2024-06-15 15:48:09,766][1651596] Signal inference workers to stop experience collection... (17200 times) [2024-06-15 15:48:09,871][1653645] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-06-15 15:48:10,057][1651596] Signal inference workers to resume experience collection... (17200 times) [2024-06-15 15:48:10,059][1653645] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-06-15 15:48:10,266][1653645] Updated weights for policy 0, policy_version 332259 (0.0021) [2024-06-15 15:48:10,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.4, 300 sec: 43098.2). Total num frames: 680525824. Throughput: 0: 11434.6. Samples: 170199552. 
Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:10,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:48:11,235][1653645] Updated weights for policy 0, policy_version 332313 (0.0013) [2024-06-15 15:48:14,614][1653645] Updated weights for policy 0, policy_version 332384 (0.0015) [2024-06-15 15:48:15,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45875.0, 300 sec: 43986.8). Total num frames: 680787968. Throughput: 0: 11400.5. Samples: 170259968. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:48:16,728][1653645] Updated weights for policy 0, policy_version 332432 (0.0035) [2024-06-15 15:48:17,819][1653645] Updated weights for policy 0, policy_version 332480 (0.0011) [2024-06-15 15:48:20,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 44243.9, 300 sec: 43209.4). Total num frames: 680951808. Throughput: 0: 11423.3. Samples: 170296832. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:48:22,452][1653645] Updated weights for policy 0, policy_version 332550 (0.0059) [2024-06-15 15:48:25,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 45875.1, 300 sec: 43653.6). Total num frames: 681213952. Throughput: 0: 11332.3. Samples: 170361856. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:48:26,083][1653645] Updated weights for policy 0, policy_version 332640 (0.0014) [2024-06-15 15:48:28,485][1653645] Updated weights for policy 0, policy_version 332704 (0.0012) [2024-06-15 15:48:30,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44237.0, 300 sec: 43764.8). Total num frames: 681443328. Throughput: 0: 11343.6. Samples: 170430976. 
Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:48:33,053][1653645] Updated weights for policy 0, policy_version 332784 (0.0013) [2024-06-15 15:48:35,690][1653645] Updated weights for policy 0, policy_version 332851 (0.0012) [2024-06-15 15:48:35,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 46967.5, 300 sec: 43542.6). Total num frames: 681705472. Throughput: 0: 11389.2. Samples: 170462720. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:48:37,798][1653645] Updated weights for policy 0, policy_version 332880 (0.0012) [2024-06-15 15:48:40,014][1653645] Updated weights for policy 0, policy_version 332930 (0.0012) [2024-06-15 15:48:40,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45875.3, 300 sec: 43986.9). Total num frames: 681902080. Throughput: 0: 11332.3. Samples: 170528768. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:48:41,292][1653645] Updated weights for policy 0, policy_version 332988 (0.0011) [2024-06-15 15:48:44,279][1653645] Updated weights for policy 0, policy_version 333042 (0.0019) [2024-06-15 15:48:45,959][1648982] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 43431.5). Total num frames: 682131456. Throughput: 0: 11491.6. Samples: 170602496. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:45,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:48:46,961][1653645] Updated weights for policy 0, policy_version 333104 (0.0013) [2024-06-15 15:48:50,044][1653645] Updated weights for policy 0, policy_version 333169 (0.0114) [2024-06-15 15:48:50,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 682360832. Throughput: 0: 11434.7. Samples: 170635776. 
Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:48:52,086][1653645] Updated weights for policy 0, policy_version 333216 (0.0013) [2024-06-15 15:48:54,968][1653645] Updated weights for policy 0, policy_version 333264 (0.0012) [2024-06-15 15:48:55,347][1651596] Signal inference workers to stop experience collection... (17250 times) [2024-06-15 15:48:55,409][1653645] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-06-15 15:48:55,530][1651596] Signal inference workers to resume experience collection... (17250 times) [2024-06-15 15:48:55,531][1653645] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-06-15 15:48:55,855][1653645] Updated weights for policy 0, policy_version 333312 (0.0011) [2024-06-15 15:48:55,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.2, 300 sec: 43875.8). Total num frames: 682622976. Throughput: 0: 11195.7. Samples: 170703360. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:48:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:48:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000333312_682622976.pth... [2024-06-15 15:48:56,027][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000328128_672006144.pth [2024-06-15 15:48:58,647][1653645] Updated weights for policy 0, policy_version 333373 (0.0015) [2024-06-15 15:49:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.6, 300 sec: 43653.6). Total num frames: 682786816. Throughput: 0: 11355.0. Samples: 170770944. 
Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:49:00,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:49:01,749][1653645] Updated weights for policy 0, policy_version 333439 (0.0013) [2024-06-15 15:49:04,490][1653645] Updated weights for policy 0, policy_version 333498 (0.0014) [2024-06-15 15:49:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 683016192. Throughput: 0: 11377.8. Samples: 170808832. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:49:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:49:07,223][1653645] Updated weights for policy 0, policy_version 333563 (0.0013) [2024-06-15 15:49:09,655][1653645] Updated weights for policy 0, policy_version 333627 (0.0013) [2024-06-15 15:49:10,958][1648982] Fps is (10 sec: 49153.3, 60 sec: 45875.2, 300 sec: 43542.5). Total num frames: 683278336. Throughput: 0: 11184.4. Samples: 170865152. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:49:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:49:13,963][1653645] Updated weights for policy 0, policy_version 333693 (0.0105) [2024-06-15 15:49:15,967][1648982] Fps is (10 sec: 45831.6, 60 sec: 44776.0, 300 sec: 44207.6). Total num frames: 683474944. Throughput: 0: 11341.2. Samples: 170941440. Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:49:15,968][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:49:16,465][1653645] Updated weights for policy 0, policy_version 333757 (0.0019) [2024-06-15 15:49:19,341][1653645] Updated weights for policy 0, policy_version 333824 (0.0013) [2024-06-15 15:49:20,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 46421.2, 300 sec: 43875.8). Total num frames: 683737088. Throughput: 0: 11366.4. Samples: 170974208. 
Policy #0 lag: (min: 15.0, avg: 99.5, max: 271.0) [2024-06-15 15:49:20,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:49:21,480][1653645] Updated weights for policy 0, policy_version 333886 (0.0013) [2024-06-15 15:49:25,958][1648982] Fps is (10 sec: 42638.8, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 683900928. Throughput: 0: 11309.5. Samples: 171037696. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:49:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:49:26,068][1653645] Updated weights for policy 0, policy_version 333944 (0.0013) [2024-06-15 15:49:28,088][1653645] Updated weights for policy 0, policy_version 334000 (0.0012) [2024-06-15 15:49:30,744][1653645] Updated weights for policy 0, policy_version 334034 (0.0026) [2024-06-15 15:49:30,959][1648982] Fps is (10 sec: 36040.1, 60 sec: 44235.7, 300 sec: 43986.7). Total num frames: 684097536. Throughput: 0: 11218.1. Samples: 171107328. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:49:30,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:49:31,856][1653645] Updated weights for policy 0, policy_version 334080 (0.0019) [2024-06-15 15:49:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 684326912. Throughput: 0: 11025.1. Samples: 171131904. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:49:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:49:36,817][1653645] Updated weights for policy 0, policy_version 334147 (0.0014) [2024-06-15 15:49:39,409][1653645] Updated weights for policy 0, policy_version 334209 (0.0012) [2024-06-15 15:49:40,958][1648982] Fps is (10 sec: 49158.1, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 684589056. Throughput: 0: 11070.6. Samples: 171201536. 
Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:49:40,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:49:40,960][1653645] Updated weights for policy 0, policy_version 334272 (0.0012) [2024-06-15 15:49:43,708][1651596] Signal inference workers to stop experience collection... (17300 times) [2024-06-15 15:49:43,753][1653645] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-06-15 15:49:43,966][1651596] Signal inference workers to resume experience collection... (17300 times) [2024-06-15 15:49:43,967][1653645] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-06-15 15:49:44,637][1653645] Updated weights for policy 0, policy_version 334368 (0.0012) [2024-06-15 15:49:45,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 684851200. Throughput: 0: 10888.6. Samples: 171260928. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:49:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:49:50,157][1653645] Updated weights for policy 0, policy_version 334462 (0.0120) [2024-06-15 15:49:50,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 684982272. Throughput: 0: 10820.3. Samples: 171295744. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:49:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:49:54,845][1653645] Updated weights for policy 0, policy_version 334544 (0.0012) [2024-06-15 15:49:55,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 44323.8). Total num frames: 685244416. Throughput: 0: 11013.7. Samples: 171360768. 
Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:49:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:49:56,435][1653645] Updated weights for policy 0, policy_version 334594 (0.0011) [2024-06-15 15:49:57,781][1653645] Updated weights for policy 0, policy_version 334653 (0.0069) [2024-06-15 15:50:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43144.8, 300 sec: 43542.6). Total num frames: 685375488. Throughput: 0: 10777.0. Samples: 171426304. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 15:50:02,859][1653645] Updated weights for policy 0, policy_version 334714 (0.0013) [2024-06-15 15:50:05,105][1653645] Updated weights for policy 0, policy_version 334775 (0.0126) [2024-06-15 15:50:05,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 685637632. Throughput: 0: 10740.7. Samples: 171457536. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:50:08,027][1653645] Updated weights for policy 0, policy_version 334832 (0.0012) [2024-06-15 15:50:09,949][1653645] Updated weights for policy 0, policy_version 334903 (0.0013) [2024-06-15 15:50:10,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 685899776. Throughput: 0: 10661.0. Samples: 171517440. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:10,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:50:13,511][1653645] Updated weights for policy 0, policy_version 334936 (0.0015) [2024-06-15 15:50:14,361][1653645] Updated weights for policy 0, policy_version 334976 (0.0022) [2024-06-15 15:50:15,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 43151.2, 300 sec: 44097.9). Total num frames: 686063616. Throughput: 0: 10820.5. Samples: 171594240. 
Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:15,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:50:16,665][1653645] Updated weights for policy 0, policy_version 335039 (0.0014) [2024-06-15 15:50:20,068][1653645] Updated weights for policy 0, policy_version 335105 (0.0014) [2024-06-15 15:50:20,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 686358528. Throughput: 0: 11082.0. Samples: 171630592. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:50:24,783][1653645] Updated weights for policy 0, policy_version 335185 (0.0014) [2024-06-15 15:50:25,972][1648982] Fps is (10 sec: 49082.8, 60 sec: 44226.2, 300 sec: 43984.7). Total num frames: 686555136. Throughput: 0: 10976.1. Samples: 171695616. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:25,973][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:50:27,425][1653645] Updated weights for policy 0, policy_version 335237 (0.0014) [2024-06-15 15:50:28,580][1653645] Updated weights for policy 0, policy_version 335295 (0.0013) [2024-06-15 15:50:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44784.0, 300 sec: 44653.3). Total num frames: 686784512. Throughput: 0: 11127.5. Samples: 171761664. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:50:31,017][1653645] Updated weights for policy 0, policy_version 335352 (0.0030) [2024-06-15 15:50:32,590][1651596] Signal inference workers to stop experience collection... (17350 times) [2024-06-15 15:50:32,659][1653645] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-06-15 15:50:32,771][1651596] Signal inference workers to resume experience collection... 
(17350 times) [2024-06-15 15:50:32,772][1653645] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-06-15 15:50:33,338][1653645] Updated weights for policy 0, policy_version 335422 (0.0013) [2024-06-15 15:50:35,958][1648982] Fps is (10 sec: 39378.2, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 686948352. Throughput: 0: 10979.6. Samples: 171789824. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:50:37,132][1653645] Updated weights for policy 0, policy_version 335477 (0.0101) [2024-06-15 15:50:39,310][1653645] Updated weights for policy 0, policy_version 335507 (0.0013) [2024-06-15 15:50:40,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 687210496. Throughput: 0: 11252.5. Samples: 171867136. Policy #0 lag: (min: 47.0, avg: 144.3, max: 303.0) [2024-06-15 15:50:40,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 15:50:41,138][1653645] Updated weights for policy 0, policy_version 335568 (0.0029) [2024-06-15 15:50:42,404][1653645] Updated weights for policy 0, policy_version 335615 (0.0019) [2024-06-15 15:50:44,798][1653645] Updated weights for policy 0, policy_version 335679 (0.0095) [2024-06-15 15:50:45,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 44542.2). Total num frames: 687472640. Throughput: 0: 11138.8. Samples: 171927552. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:50:45,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:50:48,979][1653645] Updated weights for policy 0, policy_version 335744 (0.0095) [2024-06-15 15:50:50,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 687636480. Throughput: 0: 11264.0. Samples: 171964416. 
Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:50:50,958][1648982] Avg episode reward: [(0, '36.960')] [2024-06-15 15:50:52,803][1653645] Updated weights for policy 0, policy_version 335815 (0.0015) [2024-06-15 15:50:55,833][1653645] Updated weights for policy 0, policy_version 335888 (0.0238) [2024-06-15 15:50:55,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 687898624. Throughput: 0: 11286.8. Samples: 172025344. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:50:55,958][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 15:50:56,586][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000335920_687964160.pth... [2024-06-15 15:50:56,641][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000330688_677249024.pth [2024-06-15 15:50:56,645][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000335920_687964160.pth [2024-06-15 15:50:59,668][1653645] Updated weights for policy 0, policy_version 335938 (0.0014) [2024-06-15 15:51:00,909][1653645] Updated weights for policy 0, policy_version 335988 (0.0013) [2024-06-15 15:51:00,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 45328.8, 300 sec: 44097.9). Total num frames: 688095232. Throughput: 0: 11138.8. Samples: 172095488. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:00,959][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 15:51:02,979][1653645] Updated weights for policy 0, policy_version 336038 (0.0013) [2024-06-15 15:51:05,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 688291840. Throughput: 0: 10979.6. Samples: 172124672. 
Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:05,958][1648982] Avg episode reward: [(0, '36.910')] [2024-06-15 15:51:06,375][1653645] Updated weights for policy 0, policy_version 336112 (0.0026) [2024-06-15 15:51:08,439][1653645] Updated weights for policy 0, policy_version 336160 (0.0121) [2024-06-15 15:51:10,959][1648982] Fps is (10 sec: 42599.4, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 688521216. Throughput: 0: 11005.8. Samples: 172190720. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:10,960][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 15:51:12,068][1653645] Updated weights for policy 0, policy_version 336224 (0.0111) [2024-06-15 15:51:15,786][1653645] Updated weights for policy 0, policy_version 336314 (0.0015) [2024-06-15 15:51:15,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45329.2, 300 sec: 44431.2). Total num frames: 688783360. Throughput: 0: 10979.6. Samples: 172255744. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:15,958][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 15:51:18,760][1653645] Updated weights for policy 0, policy_version 336368 (0.0011) [2024-06-15 15:51:20,648][1653645] Updated weights for policy 0, policy_version 336432 (0.0013) [2024-06-15 15:51:20,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 689012736. Throughput: 0: 11036.4. Samples: 172286464. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:20,958][1648982] Avg episode reward: [(0, '36.880')] [2024-06-15 15:51:23,737][1651596] Signal inference workers to stop experience collection... (17400 times) [2024-06-15 15:51:23,788][1653645] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-06-15 15:51:23,980][1651596] Signal inference workers to resume experience collection... 
(17400 times) [2024-06-15 15:51:23,981][1653645] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-06-15 15:51:24,548][1653645] Updated weights for policy 0, policy_version 336488 (0.0013) [2024-06-15 15:51:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43701.1, 300 sec: 43986.9). Total num frames: 689176576. Throughput: 0: 10843.1. Samples: 172355072. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:25,959][1648982] Avg episode reward: [(0, '36.980')] [2024-06-15 15:51:26,945][1653645] Updated weights for policy 0, policy_version 336528 (0.0011) [2024-06-15 15:51:27,884][1653645] Updated weights for policy 0, policy_version 336569 (0.0016) [2024-06-15 15:51:30,139][1653645] Updated weights for policy 0, policy_version 336610 (0.0013) [2024-06-15 15:51:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 689438720. Throughput: 0: 11013.7. Samples: 172423168. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:30,958][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 15:51:31,207][1653645] Updated weights for policy 0, policy_version 336643 (0.0011) [2024-06-15 15:51:32,292][1653645] Updated weights for policy 0, policy_version 336698 (0.0011) [2024-06-15 15:51:35,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 689668096. Throughput: 0: 10979.6. Samples: 172458496. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:35,958][1648982] Avg episode reward: [(0, '36.890')] [2024-06-15 15:51:36,024][1653645] Updated weights for policy 0, policy_version 336765 (0.0028) [2024-06-15 15:51:39,130][1653645] Updated weights for policy 0, policy_version 336822 (0.0061) [2024-06-15 15:51:40,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44237.0, 300 sec: 44542.3). Total num frames: 689864704. Throughput: 0: 11195.7. Samples: 172529152. 
Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:40,958][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 15:51:41,920][1653645] Updated weights for policy 0, policy_version 336887 (0.0147) [2024-06-15 15:51:44,189][1653645] Updated weights for policy 0, policy_version 336956 (0.0014) [2024-06-15 15:51:45,958][1648982] Fps is (10 sec: 42596.3, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 690094080. Throughput: 0: 10934.0. Samples: 172587520. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:45,959][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 15:51:47,944][1653645] Updated weights for policy 0, policy_version 337000 (0.0014) [2024-06-15 15:51:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44098.3). Total num frames: 690257920. Throughput: 0: 11093.3. Samples: 172623872. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:50,958][1648982] Avg episode reward: [(0, '36.850')] [2024-06-15 15:51:51,086][1653645] Updated weights for policy 0, policy_version 337041 (0.0031) [2024-06-15 15:51:51,972][1653645] Updated weights for policy 0, policy_version 337081 (0.0015) [2024-06-15 15:51:53,665][1653645] Updated weights for policy 0, policy_version 337137 (0.0089) [2024-06-15 15:51:54,270][1653645] Updated weights for policy 0, policy_version 337154 (0.0009) [2024-06-15 15:51:55,962][1648982] Fps is (10 sec: 52407.1, 60 sec: 45325.6, 300 sec: 45097.0). Total num frames: 690618368. Throughput: 0: 11206.0. Samples: 172695040. Policy #0 lag: (min: 41.0, avg: 156.8, max: 297.0) [2024-06-15 15:51:55,963][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 15:51:58,268][1653645] Updated weights for policy 0, policy_version 337220 (0.0012) [2024-06-15 15:52:00,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44237.0, 300 sec: 44431.2). Total num frames: 690749440. Throughput: 0: 11264.0. Samples: 172762624. 
Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:00,958][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 15:52:02,983][1653645] Updated weights for policy 0, policy_version 337298 (0.0025) [2024-06-15 15:52:03,773][1653645] Updated weights for policy 0, policy_version 337338 (0.0099) [2024-06-15 15:52:05,958][1648982] Fps is (10 sec: 39339.6, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 691011584. Throughput: 0: 11286.7. Samples: 172794368. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:05,959][1648982] Avg episode reward: [(0, '36.860')] [2024-06-15 15:52:06,202][1653645] Updated weights for policy 0, policy_version 337410 (0.0013) [2024-06-15 15:52:07,564][1653645] Updated weights for policy 0, policy_version 337472 (0.0010) [2024-06-15 15:52:10,434][1651596] Signal inference workers to stop experience collection... (17450 times) [2024-06-15 15:52:10,492][1653645] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-06-15 15:52:10,741][1651596] Signal inference workers to resume experience collection... (17450 times) [2024-06-15 15:52:10,742][1653645] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-06-15 15:52:10,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45329.2, 300 sec: 44764.4). Total num frames: 691240960. Throughput: 0: 11275.4. Samples: 172862464. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:10,958][1648982] Avg episode reward: [(0, '36.970')] [2024-06-15 15:52:10,966][1653645] Updated weights for policy 0, policy_version 337531 (0.0028) [2024-06-15 15:52:15,285][1653645] Updated weights for policy 0, policy_version 337588 (0.0096) [2024-06-15 15:52:15,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.5, 300 sec: 44432.6). Total num frames: 691404800. Throughput: 0: 11229.8. Samples: 172928512. 
Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:15,959][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 15:52:17,555][1653645] Updated weights for policy 0, policy_version 337656 (0.0020) [2024-06-15 15:52:19,175][1653645] Updated weights for policy 0, policy_version 337712 (0.0014) [2024-06-15 15:52:20,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 691666944. Throughput: 0: 11082.0. Samples: 172957184. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:20,958][1648982] Avg episode reward: [(0, '37.000')] [2024-06-15 15:52:22,073][1653645] Updated weights for policy 0, policy_version 337763 (0.0013) [2024-06-15 15:52:25,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 691798016. Throughput: 0: 11150.2. Samples: 173030912. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:25,971][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 15:52:27,532][1653645] Updated weights for policy 0, policy_version 337840 (0.0015) [2024-06-15 15:52:29,069][1653645] Updated weights for policy 0, policy_version 337904 (0.0282) [2024-06-15 15:52:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 692125696. Throughput: 0: 11150.4. Samples: 173089280. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:30,958][1648982] Avg episode reward: [(0, '36.980')] [2024-06-15 15:52:31,103][1653645] Updated weights for policy 0, policy_version 337957 (0.0114) [2024-06-15 15:52:31,529][1653645] Updated weights for policy 0, policy_version 337984 (0.0011) [2024-06-15 15:52:34,500][1653645] Updated weights for policy 0, policy_version 338041 (0.0013) [2024-06-15 15:52:35,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 692322304. Throughput: 0: 11172.9. Samples: 173126656. 
Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:35,959][1648982] Avg episode reward: [(0, '36.840')] [2024-06-15 15:52:39,224][1653645] Updated weights for policy 0, policy_version 338083 (0.0021) [2024-06-15 15:52:40,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 692551680. Throughput: 0: 11196.9. Samples: 173198848. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:40,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 15:52:41,035][1653645] Updated weights for policy 0, policy_version 338174 (0.0089) [2024-06-15 15:52:42,284][1653645] Updated weights for policy 0, policy_version 338224 (0.0012) [2024-06-15 15:52:45,906][1653645] Updated weights for policy 0, policy_version 338259 (0.0013) [2024-06-15 15:52:45,959][1648982] Fps is (10 sec: 42599.1, 60 sec: 44237.1, 300 sec: 44542.3). Total num frames: 692748288. Throughput: 0: 11104.7. Samples: 173262336. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:45,960][1648982] Avg episode reward: [(0, '36.910')] [2024-06-15 15:52:50,577][1653645] Updated weights for policy 0, policy_version 338336 (0.0014) [2024-06-15 15:52:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 692944896. Throughput: 0: 11241.3. Samples: 173300224. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:50,958][1648982] Avg episode reward: [(0, '36.950')] [2024-06-15 15:52:53,752][1653645] Updated weights for policy 0, policy_version 338448 (0.0012) [2024-06-15 15:52:54,602][1653645] Updated weights for policy 0, policy_version 338494 (0.0102) [2024-06-15 15:52:55,958][1648982] Fps is (10 sec: 49150.2, 60 sec: 43693.8, 300 sec: 44431.1). Total num frames: 693239808. Throughput: 0: 10945.3. Samples: 173355008. 
Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:52:55,959][1648982] Avg episode reward: [(0, '36.910')] [2024-06-15 15:52:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000338496_693239808.pth... [2024-06-15 15:52:56,009][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000333312_682622976.pth [2024-06-15 15:52:57,290][1651596] Signal inference workers to stop experience collection... (17500 times) [2024-06-15 15:52:57,331][1653645] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-06-15 15:52:57,506][1651596] Signal inference workers to resume experience collection... (17500 times) [2024-06-15 15:52:57,507][1653645] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-06-15 15:52:58,536][1653645] Updated weights for policy 0, policy_version 338552 (0.0012) [2024-06-15 15:53:00,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 693370880. Throughput: 0: 11195.8. Samples: 173432320. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:53:00,958][1648982] Avg episode reward: [(0, '36.710')] [2024-06-15 15:53:03,249][1653645] Updated weights for policy 0, policy_version 338624 (0.0011) [2024-06-15 15:53:04,435][1653645] Updated weights for policy 0, policy_version 338682 (0.0025) [2024-06-15 15:53:05,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 693698560. Throughput: 0: 11184.3. Samples: 173460480. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:53:05,959][1648982] Avg episode reward: [(0, '36.940')] [2024-06-15 15:53:06,115][1653645] Updated weights for policy 0, policy_version 338736 (0.0016) [2024-06-15 15:53:09,606][1653645] Updated weights for policy 0, policy_version 338784 (0.0076) [2024-06-15 15:53:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 693895168. 
Throughput: 0: 11104.7. Samples: 173530624. Policy #0 lag: (min: 9.0, avg: 118.4, max: 265.0) [2024-06-15 15:53:10,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 15:53:14,216][1653645] Updated weights for policy 0, policy_version 338864 (0.0013) [2024-06-15 15:53:15,827][1653645] Updated weights for policy 0, policy_version 338928 (0.0055) [2024-06-15 15:53:15,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 45329.3, 300 sec: 44653.3). Total num frames: 694124544. Throughput: 0: 11264.0. Samples: 173596160. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 15:53:17,189][1653645] Updated weights for policy 0, policy_version 338993 (0.0013) [2024-06-15 15:53:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 694288384. Throughput: 0: 11127.5. Samples: 173627392. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:53:21,752][1653645] Updated weights for policy 0, policy_version 339040 (0.0025) [2024-06-15 15:53:25,000][1653645] Updated weights for policy 0, policy_version 339091 (0.0016) [2024-06-15 15:53:25,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 694550528. Throughput: 0: 11218.5. Samples: 173703680. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:53:26,591][1653645] Updated weights for policy 0, policy_version 339156 (0.0015) [2024-06-15 15:53:28,109][1653645] Updated weights for policy 0, policy_version 339218 (0.0019) [2024-06-15 15:53:29,105][1653645] Updated weights for policy 0, policy_version 339264 (0.0011) [2024-06-15 15:53:30,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 694812672. Throughput: 0: 11172.9. Samples: 173765120. 
Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:53:33,967][1653645] Updated weights for policy 0, policy_version 339328 (0.0010) [2024-06-15 15:53:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 694943744. Throughput: 0: 11047.8. Samples: 173797376. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:35,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 15:53:37,728][1653645] Updated weights for policy 0, policy_version 339388 (0.0099) [2024-06-15 15:53:40,040][1653645] Updated weights for policy 0, policy_version 339457 (0.0013) [2024-06-15 15:53:40,670][1651596] Signal inference workers to stop experience collection... (17550 times) [2024-06-15 15:53:40,724][1653645] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-06-15 15:53:40,905][1651596] Signal inference workers to resume experience collection... (17550 times) [2024-06-15 15:53:40,906][1653645] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-06-15 15:53:40,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 695304192. Throughput: 0: 11286.9. Samples: 173862912. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:40,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 15:53:41,169][1653645] Updated weights for policy 0, policy_version 339518 (0.0013) [2024-06-15 15:53:45,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 695435264. Throughput: 0: 11036.5. Samples: 173928960. 
Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:53:48,492][1653645] Updated weights for policy 0, policy_version 339589 (0.0011) [2024-06-15 15:53:50,564][1653645] Updated weights for policy 0, policy_version 339652 (0.0010) [2024-06-15 15:53:50,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 695631872. Throughput: 0: 11207.1. Samples: 173964800. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:53:51,630][1653645] Updated weights for policy 0, policy_version 339709 (0.0012) [2024-06-15 15:53:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 695861248. Throughput: 0: 11047.8. Samples: 174027776. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:53:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:53:56,785][1653645] Updated weights for policy 0, policy_version 339781 (0.0084) [2024-06-15 15:53:58,041][1653645] Updated weights for policy 0, policy_version 339837 (0.0014) [2024-06-15 15:54:00,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 696090624. Throughput: 0: 11241.2. Samples: 174102016. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:54:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:54:01,084][1653645] Updated weights for policy 0, policy_version 339896 (0.0019) [2024-06-15 15:54:02,744][1653645] Updated weights for policy 0, policy_version 339961 (0.0012) [2024-06-15 15:54:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 696385536. Throughput: 0: 11161.6. Samples: 174129664. 
Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:54:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:54:08,584][1653645] Updated weights for policy 0, policy_version 340048 (0.0021) [2024-06-15 15:54:09,455][1653645] Updated weights for policy 0, policy_version 340092 (0.0040) [2024-06-15 15:54:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44210.4). Total num frames: 696516608. Throughput: 0: 10990.9. Samples: 174198272. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:54:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:54:13,166][1653645] Updated weights for policy 0, policy_version 340144 (0.0013) [2024-06-15 15:54:14,835][1653645] Updated weights for policy 0, policy_version 340220 (0.0084) [2024-06-15 15:54:15,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 696844288. Throughput: 0: 11093.4. Samples: 174264320. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:54:15,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:54:16,209][1653645] Updated weights for policy 0, policy_version 340284 (0.0013) [2024-06-15 15:54:20,475][1653645] Updated weights for policy 0, policy_version 340336 (0.0016) [2024-06-15 15:54:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 697040896. Throughput: 0: 11150.2. Samples: 174299136. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:54:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:54:23,726][1653645] Updated weights for policy 0, policy_version 340388 (0.0025) [2024-06-15 15:54:25,653][1653645] Updated weights for policy 0, policy_version 340480 (0.0012) [2024-06-15 15:54:25,958][1648982] Fps is (10 sec: 45873.9, 60 sec: 45875.1, 300 sec: 44764.6). Total num frames: 697303040. Throughput: 0: 11400.5. Samples: 174375936. 
Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:54:25,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:54:26,248][1651596] Signal inference workers to stop experience collection... (17600 times) [2024-06-15 15:54:26,341][1653645] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-06-15 15:54:26,484][1651596] Signal inference workers to resume experience collection... (17600 times) [2024-06-15 15:54:26,485][1653645] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-06-15 15:54:27,749][1653645] Updated weights for policy 0, policy_version 340544 (0.0195) [2024-06-15 15:54:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 697434112. Throughput: 0: 11275.4. Samples: 174436352. Policy #0 lag: (min: 54.0, avg: 130.0, max: 310.0) [2024-06-15 15:54:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:54:32,758][1653645] Updated weights for policy 0, policy_version 340600 (0.0020) [2024-06-15 15:54:35,958][1648982] Fps is (10 sec: 32768.8, 60 sec: 44783.0, 300 sec: 44209.1). Total num frames: 697630720. Throughput: 0: 11252.6. Samples: 174471168. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:54:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:54:36,197][1653645] Updated weights for policy 0, policy_version 340659 (0.0023) [2024-06-15 15:54:37,332][1653645] Updated weights for policy 0, policy_version 340720 (0.0010) [2024-06-15 15:54:40,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 697958400. Throughput: 0: 11264.0. Samples: 174534656. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:54:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:54:43,826][1653645] Updated weights for policy 0, policy_version 340818 (0.0107) [2024-06-15 15:54:45,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 44431.2). 
Total num frames: 698089472. Throughput: 0: 11104.7. Samples: 174601728. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:54:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 15:54:47,195][1653645] Updated weights for policy 0, policy_version 340866 (0.0014) [2024-06-15 15:54:49,959][1653645] Updated weights for policy 0, policy_version 340976 (0.0013) [2024-06-15 15:54:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.3, 300 sec: 44542.3). Total num frames: 698384384. Throughput: 0: 11264.0. Samples: 174636544. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:54:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:54:51,812][1653645] Updated weights for policy 0, policy_version 341040 (0.0012) [2024-06-15 15:54:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 698482688. Throughput: 0: 10990.9. Samples: 174692864. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:54:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:54:56,384][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000341088_698548224.pth... [2024-06-15 15:54:56,536][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000335920_687964160.pth [2024-06-15 15:54:56,843][1653645] Updated weights for policy 0, policy_version 341104 (0.0014) [2024-06-15 15:55:00,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 698712064. Throughput: 0: 11104.7. Samples: 174764032. 
Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:00,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:55:01,334][1653645] Updated weights for policy 0, policy_version 341184 (0.0014) [2024-06-15 15:55:02,845][1653645] Updated weights for policy 0, policy_version 341236 (0.0030) [2024-06-15 15:55:04,606][1653645] Updated weights for policy 0, policy_version 341306 (0.0013) [2024-06-15 15:55:05,963][1648982] Fps is (10 sec: 52401.0, 60 sec: 43686.7, 300 sec: 44430.4). Total num frames: 699006976. Throughput: 0: 10819.0. Samples: 174786048. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:05,967][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:55:09,829][1653645] Updated weights for policy 0, policy_version 341365 (0.0013) [2024-06-15 15:55:10,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 44320.2). Total num frames: 699138048. Throughput: 0: 10729.3. Samples: 174858752. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:55:12,043][1653645] Updated weights for policy 0, policy_version 341408 (0.0011) [2024-06-15 15:55:12,599][1651596] Signal inference workers to stop experience collection... (17650 times) [2024-06-15 15:55:12,645][1653645] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-06-15 15:55:12,878][1651596] Signal inference workers to resume experience collection... (17650 times) [2024-06-15 15:55:12,878][1653645] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-06-15 15:55:13,600][1653645] Updated weights for policy 0, policy_version 341472 (0.0128) [2024-06-15 15:55:15,483][1653645] Updated weights for policy 0, policy_version 341552 (0.0016) [2024-06-15 15:55:15,958][1648982] Fps is (10 sec: 52456.9, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 699531264. Throughput: 0: 10717.9. Samples: 174918656. 
Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:55:20,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 41506.0, 300 sec: 43989.0). Total num frames: 699531264. Throughput: 0: 10774.7. Samples: 174956032. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:20,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:55:22,079][1653645] Updated weights for policy 0, policy_version 341628 (0.0019) [2024-06-15 15:55:24,092][1653645] Updated weights for policy 0, policy_version 341681 (0.0012) [2024-06-15 15:55:25,677][1653645] Updated weights for policy 0, policy_version 341731 (0.0021) [2024-06-15 15:55:25,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.7, 300 sec: 44431.2). Total num frames: 699891712. Throughput: 0: 10934.0. Samples: 175026688. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 15:55:27,573][1653645] Updated weights for policy 0, policy_version 341815 (0.0012) [2024-06-15 15:55:30,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 700055552. Throughput: 0: 10843.0. Samples: 175089664. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:30,970][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:55:33,495][1653645] Updated weights for policy 0, policy_version 341857 (0.0031) [2024-06-15 15:55:34,367][1653645] Updated weights for policy 0, policy_version 341888 (0.0025) [2024-06-15 15:55:35,638][1653645] Updated weights for policy 0, policy_version 341949 (0.0011) [2024-06-15 15:55:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 700317696. Throughput: 0: 10911.3. Samples: 175127552. 
Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:55:37,616][1653645] Updated weights for policy 0, policy_version 341992 (0.0017) [2024-06-15 15:55:39,467][1653645] Updated weights for policy 0, policy_version 342065 (0.0012) [2024-06-15 15:55:40,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 700579840. Throughput: 0: 10922.6. Samples: 175184384. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:55:44,659][1653645] Updated weights for policy 0, policy_version 342096 (0.0011) [2024-06-15 15:55:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 700710912. Throughput: 0: 11013.7. Samples: 175259648. Policy #0 lag: (min: 7.0, avg: 105.6, max: 263.0) [2024-06-15 15:55:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:55:46,324][1653645] Updated weights for policy 0, policy_version 342160 (0.0121) [2024-06-15 15:55:50,030][1653645] Updated weights for policy 0, policy_version 342240 (0.0024) [2024-06-15 15:55:50,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 700973056. Throughput: 0: 11128.8. Samples: 175286784. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:55:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:55:51,504][1653645] Updated weights for policy 0, policy_version 342304 (0.0023) [2024-06-15 15:55:55,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 701104128. Throughput: 0: 10899.9. Samples: 175349248. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:55:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:55:56,893][1651596] Signal inference workers to stop experience collection... 
(17700 times) [2024-06-15 15:55:56,948][1653645] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-06-15 15:55:57,144][1651596] Signal inference workers to resume experience collection... (17700 times) [2024-06-15 15:55:57,145][1653645] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-06-15 15:55:57,147][1653645] Updated weights for policy 0, policy_version 342368 (0.0065) [2024-06-15 15:55:58,576][1653645] Updated weights for policy 0, policy_version 342416 (0.0011) [2024-06-15 15:55:59,705][1653645] Updated weights for policy 0, policy_version 342460 (0.0011) [2024-06-15 15:56:00,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 701366272. Throughput: 0: 11116.1. Samples: 175418880. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:56:03,131][1653645] Updated weights for policy 0, policy_version 342547 (0.0012) [2024-06-15 15:56:03,983][1653645] Updated weights for policy 0, policy_version 342589 (0.0012) [2024-06-15 15:56:05,960][1648982] Fps is (10 sec: 52428.6, 60 sec: 43694.6, 300 sec: 44431.2). Total num frames: 701628416. Throughput: 0: 10911.3. Samples: 175447040. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:05,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:56:09,481][1653645] Updated weights for policy 0, policy_version 342641 (0.0013) [2024-06-15 15:56:10,840][1653645] Updated weights for policy 0, policy_version 342697 (0.0098) [2024-06-15 15:56:10,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 701825024. Throughput: 0: 10956.8. Samples: 175519744. 
Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:56:13,645][1653645] Updated weights for policy 0, policy_version 342742 (0.0014) [2024-06-15 15:56:14,737][1653645] Updated weights for policy 0, policy_version 342784 (0.0013) [2024-06-15 15:56:15,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43144.4, 300 sec: 44431.2). Total num frames: 702119936. Throughput: 0: 10934.0. Samples: 175581696. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:56:16,074][1653645] Updated weights for policy 0, policy_version 342839 (0.0014) [2024-06-15 15:56:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44783.2, 300 sec: 44209.1). Total num frames: 702218240. Throughput: 0: 10899.9. Samples: 175618048. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:56:21,099][1653645] Updated weights for policy 0, policy_version 342882 (0.0033) [2024-06-15 15:56:23,119][1653645] Updated weights for policy 0, policy_version 342964 (0.0014) [2024-06-15 15:56:25,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 42598.4, 300 sec: 44097.9). Total num frames: 702447616. Throughput: 0: 10991.0. Samples: 175678976. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:56:26,181][1653645] Updated weights for policy 0, policy_version 343010 (0.0012) [2024-06-15 15:56:28,033][1653645] Updated weights for policy 0, policy_version 343102 (0.0134) [2024-06-15 15:56:30,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 702676992. Throughput: 0: 10774.7. Samples: 175744512. 
Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:56:33,494][1653645] Updated weights for policy 0, policy_version 343164 (0.0103) [2024-06-15 15:56:35,127][1653645] Updated weights for policy 0, policy_version 343232 (0.0098) [2024-06-15 15:56:35,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 702939136. Throughput: 0: 10956.7. Samples: 175779840. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:35,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:56:38,029][1653645] Updated weights for policy 0, policy_version 343280 (0.0045) [2024-06-15 15:56:38,136][1651596] Signal inference workers to stop experience collection... (17750 times) [2024-06-15 15:56:38,183][1653645] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-06-15 15:56:38,373][1651596] Signal inference workers to resume experience collection... (17750 times) [2024-06-15 15:56:38,374][1653645] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-06-15 15:56:39,669][1653645] Updated weights for policy 0, policy_version 343355 (0.0011) [2024-06-15 15:56:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 703201280. Throughput: 0: 11059.2. Samples: 175846912. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 15:56:44,837][1653645] Updated weights for policy 0, policy_version 343408 (0.0023) [2024-06-15 15:56:45,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 703365120. Throughput: 0: 11059.2. Samples: 175916544. 
Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:56:46,480][1653645] Updated weights for policy 0, policy_version 343485 (0.0012) [2024-06-15 15:56:49,641][1653645] Updated weights for policy 0, policy_version 343536 (0.0122) [2024-06-15 15:56:50,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44782.7, 300 sec: 44209.7). Total num frames: 703660032. Throughput: 0: 11229.8. Samples: 175952384. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:56:51,241][1653645] Updated weights for policy 0, policy_version 343604 (0.0038) [2024-06-15 15:56:55,958][1648982] Fps is (10 sec: 39319.4, 60 sec: 44236.4, 300 sec: 44097.9). Total num frames: 703758336. Throughput: 0: 11059.1. Samples: 176017408. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:56:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:56:56,479][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000343664_703823872.pth... [2024-06-15 15:56:56,530][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000338496_693239808.pth [2024-06-15 15:56:56,664][1653645] Updated weights for policy 0, policy_version 343671 (0.0053) [2024-06-15 15:56:58,423][1653645] Updated weights for policy 0, policy_version 343740 (0.0013) [2024-06-15 15:57:00,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 704020480. Throughput: 0: 11127.5. Samples: 176082432. 
Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:57:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 15:57:01,698][1653645] Updated weights for policy 0, policy_version 343792 (0.0012) [2024-06-15 15:57:03,627][1653645] Updated weights for policy 0, policy_version 343865 (0.0012) [2024-06-15 15:57:05,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 43690.4, 300 sec: 44097.9). Total num frames: 704249856. Throughput: 0: 10820.2. Samples: 176104960. Policy #0 lag: (min: 5.0, avg: 108.3, max: 229.0) [2024-06-15 15:57:05,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:57:08,854][1653645] Updated weights for policy 0, policy_version 343922 (0.0102) [2024-06-15 15:57:09,572][1653645] Updated weights for policy 0, policy_version 343937 (0.0013) [2024-06-15 15:57:10,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 704512000. Throughput: 0: 11116.1. Samples: 176179200. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:57:12,886][1653645] Updated weights for policy 0, policy_version 344016 (0.0163) [2024-06-15 15:57:15,364][1653645] Updated weights for policy 0, policy_version 344105 (0.0013) [2024-06-15 15:57:15,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 44237.0, 300 sec: 44431.2). Total num frames: 704774144. Throughput: 0: 10899.9. Samples: 176235008. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:57:20,262][1653645] Updated weights for policy 0, policy_version 344160 (0.0010) [2024-06-15 15:57:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 704905216. Throughput: 0: 10979.6. Samples: 176273920. 
Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:57:23,034][1653645] Updated weights for policy 0, policy_version 344240 (0.0032) [2024-06-15 15:57:24,801][1651596] Signal inference workers to stop experience collection... (17800 times) [2024-06-15 15:57:24,821][1653645] Updated weights for policy 0, policy_version 344274 (0.0013) [2024-06-15 15:57:24,871][1653645] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-06-15 15:57:25,062][1651596] Signal inference workers to resume experience collection... (17800 times) [2024-06-15 15:57:25,063][1653645] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-06-15 15:57:25,960][1648982] Fps is (10 sec: 39321.0, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 705167360. Throughput: 0: 10956.8. Samples: 176339968. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:25,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 15:57:26,750][1653645] Updated weights for policy 0, policy_version 344336 (0.0014) [2024-06-15 15:57:27,706][1653645] Updated weights for policy 0, policy_version 344383 (0.0012) [2024-06-15 15:57:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 705298432. Throughput: 0: 10979.5. Samples: 176410624. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:57:32,719][1653645] Updated weights for policy 0, policy_version 344446 (0.0013) [2024-06-15 15:57:35,214][1653645] Updated weights for policy 0, policy_version 344505 (0.0012) [2024-06-15 15:57:35,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 705560576. Throughput: 0: 10865.7. Samples: 176441344. 
Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:35,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:57:37,086][1653645] Updated weights for policy 0, policy_version 344549 (0.0027) [2024-06-15 15:57:39,140][1653645] Updated weights for policy 0, policy_version 344624 (0.0019) [2024-06-15 15:57:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 705822720. Throughput: 0: 10740.7. Samples: 176500736. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:57:44,645][1653645] Updated weights for policy 0, policy_version 344695 (0.0013) [2024-06-15 15:57:45,960][1648982] Fps is (10 sec: 42599.6, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 705986560. Throughput: 0: 10979.6. Samples: 176576512. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:45,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:57:46,652][1653645] Updated weights for policy 0, policy_version 344752 (0.0013) [2024-06-15 15:57:49,124][1653645] Updated weights for policy 0, policy_version 344816 (0.0015) [2024-06-15 15:57:50,566][1653645] Updated weights for policy 0, policy_version 344867 (0.0012) [2024-06-15 15:57:50,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 44236.9, 300 sec: 44320.2). Total num frames: 706314240. Throughput: 0: 11138.9. Samples: 176606208. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:57:55,142][1653645] Updated weights for policy 0, policy_version 344902 (0.0014) [2024-06-15 15:57:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44237.1, 300 sec: 44209.0). Total num frames: 706412544. Throughput: 0: 11207.1. Samples: 176683520. 
Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:57:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:57:56,348][1653645] Updated weights for policy 0, policy_version 344957 (0.0019) [2024-06-15 15:57:58,421][1653645] Updated weights for policy 0, policy_version 345017 (0.0028) [2024-06-15 15:57:59,953][1653645] Updated weights for policy 0, policy_version 345057 (0.0013) [2024-06-15 15:58:00,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 706772992. Throughput: 0: 11332.2. Samples: 176744960. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:58:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:58:01,814][1653645] Updated weights for policy 0, policy_version 345136 (0.0016) [2024-06-15 15:58:05,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 43691.0, 300 sec: 43986.9). Total num frames: 706871296. Throughput: 0: 11161.6. Samples: 176776192. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:58:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:58:06,905][1653645] Updated weights for policy 0, policy_version 345168 (0.0010) [2024-06-15 15:58:09,808][1653645] Updated weights for policy 0, policy_version 345219 (0.0014) [2024-06-15 15:58:10,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43144.4, 300 sec: 43986.8). Total num frames: 707100672. Throughput: 0: 11195.7. Samples: 176843776. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:58:10,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:58:10,999][1653645] Updated weights for policy 0, policy_version 345277 (0.0012) [2024-06-15 15:58:12,269][1651596] Signal inference workers to stop experience collection... (17850 times) [2024-06-15 15:58:12,359][1653645] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-06-15 15:58:12,599][1651596] Signal inference workers to resume experience collection... 
(17850 times) [2024-06-15 15:58:12,601][1653645] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-06-15 15:58:12,603][1653645] Updated weights for policy 0, policy_version 345328 (0.0012) [2024-06-15 15:58:14,591][1653645] Updated weights for policy 0, policy_version 345405 (0.0093) [2024-06-15 15:58:15,958][1648982] Fps is (10 sec: 52424.1, 60 sec: 43690.0, 300 sec: 44431.1). Total num frames: 707395584. Throughput: 0: 10922.5. Samples: 176902144. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:58:15,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 15:58:20,876][1653645] Updated weights for policy 0, policy_version 345469 (0.0021) [2024-06-15 15:58:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 707526656. Throughput: 0: 11093.4. Samples: 176940544. Policy #0 lag: (min: 1.0, avg: 79.7, max: 257.0) [2024-06-15 15:58:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:58:22,866][1653645] Updated weights for policy 0, policy_version 345528 (0.0091) [2024-06-15 15:58:25,192][1653645] Updated weights for policy 0, policy_version 345569 (0.0013) [2024-06-15 15:58:25,957][1648982] Fps is (10 sec: 39325.3, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 707788800. Throughput: 0: 11161.6. Samples: 177003008. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:58:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:58:26,714][1653645] Updated weights for policy 0, policy_version 345632 (0.0010) [2024-06-15 15:58:30,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 707919872. Throughput: 0: 10956.8. Samples: 177069568. 
Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:58:30,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:58:31,826][1653645] Updated weights for policy 0, policy_version 345683 (0.0012) [2024-06-15 15:58:33,558][1653645] Updated weights for policy 0, policy_version 345744 (0.0122) [2024-06-15 15:58:35,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.9, 300 sec: 43653.6). Total num frames: 708182016. Throughput: 0: 10968.2. Samples: 177099776. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:58:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 15:58:37,298][1653645] Updated weights for policy 0, policy_version 345829 (0.0018) [2024-06-15 15:58:38,841][1653645] Updated weights for policy 0, policy_version 345892 (0.0014) [2024-06-15 15:58:40,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 708444160. Throughput: 0: 10615.5. Samples: 177161216. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:58:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:58:44,577][1653645] Updated weights for policy 0, policy_version 345952 (0.0014) [2024-06-15 15:58:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 43875.8). Total num frames: 708575232. Throughput: 0: 10729.3. Samples: 177227776. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:58:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 15:58:46,941][1653645] Updated weights for policy 0, policy_version 346032 (0.0015) [2024-06-15 15:58:49,362][1653645] Updated weights for policy 0, policy_version 346081 (0.0034) [2024-06-15 15:58:50,877][1653645] Updated weights for policy 0, policy_version 346144 (0.0019) [2024-06-15 15:58:50,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 708902912. Throughput: 0: 10786.1. Samples: 177261568. 
Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:58:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 15:58:55,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 43653.6). Total num frames: 708968448. Throughput: 0: 10774.8. Samples: 177328640. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:58:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 15:58:56,338][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000346192_709001216.pth... [2024-06-15 15:58:56,481][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000341088_698548224.pth [2024-06-15 15:58:56,777][1653645] Updated weights for policy 0, policy_version 346208 (0.0018) [2024-06-15 15:58:58,273][1653645] Updated weights for policy 0, policy_version 346259 (0.0013) [2024-06-15 15:58:59,272][1653645] Updated weights for policy 0, policy_version 346304 (0.0012) [2024-06-15 15:59:00,299][1651596] Signal inference workers to stop experience collection... (17900 times) [2024-06-15 15:59:00,363][1653645] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-06-15 15:59:00,597][1651596] Signal inference workers to resume experience collection... (17900 times) [2024-06-15 15:59:00,597][1653645] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-06-15 15:59:00,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 41506.2, 300 sec: 43653.6). Total num frames: 709263360. Throughput: 0: 10922.9. Samples: 177393664. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 15:59:01,949][1653645] Updated weights for policy 0, policy_version 346371 (0.0012) [2024-06-15 15:59:05,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.3, 300 sec: 43986.8). Total num frames: 709492736. Throughput: 0: 10638.2. Samples: 177419264. 
Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 15:59:08,351][1653645] Updated weights for policy 0, policy_version 346448 (0.0087) [2024-06-15 15:59:09,642][1653645] Updated weights for policy 0, policy_version 346496 (0.0041) [2024-06-15 15:59:10,957][1648982] Fps is (10 sec: 45876.0, 60 sec: 43690.9, 300 sec: 43653.7). Total num frames: 709722112. Throughput: 0: 10911.3. Samples: 177494016. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 15:59:10,983][1653645] Updated weights for policy 0, policy_version 346554 (0.0013) [2024-06-15 15:59:14,103][1653645] Updated weights for policy 0, policy_version 346624 (0.0012) [2024-06-15 15:59:15,330][1653645] Updated weights for policy 0, policy_version 346683 (0.0012) [2024-06-15 15:59:15,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43691.2, 300 sec: 43986.9). Total num frames: 710017024. Throughput: 0: 10752.0. Samples: 177553408. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 15:59:20,879][1653645] Updated weights for policy 0, policy_version 346743 (0.0017) [2024-06-15 15:59:20,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43144.6, 300 sec: 43431.5). Total num frames: 710115328. Throughput: 0: 11047.8. Samples: 177596928. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 15:59:22,295][1653645] Updated weights for policy 0, policy_version 346807 (0.0138) [2024-06-15 15:59:24,678][1653645] Updated weights for policy 0, policy_version 346838 (0.0011) [2024-06-15 15:59:25,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 710410240. Throughput: 0: 11047.8. Samples: 177658368. 
Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:25,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 15:59:26,097][1653645] Updated weights for policy 0, policy_version 346898 (0.0134) [2024-06-15 15:59:30,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.8, 300 sec: 43764.7). Total num frames: 710541312. Throughput: 0: 11002.3. Samples: 177722880. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:30,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 15:59:31,572][1653645] Updated weights for policy 0, policy_version 346946 (0.0011) [2024-06-15 15:59:34,150][1653645] Updated weights for policy 0, policy_version 347024 (0.0168) [2024-06-15 15:59:35,274][1653645] Updated weights for policy 0, policy_version 347069 (0.0014) [2024-06-15 15:59:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 710803456. Throughput: 0: 10956.8. Samples: 177754624. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:35,958][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 15:59:36,848][1653645] Updated weights for policy 0, policy_version 347124 (0.0015) [2024-06-15 15:59:38,757][1653645] Updated weights for policy 0, policy_version 347169 (0.0011) [2024-06-15 15:59:40,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 711065600. Throughput: 0: 10843.0. Samples: 177816576. Policy #0 lag: (min: 15.0, avg: 128.9, max: 271.0) [2024-06-15 15:59:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 15:59:44,116][1653645] Updated weights for policy 0, policy_version 347216 (0.0012) [2024-06-15 15:59:45,570][1651596] Signal inference workers to stop experience collection... (17950 times) [2024-06-15 15:59:45,620][1653645] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-06-15 15:59:45,833][1651596] Signal inference workers to resume experience collection... 
(17950 times) [2024-06-15 15:59:45,834][1653645] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-06-15 15:59:45,836][1653645] Updated weights for policy 0, policy_version 347280 (0.0015) [2024-06-15 15:59:45,964][1648982] Fps is (10 sec: 42570.4, 60 sec: 44231.9, 300 sec: 43541.6). Total num frames: 711229440. Throughput: 0: 10966.6. Samples: 177887232. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 15:59:45,965][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 15:59:47,125][1653645] Updated weights for policy 0, policy_version 347331 (0.0015) [2024-06-15 15:59:48,606][1653645] Updated weights for policy 0, policy_version 347390 (0.0027) [2024-06-15 15:59:50,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 711524352. Throughput: 0: 11047.9. Samples: 177916416. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 15:59:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 15:59:51,423][1653645] Updated weights for policy 0, policy_version 347456 (0.0012) [2024-06-15 15:59:55,958][1648982] Fps is (10 sec: 36066.9, 60 sec: 43690.5, 300 sec: 43653.6). Total num frames: 711589888. Throughput: 0: 10820.1. Samples: 177980928. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 15:59:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 15:59:57,499][1653645] Updated weights for policy 0, policy_version 347519 (0.0015) [2024-06-15 15:59:59,566][1653645] Updated weights for policy 0, policy_version 347581 (0.0012) [2024-06-15 16:00:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 43765.5). Total num frames: 711917568. Throughput: 0: 10831.6. Samples: 178040832. 
Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:00:01,450][1653645] Updated weights for policy 0, policy_version 347632 (0.0015) [2024-06-15 16:00:03,209][1653645] Updated weights for policy 0, policy_version 347680 (0.0015) [2024-06-15 16:00:05,958][1648982] Fps is (10 sec: 52430.9, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 712114176. Throughput: 0: 10490.3. Samples: 178068992. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:00:08,725][1653645] Updated weights for policy 0, policy_version 347744 (0.0013) [2024-06-15 16:00:10,250][1653645] Updated weights for policy 0, policy_version 347778 (0.0013) [2024-06-15 16:00:10,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43144.4, 300 sec: 43320.4). Total num frames: 712310784. Throughput: 0: 10729.3. Samples: 178141184. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:00:11,430][1653645] Updated weights for policy 0, policy_version 347834 (0.0015) [2024-06-15 16:00:13,876][1653645] Updated weights for policy 0, policy_version 347904 (0.0016) [2024-06-15 16:00:15,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 42598.2, 300 sec: 44209.0). Total num frames: 712572928. Throughput: 0: 10660.9. Samples: 178202624. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:15,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:00:16,364][1653645] Updated weights for policy 0, policy_version 347959 (0.0057) [2024-06-15 16:00:20,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 712671232. Throughput: 0: 10695.1. Samples: 178235904. 
Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:00:21,100][1653645] Updated weights for policy 0, policy_version 348000 (0.0013) [2024-06-15 16:00:23,365][1653645] Updated weights for policy 0, policy_version 348080 (0.0014) [2024-06-15 16:00:25,364][1653645] Updated weights for policy 0, policy_version 348144 (0.0016) [2024-06-15 16:00:25,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 713031680. Throughput: 0: 10740.6. Samples: 178299904. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:00:27,749][1653645] Updated weights for policy 0, policy_version 348194 (0.0013) [2024-06-15 16:00:30,958][1648982] Fps is (10 sec: 49149.0, 60 sec: 43690.3, 300 sec: 43542.5). Total num frames: 713162752. Throughput: 0: 10833.1. Samples: 178374656. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:30,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:00:33,111][1653645] Updated weights for policy 0, policy_version 348256 (0.0016) [2024-06-15 16:00:33,741][1651596] Signal inference workers to stop experience collection... (18000 times) [2024-06-15 16:00:33,813][1653645] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-06-15 16:00:34,082][1651596] Signal inference workers to resume experience collection... (18000 times) [2024-06-15 16:00:34,083][1653645] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-06-15 16:00:34,871][1653645] Updated weights for policy 0, policy_version 348320 (0.0087) [2024-06-15 16:00:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 713424896. Throughput: 0: 10899.9. Samples: 178406912. 
Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:00:36,623][1653645] Updated weights for policy 0, policy_version 348388 (0.0013) [2024-06-15 16:00:39,575][1653645] Updated weights for policy 0, policy_version 348448 (0.0013) [2024-06-15 16:00:40,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 713687040. Throughput: 0: 10797.5. Samples: 178466816. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:00:44,488][1653645] Updated weights for policy 0, policy_version 348496 (0.0018) [2024-06-15 16:00:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43149.2, 300 sec: 43542.5). Total num frames: 713818112. Throughput: 0: 11104.7. Samples: 178540544. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:00:45,995][1653645] Updated weights for policy 0, policy_version 348552 (0.0017) [2024-06-15 16:00:46,974][1653645] Updated weights for policy 0, policy_version 348600 (0.0014) [2024-06-15 16:00:48,532][1653645] Updated weights for policy 0, policy_version 348667 (0.0013) [2024-06-15 16:00:50,958][1648982] Fps is (10 sec: 45877.4, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 714145792. Throughput: 0: 11070.6. Samples: 178567168. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:00:51,242][1653645] Updated weights for policy 0, policy_version 348720 (0.0012) [2024-06-15 16:00:55,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 714244096. Throughput: 0: 11218.4. Samples: 178646016. 
Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:00:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:00:56,299][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000348784_714309632.pth... [2024-06-15 16:00:56,462][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000343664_703823872.pth [2024-06-15 16:00:56,791][1653645] Updated weights for policy 0, policy_version 348800 (0.0014) [2024-06-15 16:00:58,632][1653645] Updated weights for policy 0, policy_version 348864 (0.0177) [2024-06-15 16:01:00,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 714604544. Throughput: 0: 11047.9. Samples: 178699776. Policy #0 lag: (min: 15.0, avg: 95.0, max: 271.0) [2024-06-15 16:01:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:01:03,448][1653645] Updated weights for policy 0, policy_version 348961 (0.0013) [2024-06-15 16:01:04,227][1653645] Updated weights for policy 0, policy_version 348992 (0.0013) [2024-06-15 16:01:05,958][1648982] Fps is (10 sec: 49154.3, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 714735616. Throughput: 0: 11081.9. Samples: 178734592. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:01:09,287][1653645] Updated weights for policy 0, policy_version 349056 (0.0017) [2024-06-15 16:01:10,772][1653645] Updated weights for policy 0, policy_version 349120 (0.0011) [2024-06-15 16:01:10,958][1648982] Fps is (10 sec: 39319.8, 60 sec: 44782.5, 300 sec: 43653.6). Total num frames: 714997760. Throughput: 0: 11241.1. Samples: 178805760. 
Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:01:14,529][1653645] Updated weights for policy 0, policy_version 349201 (0.0014) [2024-06-15 16:01:14,925][1651596] Signal inference workers to stop experience collection... (18050 times) [2024-06-15 16:01:14,982][1653645] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-06-15 16:01:15,238][1651596] Signal inference workers to resume experience collection... (18050 times) [2024-06-15 16:01:15,246][1653645] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-06-15 16:01:15,681][1653645] Updated weights for policy 0, policy_version 349248 (0.0013) [2024-06-15 16:01:15,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44783.2, 300 sec: 44209.0). Total num frames: 715259904. Throughput: 0: 10979.7. Samples: 178868736. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:01:20,614][1653645] Updated weights for policy 0, policy_version 349298 (0.0157) [2024-06-15 16:01:20,958][1648982] Fps is (10 sec: 39323.6, 60 sec: 45329.0, 300 sec: 43875.8). Total num frames: 715390976. Throughput: 0: 11104.7. Samples: 178906624. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:01:22,103][1653645] Updated weights for policy 0, policy_version 349376 (0.0011) [2024-06-15 16:01:24,259][1653645] Updated weights for policy 0, policy_version 349440 (0.0012) [2024-06-15 16:01:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 715653120. Throughput: 0: 11070.7. Samples: 178964992. 
Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:01:27,435][1653645] Updated weights for policy 0, policy_version 349500 (0.0125) [2024-06-15 16:01:30,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.9, 300 sec: 43542.6). Total num frames: 715784192. Throughput: 0: 11002.3. Samples: 179035648. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:30,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:01:32,886][1653645] Updated weights for policy 0, policy_version 349552 (0.0011) [2024-06-15 16:01:34,536][1653645] Updated weights for policy 0, policy_version 349619 (0.0091) [2024-06-15 16:01:35,957][1648982] Fps is (10 sec: 45875.8, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 716111872. Throughput: 0: 11138.8. Samples: 179068416. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:35,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:01:36,328][1653645] Updated weights for policy 0, policy_version 349689 (0.0014) [2024-06-15 16:01:40,328][1653645] Updated weights for policy 0, policy_version 349754 (0.0012) [2024-06-15 16:01:40,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 716308480. Throughput: 0: 10695.2. Samples: 179127296. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:01:45,103][1653645] Updated weights for policy 0, policy_version 349810 (0.0011) [2024-06-15 16:01:45,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 716439552. Throughput: 0: 10990.9. Samples: 179194368. 
Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:01:46,342][1653645] Updated weights for policy 0, policy_version 349856 (0.0022) [2024-06-15 16:01:48,685][1653645] Updated weights for policy 0, policy_version 349950 (0.0012) [2024-06-15 16:01:50,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 43875.9). Total num frames: 716701696. Throughput: 0: 10740.6. Samples: 179217920. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:01:53,000][1653645] Updated weights for policy 0, policy_version 350013 (0.0016) [2024-06-15 16:01:55,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.8, 300 sec: 43431.5). Total num frames: 716832768. Throughput: 0: 10683.8. Samples: 179286528. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:01:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:01:58,553][1653645] Updated weights for policy 0, policy_version 350081 (0.0013) [2024-06-15 16:02:00,810][1651596] Signal inference workers to stop experience collection... (18100 times) [2024-06-15 16:02:00,841][1653645] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-06-15 16:02:00,960][1648982] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 43764.8). Total num frames: 717160448. Throughput: 0: 10524.4. Samples: 179342336. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:02:00,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:02:01,030][1651596] Signal inference workers to resume experience collection... 
(18100 times) [2024-06-15 16:02:01,031][1653645] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-06-15 16:02:01,233][1653645] Updated weights for policy 0, policy_version 350201 (0.0016) [2024-06-15 16:02:05,107][1653645] Updated weights for policy 0, policy_version 350256 (0.0019) [2024-06-15 16:02:05,962][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 717357056. Throughput: 0: 10444.8. Samples: 179376640. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:02:05,962][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:02:09,286][1653645] Updated weights for policy 0, policy_version 350288 (0.0013) [2024-06-15 16:02:10,959][1648982] Fps is (10 sec: 36040.6, 60 sec: 42051.8, 300 sec: 43209.1). Total num frames: 717520896. Throughput: 0: 10763.1. Samples: 179449344. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:02:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:02:11,152][1653645] Updated weights for policy 0, policy_version 350368 (0.0024) [2024-06-15 16:02:12,493][1653645] Updated weights for policy 0, policy_version 350419 (0.0014) [2024-06-15 16:02:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 43542.6). Total num frames: 717750272. Throughput: 0: 10524.5. Samples: 179509248. Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:02:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:02:16,780][1653645] Updated weights for policy 0, policy_version 350480 (0.0123) [2024-06-15 16:02:17,762][1653645] Updated weights for policy 0, policy_version 350527 (0.0014) [2024-06-15 16:02:20,958][1648982] Fps is (10 sec: 36048.4, 60 sec: 41506.0, 300 sec: 43098.2). Total num frames: 717881344. Throughput: 0: 10535.7. Samples: 179542528. 
Policy #0 lag: (min: 15.0, avg: 142.7, max: 271.0) [2024-06-15 16:02:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:02:22,925][1653645] Updated weights for policy 0, policy_version 350597 (0.0012) [2024-06-15 16:02:24,403][1653645] Updated weights for policy 0, policy_version 350659 (0.0070) [2024-06-15 16:02:25,649][1653645] Updated weights for policy 0, policy_version 350710 (0.0011) [2024-06-15 16:02:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 718274560. Throughput: 0: 10604.1. Samples: 179604480. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:02:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:02:29,790][1653645] Updated weights for policy 0, policy_version 350756 (0.0012) [2024-06-15 16:02:30,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 718405632. Throughput: 0: 10535.8. Samples: 179668480. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:02:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:02:34,024][1653645] Updated weights for policy 0, policy_version 350816 (0.0131) [2024-06-15 16:02:35,600][1653645] Updated weights for policy 0, policy_version 350880 (0.0129) [2024-06-15 16:02:35,960][1648982] Fps is (10 sec: 32759.8, 60 sec: 41504.3, 300 sec: 43320.0). Total num frames: 718602240. Throughput: 0: 10956.2. Samples: 179710976. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:02:35,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:02:40,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 43431.5). Total num frames: 718798848. Throughput: 0: 10649.6. Samples: 179765760. 
Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:02:40,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:02:41,151][1653645] Updated weights for policy 0, policy_version 350992 (0.0113) [2024-06-15 16:02:45,958][1648982] Fps is (10 sec: 36053.6, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 718962688. Throughput: 0: 10945.4. Samples: 179834880. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:02:45,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:02:46,110][1653645] Updated weights for policy 0, policy_version 351057 (0.0013) [2024-06-15 16:02:47,805][1651596] Signal inference workers to stop experience collection... (18150 times) [2024-06-15 16:02:47,850][1653645] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-06-15 16:02:48,072][1651596] Signal inference workers to resume experience collection... (18150 times) [2024-06-15 16:02:48,074][1653645] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-06-15 16:02:48,684][1653645] Updated weights for policy 0, policy_version 351158 (0.0012) [2024-06-15 16:02:49,860][1653645] Updated weights for policy 0, policy_version 351216 (0.0012) [2024-06-15 16:02:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 719323136. Throughput: 0: 10695.1. Samples: 179857920. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:02:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:02:54,582][1653645] Updated weights for policy 0, policy_version 351264 (0.0013) [2024-06-15 16:02:55,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 719454208. Throughput: 0: 10592.9. Samples: 179926016. 
Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:02:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:02:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000351296_719454208.pth... [2024-06-15 16:02:56,034][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000346192_709001216.pth [2024-06-15 16:02:59,207][1653645] Updated weights for policy 0, policy_version 351345 (0.0013) [2024-06-15 16:03:00,582][1653645] Updated weights for policy 0, policy_version 351408 (0.0011) [2024-06-15 16:03:00,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 719683584. Throughput: 0: 10683.8. Samples: 179990016. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:03:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:03:02,007][1653645] Updated weights for policy 0, policy_version 351459 (0.0012) [2024-06-15 16:03:05,967][1648982] Fps is (10 sec: 39288.6, 60 sec: 41500.2, 300 sec: 43208.1). Total num frames: 719847424. Throughput: 0: 10659.0. Samples: 180022272. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:03:05,969][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:03:06,129][1653645] Updated weights for policy 0, policy_version 351505 (0.0015) [2024-06-15 16:03:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42599.3, 300 sec: 42987.3). Total num frames: 720076800. Throughput: 0: 10899.9. Samples: 180094976. 
Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:03:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:03:11,083][1653645] Updated weights for policy 0, policy_version 351604 (0.0015) [2024-06-15 16:03:12,911][1653645] Updated weights for policy 0, policy_version 351680 (0.0025) [2024-06-15 16:03:14,234][1653645] Updated weights for policy 0, policy_version 351736 (0.0014) [2024-06-15 16:03:15,958][1648982] Fps is (10 sec: 52473.8, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 720371712. Throughput: 0: 10729.2. Samples: 180151296. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:03:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:03:19,208][1653645] Updated weights for policy 0, policy_version 351780 (0.0020) [2024-06-15 16:03:20,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 720502784. Throughput: 0: 10559.1. Samples: 180186112. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:03:20,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:03:22,384][1653645] Updated weights for policy 0, policy_version 351818 (0.0013) [2024-06-15 16:03:23,943][1653645] Updated weights for policy 0, policy_version 351875 (0.0010) [2024-06-15 16:03:25,794][1653645] Updated weights for policy 0, policy_version 351952 (0.0013) [2024-06-15 16:03:25,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 43653.7). Total num frames: 720797696. Throughput: 0: 10786.1. Samples: 180251136. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:03:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:03:30,940][1653645] Updated weights for policy 0, policy_version 352001 (0.0013) [2024-06-15 16:03:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 41506.0, 300 sec: 43098.2). Total num frames: 720896000. Throughput: 0: 10672.3. Samples: 180315136. 
Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:03:30,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:03:31,739][1651596] Signal inference workers to stop experience collection... (18200 times) [2024-06-15 16:03:31,785][1653645] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-06-15 16:03:32,037][1651596] Signal inference workers to resume experience collection... (18200 times) [2024-06-15 16:03:32,038][1653645] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-06-15 16:03:32,415][1653645] Updated weights for policy 0, policy_version 352063 (0.0012) [2024-06-15 16:03:35,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 42054.0, 300 sec: 42987.2). Total num frames: 721125376. Throughput: 0: 10888.5. Samples: 180347904. Policy #0 lag: (min: 15.0, avg: 85.8, max: 271.0) [2024-06-15 16:03:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:03:36,071][1653645] Updated weights for policy 0, policy_version 352128 (0.0117) [2024-06-15 16:03:38,380][1653645] Updated weights for policy 0, policy_version 352224 (0.0111) [2024-06-15 16:03:40,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 721420288. Throughput: 0: 10558.6. Samples: 180401152. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:03:40,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:03:43,472][1653645] Updated weights for policy 0, policy_version 352304 (0.0014) [2024-06-15 16:03:45,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 721551360. Throughput: 0: 10820.3. Samples: 180476928. 
Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:03:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:03:46,791][1653645] Updated weights for policy 0, policy_version 352336 (0.0013) [2024-06-15 16:03:48,771][1653645] Updated weights for policy 0, policy_version 352400 (0.0014) [2024-06-15 16:03:50,059][1653645] Updated weights for policy 0, policy_version 352454 (0.0012) [2024-06-15 16:03:50,958][1648982] Fps is (10 sec: 49153.1, 60 sec: 43144.6, 300 sec: 43875.8). Total num frames: 721911808. Throughput: 0: 10754.1. Samples: 180506112. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:03:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:03:51,085][1653645] Updated weights for policy 0, policy_version 352512 (0.0013) [2024-06-15 16:03:54,843][1653645] Updated weights for policy 0, policy_version 352576 (0.0012) [2024-06-15 16:03:55,958][1648982] Fps is (10 sec: 52426.7, 60 sec: 43690.6, 300 sec: 43431.4). Total num frames: 722075648. Throughput: 0: 10717.8. Samples: 180577280. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:03:55,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:03:59,511][1653645] Updated weights for policy 0, policy_version 352638 (0.0013) [2024-06-15 16:04:00,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43144.5, 300 sec: 43320.5). Total num frames: 722272256. Throughput: 0: 10888.5. Samples: 180641280. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:04:01,720][1653645] Updated weights for policy 0, policy_version 352720 (0.0013) [2024-06-15 16:04:05,068][1653645] Updated weights for policy 0, policy_version 352770 (0.0012) [2024-06-15 16:04:05,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 44789.4, 300 sec: 43431.5). Total num frames: 722534400. Throughput: 0: 10877.2. Samples: 180675584. 
Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:04:06,409][1653645] Updated weights for policy 0, policy_version 352828 (0.0025) [2024-06-15 16:04:10,339][1653645] Updated weights for policy 0, policy_version 352890 (0.0016) [2024-06-15 16:04:10,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 43098.3). Total num frames: 722731008. Throughput: 0: 11150.2. Samples: 180752896. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:04:13,140][1653645] Updated weights for policy 0, policy_version 352976 (0.0015) [2024-06-15 16:04:13,648][1651596] Signal inference workers to stop experience collection... (18250 times) [2024-06-15 16:04:13,707][1653645] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-06-15 16:04:13,884][1651596] Signal inference workers to resume experience collection... (18250 times) [2024-06-15 16:04:13,885][1653645] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-06-15 16:04:14,088][1653645] Updated weights for policy 0, policy_version 353020 (0.0015) [2024-06-15 16:04:15,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 722993152. Throughput: 0: 11173.0. Samples: 180817920. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:04:17,690][1653645] Updated weights for policy 0, policy_version 353079 (0.0013) [2024-06-15 16:04:20,834][1653645] Updated weights for policy 0, policy_version 353123 (0.0036) [2024-06-15 16:04:20,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.2, 300 sec: 43320.4). Total num frames: 723189760. Throughput: 0: 11218.5. Samples: 180852736. 
Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:04:24,466][1653645] Updated weights for policy 0, policy_version 353216 (0.0015) [2024-06-15 16:04:25,842][1653645] Updated weights for policy 0, policy_version 353280 (0.0014) [2024-06-15 16:04:25,958][1648982] Fps is (10 sec: 52425.4, 60 sec: 45328.5, 300 sec: 43986.8). Total num frames: 723517440. Throughput: 0: 11491.4. Samples: 180918272. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:25,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:04:29,125][1653645] Updated weights for policy 0, policy_version 353342 (0.0015) [2024-06-15 16:04:30,958][1648982] Fps is (10 sec: 45872.9, 60 sec: 45875.1, 300 sec: 43542.5). Total num frames: 723648512. Throughput: 0: 11514.2. Samples: 180995072. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:04:32,903][1653645] Updated weights for policy 0, policy_version 353402 (0.0019) [2024-06-15 16:04:35,982][1648982] Fps is (10 sec: 42500.1, 60 sec: 46948.8, 300 sec: 43650.1). Total num frames: 723943424. Throughput: 0: 11667.4. Samples: 181031424. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:35,982][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:04:35,988][1653645] Updated weights for policy 0, policy_version 353489 (0.0041) [2024-06-15 16:04:36,672][1653645] Updated weights for policy 0, policy_version 353534 (0.0013) [2024-06-15 16:04:40,150][1653645] Updated weights for policy 0, policy_version 353584 (0.0013) [2024-06-15 16:04:40,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45875.1, 300 sec: 43876.7). Total num frames: 724172800. Throughput: 0: 11514.3. Samples: 181095424. 
Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:04:44,320][1653645] Updated weights for policy 0, policy_version 353654 (0.0015) [2024-06-15 16:04:45,958][1648982] Fps is (10 sec: 39415.8, 60 sec: 46421.2, 300 sec: 43431.5). Total num frames: 724336640. Throughput: 0: 11662.2. Samples: 181166080. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:04:46,494][1653645] Updated weights for policy 0, policy_version 353698 (0.0013) [2024-06-15 16:04:48,103][1653645] Updated weights for policy 0, policy_version 353787 (0.0095) [2024-06-15 16:04:50,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 724566016. Throughput: 0: 11480.1. Samples: 181192192. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:04:52,150][1653645] Updated weights for policy 0, policy_version 353851 (0.0039) [2024-06-15 16:04:55,211][1653645] Updated weights for policy 0, policy_version 353890 (0.0012) [2024-06-15 16:04:55,959][1648982] Fps is (10 sec: 49145.8, 60 sec: 45874.5, 300 sec: 43764.5). Total num frames: 724828160. Throughput: 0: 11423.0. Samples: 181266944. Policy #0 lag: (min: 121.0, avg: 233.5, max: 408.0) [2024-06-15 16:04:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:04:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000353920_724828160.pth... [2024-06-15 16:04:56,030][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000348784_714309632.pth [2024-06-15 16:04:57,596][1653645] Updated weights for policy 0, policy_version 353936 (0.0013) [2024-06-15 16:04:58,763][1651596] Signal inference workers to stop experience collection... 
(18300 times) [2024-06-15 16:04:58,825][1653645] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-06-15 16:04:58,966][1651596] Signal inference workers to resume experience collection... (18300 times) [2024-06-15 16:04:58,968][1653645] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-06-15 16:04:58,970][1653645] Updated weights for policy 0, policy_version 354000 (0.0013) [2024-06-15 16:05:00,127][1653645] Updated weights for policy 0, policy_version 354046 (0.0012) [2024-06-15 16:05:00,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 46967.5, 300 sec: 43986.9). Total num frames: 725090304. Throughput: 0: 11377.8. Samples: 181329920. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:05:03,818][1653645] Updated weights for policy 0, policy_version 354107 (0.0013) [2024-06-15 16:05:05,958][1648982] Fps is (10 sec: 42604.1, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 725254144. Throughput: 0: 11502.9. Samples: 181370368. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:05,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:05:06,748][1653645] Updated weights for policy 0, policy_version 354169 (0.0016) [2024-06-15 16:05:08,553][1653645] Updated weights for policy 0, policy_version 354201 (0.0013) [2024-06-15 16:05:10,125][1653645] Updated weights for policy 0, policy_version 354274 (0.0127) [2024-06-15 16:05:10,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 48059.7, 300 sec: 44209.1). Total num frames: 725614592. Throughput: 0: 11514.5. Samples: 181436416. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:05:14,256][1653645] Updated weights for policy 0, policy_version 354336 (0.0014) [2024-06-15 16:05:15,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45875.3, 300 sec: 44320.1). Total num frames: 725745664. 
Throughput: 0: 11514.4. Samples: 181513216. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:05:16,699][1653645] Updated weights for policy 0, policy_version 354370 (0.0011) [2024-06-15 16:05:19,145][1653645] Updated weights for policy 0, policy_version 354439 (0.0013) [2024-06-15 16:05:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 47513.5, 300 sec: 44097.9). Total num frames: 726040576. Throughput: 0: 11543.2. Samples: 181550592. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:05:21,661][1653645] Updated weights for policy 0, policy_version 354544 (0.0014) [2024-06-15 16:05:25,155][1653645] Updated weights for policy 0, policy_version 354566 (0.0040) [2024-06-15 16:05:25,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44783.5, 300 sec: 44209.1). Total num frames: 726204416. Throughput: 0: 11537.1. Samples: 181614592. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:05:28,469][1653645] Updated weights for policy 0, policy_version 354642 (0.0013) [2024-06-15 16:05:30,276][1653645] Updated weights for policy 0, policy_version 354692 (0.0032) [2024-06-15 16:05:30,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 46967.8, 300 sec: 44209.0). Total num frames: 726466560. Throughput: 0: 11616.7. Samples: 181688832. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:05:31,522][1653645] Updated weights for policy 0, policy_version 354749 (0.0012) [2024-06-15 16:05:32,923][1653645] Updated weights for policy 0, policy_version 354811 (0.0029) [2024-06-15 16:05:35,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 45347.1, 300 sec: 43986.9). Total num frames: 726663168. Throughput: 0: 11639.5. Samples: 181715968. 
Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:05:37,710][1653645] Updated weights for policy 0, policy_version 354878 (0.0013) [2024-06-15 16:05:40,554][1653645] Updated weights for policy 0, policy_version 354936 (0.0094) [2024-06-15 16:05:40,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.5, 300 sec: 44431.2). Total num frames: 726925312. Throughput: 0: 11730.8. Samples: 181794816. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:05:42,427][1651596] Signal inference workers to stop experience collection... (18350 times) [2024-06-15 16:05:42,495][1653645] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-06-15 16:05:42,657][1651596] Signal inference workers to resume experience collection... (18350 times) [2024-06-15 16:05:42,658][1653645] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-06-15 16:05:42,852][1653645] Updated weights for policy 0, policy_version 355001 (0.0122) [2024-06-15 16:05:44,801][1653645] Updated weights for policy 0, policy_version 355064 (0.0013) [2024-06-15 16:05:45,985][1648982] Fps is (10 sec: 52286.2, 60 sec: 47492.0, 300 sec: 44204.9). Total num frames: 727187456. Throughput: 0: 11530.1. Samples: 181849088. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:45,986][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:05:49,002][1653645] Updated weights for policy 0, policy_version 355120 (0.0013) [2024-06-15 16:05:50,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.3, 300 sec: 44320.2). Total num frames: 727318528. Throughput: 0: 11582.6. Samples: 181891584. 
Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:05:51,918][1653645] Updated weights for policy 0, policy_version 355168 (0.0015) [2024-06-15 16:05:53,465][1653645] Updated weights for policy 0, policy_version 355232 (0.0014) [2024-06-15 16:05:55,281][1653645] Updated weights for policy 0, policy_version 355268 (0.0011) [2024-06-15 16:05:55,958][1648982] Fps is (10 sec: 42714.1, 60 sec: 46422.2, 300 sec: 44097.9). Total num frames: 727613440. Throughput: 0: 11571.2. Samples: 181957120. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:05:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:05:59,818][1653645] Updated weights for policy 0, policy_version 355334 (0.0013) [2024-06-15 16:06:00,853][1653645] Updated weights for policy 0, policy_version 355392 (0.0014) [2024-06-15 16:06:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 727842816. Throughput: 0: 11332.3. Samples: 182023168. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:06:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:06:05,690][1653645] Updated weights for policy 0, policy_version 355488 (0.0033) [2024-06-15 16:06:05,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 46421.3, 300 sec: 44209.1). Total num frames: 728039424. Throughput: 0: 11309.5. Samples: 182059520. Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:06:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:06:06,538][1653645] Updated weights for policy 0, policy_version 355520 (0.0012) [2024-06-15 16:06:08,170][1653645] Updated weights for policy 0, policy_version 355583 (0.0014) [2024-06-15 16:06:10,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 728236032. Throughput: 0: 11298.1. Samples: 182123008. 
Policy #0 lag: (min: 15.0, avg: 112.3, max: 271.0) [2024-06-15 16:06:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:06:15,145][1653645] Updated weights for policy 0, policy_version 355649 (0.0014) [2024-06-15 16:06:15,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 728399872. Throughput: 0: 11184.3. Samples: 182192128. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:15,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:06:17,609][1653645] Updated weights for policy 0, policy_version 355732 (0.0012) [2024-06-15 16:06:19,955][1653645] Updated weights for policy 0, policy_version 355808 (0.0015) [2024-06-15 16:06:20,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 728760320. Throughput: 0: 11093.3. Samples: 182215168. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:06:25,419][1653645] Updated weights for policy 0, policy_version 355904 (0.0014) [2024-06-15 16:06:25,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 728891392. Throughput: 0: 10865.8. Samples: 182283776. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:06:29,148][1653645] Updated weights for policy 0, policy_version 355965 (0.0014) [2024-06-15 16:06:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 729153536. Throughput: 0: 10940.7. Samples: 182341120. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:30,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:06:31,655][1651596] Signal inference workers to stop experience collection... 
(18400 times) [2024-06-15 16:06:31,679][1653645] Updated weights for policy 0, policy_version 356034 (0.0064) [2024-06-15 16:06:31,719][1653645] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-06-15 16:06:31,891][1651596] Signal inference workers to resume experience collection... (18400 times) [2024-06-15 16:06:31,892][1653645] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-06-15 16:06:35,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 729284608. Throughput: 0: 10706.5. Samples: 182373376. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:06:37,713][1653645] Updated weights for policy 0, policy_version 356130 (0.0069) [2024-06-15 16:06:40,218][1653645] Updated weights for policy 0, policy_version 356176 (0.0012) [2024-06-15 16:06:40,968][1648982] Fps is (10 sec: 32768.1, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 729481216. Throughput: 0: 10854.4. Samples: 182445568. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:40,973][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:06:42,809][1653645] Updated weights for policy 0, policy_version 356283 (0.0011) [2024-06-15 16:06:45,124][1653645] Updated weights for policy 0, policy_version 356350 (0.0013) [2024-06-15 16:06:45,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43710.5, 300 sec: 44431.2). Total num frames: 729808896. Throughput: 0: 10695.1. Samples: 182504448. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:06:50,191][1653645] Updated weights for policy 0, policy_version 356401 (0.0142) [2024-06-15 16:06:50,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 729939968. Throughput: 0: 10740.6. Samples: 182542848. 
Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:06:52,136][1653645] Updated weights for policy 0, policy_version 356448 (0.0021) [2024-06-15 16:06:54,044][1653645] Updated weights for policy 0, policy_version 356541 (0.0096) [2024-06-15 16:06:55,959][1648982] Fps is (10 sec: 45872.4, 60 sec: 44236.4, 300 sec: 44431.1). Total num frames: 730267648. Throughput: 0: 10751.9. Samples: 182606848. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:06:55,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:06:56,228][1653645] Updated weights for policy 0, policy_version 356592 (0.0012) [2024-06-15 16:06:56,239][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000356592_730300416.pth... [2024-06-15 16:06:56,280][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000351296_719454208.pth [2024-06-15 16:07:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 730333184. Throughput: 0: 10900.0. Samples: 182682624. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:07:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:07:01,361][1653645] Updated weights for policy 0, policy_version 356630 (0.0012) [2024-06-15 16:07:03,035][1653645] Updated weights for policy 0, policy_version 356704 (0.0012) [2024-06-15 16:07:05,023][1653645] Updated weights for policy 0, policy_version 356790 (0.0084) [2024-06-15 16:07:05,958][1648982] Fps is (10 sec: 45878.6, 60 sec: 44783.0, 300 sec: 44764.6). Total num frames: 730726400. Throughput: 0: 11025.1. Samples: 182711296. 
Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:07:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:07:07,961][1653645] Updated weights for policy 0, policy_version 356848 (0.0037) [2024-06-15 16:07:10,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 730857472. Throughput: 0: 10911.2. Samples: 182774784. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:07:10,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:07:14,939][1653645] Updated weights for policy 0, policy_version 356928 (0.0035) [2024-06-15 16:07:15,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 731054080. Throughput: 0: 11116.1. Samples: 182841344. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:07:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:07:16,163][1651596] Signal inference workers to stop experience collection... (18450 times) [2024-06-15 16:07:16,263][1653645] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-06-15 16:07:16,399][1651596] Signal inference workers to resume experience collection... (18450 times) [2024-06-15 16:07:16,400][1653645] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-06-15 16:07:16,402][1653645] Updated weights for policy 0, policy_version 356992 (0.0012) [2024-06-15 16:07:19,444][1653645] Updated weights for policy 0, policy_version 357063 (0.0011) [2024-06-15 16:07:20,769][1653645] Updated weights for policy 0, policy_version 357116 (0.0012) [2024-06-15 16:07:20,958][1648982] Fps is (10 sec: 52430.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 731381760. Throughput: 0: 11047.8. Samples: 182870528. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:07:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:07:25,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 44431.2). 
Total num frames: 731512832. Throughput: 0: 11161.6. Samples: 182947840. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:07:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:07:27,079][1653645] Updated weights for policy 0, policy_version 357202 (0.0012) [2024-06-15 16:07:28,625][1653645] Updated weights for policy 0, policy_version 357280 (0.0013) [2024-06-15 16:07:30,966][1648982] Fps is (10 sec: 39288.6, 60 sec: 43684.6, 300 sec: 44652.4). Total num frames: 731774976. Throughput: 0: 11239.2. Samples: 183010304. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:07:30,967][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:07:31,414][1653645] Updated weights for policy 0, policy_version 357331 (0.0025) [2024-06-15 16:07:35,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 731906048. Throughput: 0: 11161.6. Samples: 183045120. Policy #0 lag: (min: 10.0, avg: 101.5, max: 266.0) [2024-06-15 16:07:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:07:36,320][1653645] Updated weights for policy 0, policy_version 357377 (0.0014) [2024-06-15 16:07:37,840][1653645] Updated weights for policy 0, policy_version 357456 (0.0061) [2024-06-15 16:07:39,552][1653645] Updated weights for policy 0, policy_version 357521 (0.0086) [2024-06-15 16:07:40,958][1648982] Fps is (10 sec: 52472.3, 60 sec: 46967.4, 300 sec: 45208.7). Total num frames: 732299264. Throughput: 0: 11139.0. Samples: 183108096. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:07:40,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:07:44,140][1653645] Updated weights for policy 0, policy_version 357602 (0.0013) [2024-06-15 16:07:45,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 732430336. Throughput: 0: 11013.7. Samples: 183178240. 
Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:07:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:07:48,364][1653645] Updated weights for policy 0, policy_version 357637 (0.0032) [2024-06-15 16:07:49,689][1653645] Updated weights for policy 0, policy_version 357699 (0.0097) [2024-06-15 16:07:50,878][1653645] Updated weights for policy 0, policy_version 357760 (0.0013) [2024-06-15 16:07:50,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 732692480. Throughput: 0: 11286.7. Samples: 183219200. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:07:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:07:52,275][1653645] Updated weights for policy 0, policy_version 357824 (0.0013) [2024-06-15 16:07:55,862][1653645] Updated weights for policy 0, policy_version 357881 (0.0098) [2024-06-15 16:07:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44783.5, 300 sec: 44986.6). Total num frames: 732954624. Throughput: 0: 11332.4. Samples: 183284736. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:07:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:08:00,511][1651596] Signal inference workers to stop experience collection... (18500 times) [2024-06-15 16:08:00,547][1653645] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-06-15 16:08:00,789][1651596] Signal inference workers to resume experience collection... (18500 times) [2024-06-15 16:08:00,791][1653645] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-06-15 16:08:00,923][1653645] Updated weights for policy 0, policy_version 357936 (0.0143) [2024-06-15 16:08:00,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 45328.9, 300 sec: 44765.7). Total num frames: 733052928. Throughput: 0: 11423.3. Samples: 183355392. 
Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:08:02,757][1653645] Updated weights for policy 0, policy_version 358020 (0.0227) [2024-06-15 16:08:05,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 44986.6). Total num frames: 733347840. Throughput: 0: 11241.2. Samples: 183376384. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:08:07,005][1653645] Updated weights for policy 0, policy_version 358081 (0.0013) [2024-06-15 16:08:10,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 733478912. Throughput: 0: 11138.8. Samples: 183449088. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:08:11,814][1653645] Updated weights for policy 0, policy_version 358151 (0.0019) [2024-06-15 16:08:14,115][1653645] Updated weights for policy 0, policy_version 358242 (0.0138) [2024-06-15 16:08:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 46421.3, 300 sec: 45208.8). Total num frames: 733839360. Throughput: 0: 10913.3. Samples: 183501312. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:15,959][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 16:08:16,236][1653645] Updated weights for policy 0, policy_version 358331 (0.0012) [2024-06-15 16:08:19,960][1653645] Updated weights for policy 0, policy_version 358371 (0.0016) [2024-06-15 16:08:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 734003200. Throughput: 0: 11025.1. Samples: 183541248. 
Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:08:24,541][1653645] Updated weights for policy 0, policy_version 358432 (0.0011) [2024-06-15 16:08:25,960][1648982] Fps is (10 sec: 36037.3, 60 sec: 44781.3, 300 sec: 45097.4). Total num frames: 734199808. Throughput: 0: 11240.7. Samples: 183613952. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:25,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:08:26,053][1653645] Updated weights for policy 0, policy_version 358498 (0.0099) [2024-06-15 16:08:27,300][1653645] Updated weights for policy 0, policy_version 358548 (0.0109) [2024-06-15 16:08:28,205][1653645] Updated weights for policy 0, policy_version 358589 (0.0011) [2024-06-15 16:08:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44789.2, 300 sec: 45208.7). Total num frames: 734461952. Throughput: 0: 11184.3. Samples: 183681536. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:08:31,190][1653645] Updated weights for policy 0, policy_version 358640 (0.0013) [2024-06-15 16:08:35,958][1648982] Fps is (10 sec: 39330.2, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 734593024. Throughput: 0: 11127.4. Samples: 183719936. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:08:36,673][1653645] Updated weights for policy 0, policy_version 358723 (0.0011) [2024-06-15 16:08:38,468][1653645] Updated weights for policy 0, policy_version 358787 (0.0012) [2024-06-15 16:08:39,445][1651596] Signal inference workers to stop experience collection... (18550 times) [2024-06-15 16:08:39,483][1653645] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-06-15 16:08:39,676][1651596] Signal inference workers to resume experience collection... 
(18550 times) [2024-06-15 16:08:39,677][1653645] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-06-15 16:08:39,918][1653645] Updated weights for policy 0, policy_version 358843 (0.0013) [2024-06-15 16:08:40,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 734920704. Throughput: 0: 10854.3. Samples: 183773184. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:08:42,511][1653645] Updated weights for policy 0, policy_version 358896 (0.0012) [2024-06-15 16:08:45,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 735051776. Throughput: 0: 11116.2. Samples: 183855616. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:08:48,541][1653645] Updated weights for policy 0, policy_version 358960 (0.0039) [2024-06-15 16:08:50,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.4, 300 sec: 44875.5). Total num frames: 735313920. Throughput: 0: 11411.8. Samples: 183889920. Policy #0 lag: (min: 12.0, avg: 83.9, max: 268.0) [2024-06-15 16:08:50,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:08:51,025][1653645] Updated weights for policy 0, policy_version 359044 (0.0012) [2024-06-15 16:08:52,564][1653645] Updated weights for policy 0, policy_version 359097 (0.0011) [2024-06-15 16:08:54,802][1653645] Updated weights for policy 0, policy_version 359152 (0.0013) [2024-06-15 16:08:55,958][1648982] Fps is (10 sec: 52426.7, 60 sec: 43690.3, 300 sec: 45097.6). Total num frames: 735576064. Throughput: 0: 10956.7. Samples: 183942144. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:08:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:08:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000359168_735576064.pth... 
[2024-06-15 16:08:56,046][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000353920_724828160.pth [2024-06-15 16:09:00,220][1653645] Updated weights for policy 0, policy_version 359200 (0.0013) [2024-06-15 16:09:00,958][1648982] Fps is (10 sec: 36045.8, 60 sec: 43690.8, 300 sec: 44542.2). Total num frames: 735674368. Throughput: 0: 11332.3. Samples: 184011264. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:00,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:09:02,092][1653645] Updated weights for policy 0, policy_version 359268 (0.0011) [2024-06-15 16:09:03,963][1653645] Updated weights for policy 0, policy_version 359344 (0.0011) [2024-06-15 16:09:05,957][1648982] Fps is (10 sec: 39323.8, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 735969280. Throughput: 0: 10922.7. Samples: 184032768. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:05,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 16:09:08,299][1653645] Updated weights for policy 0, policy_version 359416 (0.0017) [2024-06-15 16:09:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 736100352. Throughput: 0: 10809.4. Samples: 184100352. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:10,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:09:12,462][1653645] Updated weights for policy 0, policy_version 359479 (0.0027) [2024-06-15 16:09:14,258][1653645] Updated weights for policy 0, policy_version 359539 (0.0012) [2024-06-15 16:09:15,871][1653645] Updated weights for policy 0, policy_version 359613 (0.0011) [2024-06-15 16:09:15,958][1648982] Fps is (10 sec: 52423.9, 60 sec: 44236.4, 300 sec: 45097.5). Total num frames: 736493568. Throughput: 0: 10638.0. Samples: 184160256. 
Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:15,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 16:09:20,958][1648982] Fps is (10 sec: 49150.5, 60 sec: 43144.2, 300 sec: 44320.2). Total num frames: 736591872. Throughput: 0: 10547.1. Samples: 184194560. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:09:20,994][1653645] Updated weights for policy 0, policy_version 359679 (0.0013) [2024-06-15 16:09:24,892][1653645] Updated weights for policy 0, policy_version 359733 (0.0011) [2024-06-15 16:09:25,659][1651596] Signal inference workers to stop experience collection... (18600 times) [2024-06-15 16:09:25,712][1653645] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-06-15 16:09:25,888][1651596] Signal inference workers to resume experience collection... (18600 times) [2024-06-15 16:09:25,889][1653645] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-06-15 16:09:25,958][1648982] Fps is (10 sec: 32770.3, 60 sec: 43692.3, 300 sec: 44653.4). Total num frames: 736821248. Throughput: 0: 10854.5. Samples: 184261632. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:09:27,833][1653645] Updated weights for policy 0, policy_version 359824 (0.0012) [2024-06-15 16:09:28,629][1653645] Updated weights for policy 0, policy_version 359871 (0.0011) [2024-06-15 16:09:30,958][1648982] Fps is (10 sec: 42600.1, 60 sec: 42598.3, 300 sec: 44323.7). Total num frames: 737017856. Throughput: 0: 10456.2. Samples: 184326144. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:09:35,523][1653645] Updated weights for policy 0, policy_version 359952 (0.0012) [2024-06-15 16:09:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44209.1). 
Total num frames: 737214464. Throughput: 0: 10433.5. Samples: 184359424. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:35,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:09:37,136][1653645] Updated weights for policy 0, policy_version 360016 (0.0011) [2024-06-15 16:09:38,161][1653645] Updated weights for policy 0, policy_version 360062 (0.0011) [2024-06-15 16:09:40,171][1653645] Updated weights for policy 0, policy_version 360121 (0.0012) [2024-06-15 16:09:40,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 737542144. Throughput: 0: 10820.3. Samples: 184429056. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:09:45,416][1653645] Updated weights for policy 0, policy_version 360183 (0.0014) [2024-06-15 16:09:45,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 737673216. Throughput: 0: 10717.9. Samples: 184493568. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:09:47,620][1653645] Updated weights for policy 0, policy_version 360252 (0.0026) [2024-06-15 16:09:49,144][1653645] Updated weights for policy 0, policy_version 360304 (0.0013) [2024-06-15 16:09:50,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43690.9, 300 sec: 44431.4). Total num frames: 737935360. Throughput: 0: 10956.8. Samples: 184525824. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:09:52,108][1653645] Updated weights for policy 0, policy_version 360352 (0.0013) [2024-06-15 16:09:55,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 41506.4, 300 sec: 43986.9). Total num frames: 738066432. Throughput: 0: 10956.9. Samples: 184593408. 
Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:09:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:09:56,020][1653645] Updated weights for policy 0, policy_version 360400 (0.0012) [2024-06-15 16:09:58,533][1653645] Updated weights for policy 0, policy_version 360449 (0.0013) [2024-06-15 16:10:00,607][1653645] Updated weights for policy 0, policy_version 360514 (0.0014) [2024-06-15 16:10:00,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 738361344. Throughput: 0: 11070.7. Samples: 184658432. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:10:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:10:01,693][1653645] Updated weights for policy 0, policy_version 360576 (0.0012) [2024-06-15 16:10:04,760][1653645] Updated weights for policy 0, policy_version 360634 (0.0013) [2024-06-15 16:10:05,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 738590720. Throughput: 0: 11116.2. Samples: 184694784. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:10:05,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:10:08,196][1653645] Updated weights for policy 0, policy_version 360688 (0.0022) [2024-06-15 16:10:10,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 738754560. Throughput: 0: 11082.0. Samples: 184760320. Policy #0 lag: (min: 13.0, avg: 160.2, max: 269.0) [2024-06-15 16:10:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:10:11,656][1653645] Updated weights for policy 0, policy_version 360762 (0.0017) [2024-06-15 16:10:13,664][1651596] Signal inference workers to stop experience collection... (18650 times) [2024-06-15 16:10:13,721][1653645] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-06-15 16:10:13,944][1651596] Signal inference workers to resume experience collection... 
(18650 times) [2024-06-15 16:10:13,945][1653645] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-06-15 16:10:14,162][1653645] Updated weights for policy 0, policy_version 360828 (0.0022) [2024-06-15 16:10:15,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 43145.1, 300 sec: 44209.0). Total num frames: 739082240. Throughput: 0: 11002.3. Samples: 184821248. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:10:16,212][1653645] Updated weights for policy 0, policy_version 360890 (0.0046) [2024-06-15 16:10:20,161][1653645] Updated weights for policy 0, policy_version 360934 (0.0012) [2024-06-15 16:10:20,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44237.2, 300 sec: 44209.0). Total num frames: 739246080. Throughput: 0: 11025.1. Samples: 184855552. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:20,963][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:10:22,503][1653645] Updated weights for policy 0, policy_version 360976 (0.0013) [2024-06-15 16:10:24,159][1653645] Updated weights for policy 0, policy_version 361040 (0.0029) [2024-06-15 16:10:24,997][1653645] Updated weights for policy 0, policy_version 361081 (0.0012) [2024-06-15 16:10:25,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 739508224. Throughput: 0: 11082.0. Samples: 184927744. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:10:27,871][1653645] Updated weights for policy 0, policy_version 361145 (0.0014) [2024-06-15 16:10:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 739639296. Throughput: 0: 11161.6. Samples: 184995840. 
Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:30,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:10:32,273][1653645] Updated weights for policy 0, policy_version 361204 (0.0012) [2024-06-15 16:10:34,636][1653645] Updated weights for policy 0, policy_version 361264 (0.0012) [2024-06-15 16:10:35,832][1653645] Updated weights for policy 0, policy_version 361312 (0.0012) [2024-06-15 16:10:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 44209.0). Total num frames: 739966976. Throughput: 0: 11218.5. Samples: 185030656. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:10:38,841][1653645] Updated weights for policy 0, policy_version 361363 (0.0013) [2024-06-15 16:10:40,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.8, 300 sec: 43990.9). Total num frames: 740163584. Throughput: 0: 11116.1. Samples: 185093632. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:10:43,084][1653645] Updated weights for policy 0, policy_version 361428 (0.0021) [2024-06-15 16:10:45,506][1653645] Updated weights for policy 0, policy_version 361488 (0.0021) [2024-06-15 16:10:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 740360192. Throughput: 0: 11207.1. Samples: 185162752. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:10:47,798][1653645] Updated weights for policy 0, policy_version 361539 (0.0013) [2024-06-15 16:10:49,156][1653645] Updated weights for policy 0, policy_version 361599 (0.0013) [2024-06-15 16:10:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 740589568. Throughput: 0: 11002.3. Samples: 185189888. 
Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:10:51,763][1653645] Updated weights for policy 0, policy_version 361656 (0.0014) [2024-06-15 16:10:55,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 44236.6, 300 sec: 43653.6). Total num frames: 740720640. Throughput: 0: 11002.3. Samples: 185255424. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:10:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:10:56,344][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000361712_740786176.pth... [2024-06-15 16:10:56,383][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000356592_730300416.pth [2024-06-15 16:10:56,387][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000361712_740786176.pth [2024-06-15 16:10:56,489][1653645] Updated weights for policy 0, policy_version 361714 (0.0015) [2024-06-15 16:10:58,067][1653645] Updated weights for policy 0, policy_version 361765 (0.0013) [2024-06-15 16:11:00,250][1653645] Updated weights for policy 0, policy_version 361795 (0.0016) [2024-06-15 16:11:00,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 741015552. Throughput: 0: 11207.1. Samples: 185325568. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:11:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:11:01,407][1653645] Updated weights for policy 0, policy_version 361848 (0.0013) [2024-06-15 16:11:02,637][1651596] Signal inference workers to stop experience collection... (18700 times) [2024-06-15 16:11:02,671][1653645] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-06-15 16:11:02,870][1651596] Signal inference workers to resume experience collection... 
(18700 times) [2024-06-15 16:11:02,871][1653645] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-06-15 16:11:03,235][1653645] Updated weights for policy 0, policy_version 361904 (0.0039) [2024-06-15 16:11:05,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 741212160. Throughput: 0: 11138.8. Samples: 185356800. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:11:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:11:07,774][1653645] Updated weights for policy 0, policy_version 361968 (0.0076) [2024-06-15 16:11:10,079][1653645] Updated weights for policy 0, policy_version 362017 (0.0014) [2024-06-15 16:11:10,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 741474304. Throughput: 0: 11036.4. Samples: 185424384. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:11:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:11:11,653][1653645] Updated weights for policy 0, policy_version 362050 (0.0011) [2024-06-15 16:11:13,017][1653645] Updated weights for policy 0, policy_version 362110 (0.0012) [2024-06-15 16:11:15,552][1653645] Updated weights for policy 0, policy_version 362172 (0.0025) [2024-06-15 16:11:15,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 741736448. Throughput: 0: 10911.3. Samples: 185486848. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:11:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:11:19,790][1653645] Updated weights for policy 0, policy_version 362233 (0.0018) [2024-06-15 16:11:20,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44236.6, 300 sec: 44097.9). Total num frames: 741900288. Throughput: 0: 11013.6. Samples: 185526272. 
Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:11:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:11:21,901][1653645] Updated weights for policy 0, policy_version 362293 (0.0012) [2024-06-15 16:11:23,848][1653645] Updated weights for policy 0, policy_version 362336 (0.0013) [2024-06-15 16:11:25,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 742162432. Throughput: 0: 11070.6. Samples: 185591808. Policy #0 lag: (min: 63.0, avg: 160.5, max: 319.0) [2024-06-15 16:11:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:11:26,850][1653645] Updated weights for policy 0, policy_version 362432 (0.0015) [2024-06-15 16:11:30,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 742293504. Throughput: 0: 11059.2. Samples: 185660416. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:11:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:11:32,102][1653645] Updated weights for policy 0, policy_version 362496 (0.0019) [2024-06-15 16:11:34,536][1653645] Updated weights for policy 0, policy_version 362558 (0.0014) [2024-06-15 16:11:35,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 742522880. Throughput: 0: 11025.1. Samples: 185686016. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:11:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:11:37,677][1653645] Updated weights for policy 0, policy_version 362620 (0.0012) [2024-06-15 16:11:39,276][1653645] Updated weights for policy 0, policy_version 362685 (0.0015) [2024-06-15 16:11:40,958][1648982] Fps is (10 sec: 49150.2, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 742785024. Throughput: 0: 10945.4. Samples: 185747968. 
Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:11:40,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:11:43,482][1653645] Updated weights for policy 0, policy_version 362741 (0.0131) [2024-06-15 16:11:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 742981632. Throughput: 0: 10922.6. Samples: 185817088. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:11:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:11:46,060][1653645] Updated weights for policy 0, policy_version 362800 (0.0013) [2024-06-15 16:11:49,113][1653645] Updated weights for policy 0, policy_version 362864 (0.0012) [2024-06-15 16:11:50,958][1648982] Fps is (10 sec: 49153.4, 60 sec: 44782.9, 300 sec: 44098.0). Total num frames: 743276544. Throughput: 0: 11070.6. Samples: 185854976. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:11:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:11:51,044][1653645] Updated weights for policy 0, policy_version 362941 (0.0014) [2024-06-15 16:11:53,767][1651596] Signal inference workers to stop experience collection... (18750 times) [2024-06-15 16:11:53,896][1653645] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-06-15 16:11:54,033][1651596] Signal inference workers to resume experience collection... (18750 times) [2024-06-15 16:11:54,044][1653645] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-06-15 16:11:55,171][1653645] Updated weights for policy 0, policy_version 362997 (0.0022) [2024-06-15 16:11:55,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45329.2, 300 sec: 44431.2). Total num frames: 743440384. Throughput: 0: 10979.6. Samples: 185918464. 
Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:11:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:11:57,630][1653645] Updated weights for policy 0, policy_version 363040 (0.0012) [2024-06-15 16:12:00,387][1653645] Updated weights for policy 0, policy_version 363088 (0.0012) [2024-06-15 16:12:00,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 43690.5, 300 sec: 43764.7). Total num frames: 743636992. Throughput: 0: 11047.8. Samples: 185984000. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:00,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:12:02,918][1653645] Updated weights for policy 0, policy_version 363152 (0.0012) [2024-06-15 16:12:05,800][1653645] Updated weights for policy 0, policy_version 363201 (0.0013) [2024-06-15 16:12:05,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 743833600. Throughput: 0: 10763.4. Samples: 186010624. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:12:10,676][1653645] Updated weights for policy 0, policy_version 363296 (0.0014) [2024-06-15 16:12:10,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 744030208. Throughput: 0: 10740.6. Samples: 186075136. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:12:12,993][1653645] Updated weights for policy 0, policy_version 363344 (0.0011) [2024-06-15 16:12:14,634][1653645] Updated weights for policy 0, policy_version 363408 (0.0013) [2024-06-15 16:12:15,729][1653645] Updated weights for policy 0, policy_version 363453 (0.0013) [2024-06-15 16:12:15,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 744357888. Throughput: 0: 10649.6. Samples: 186139648. 
Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:15,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:12:18,387][1653645] Updated weights for policy 0, policy_version 363504 (0.0019) [2024-06-15 16:12:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43144.7, 300 sec: 43986.9). Total num frames: 744488960. Throughput: 0: 10888.5. Samples: 186176000. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:12:22,655][1653645] Updated weights for policy 0, policy_version 363569 (0.0114) [2024-06-15 16:12:24,972][1653645] Updated weights for policy 0, policy_version 363602 (0.0013) [2024-06-15 16:12:25,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43144.5, 300 sec: 43988.1). Total num frames: 744751104. Throughput: 0: 11161.7. Samples: 186250240. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:12:26,833][1653645] Updated weights for policy 0, policy_version 363680 (0.0012) [2024-06-15 16:12:29,188][1653645] Updated weights for policy 0, policy_version 363720 (0.0013) [2024-06-15 16:12:30,189][1653645] Updated weights for policy 0, policy_version 363776 (0.0013) [2024-06-15 16:12:30,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 745013248. Throughput: 0: 10956.8. Samples: 186310144. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:12:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 745144320. Throughput: 0: 10968.2. Samples: 186348544. 
Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:12:36,071][1653645] Updated weights for policy 0, policy_version 363842 (0.0014) [2024-06-15 16:12:38,546][1653645] Updated weights for policy 0, policy_version 363936 (0.0097) [2024-06-15 16:12:38,680][1651596] Signal inference workers to stop experience collection... (18800 times) [2024-06-15 16:12:38,732][1653645] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-06-15 16:12:38,968][1651596] Signal inference workers to resume experience collection... (18800 times) [2024-06-15 16:12:38,968][1653645] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-06-15 16:12:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 745406464. Throughput: 0: 10843.0. Samples: 186406400. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:12:41,717][1653645] Updated weights for policy 0, policy_version 363984 (0.0012) [2024-06-15 16:12:42,835][1653645] Updated weights for policy 0, policy_version 364030 (0.0010) [2024-06-15 16:12:45,893][1653645] Updated weights for policy 0, policy_version 364087 (0.0017) [2024-06-15 16:12:45,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 745635840. Throughput: 0: 10945.5. Samples: 186476544. Policy #0 lag: (min: 9.0, avg: 102.2, max: 265.0) [2024-06-15 16:12:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:12:49,158][1653645] Updated weights for policy 0, policy_version 364144 (0.0014) [2024-06-15 16:12:50,470][1653645] Updated weights for policy 0, policy_version 364194 (0.0017) [2024-06-15 16:12:50,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 745897984. Throughput: 0: 11275.4. Samples: 186518016. 
Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:12:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:12:53,109][1653645] Updated weights for policy 0, policy_version 364243 (0.0013) [2024-06-15 16:12:54,068][1653645] Updated weights for policy 0, policy_version 364282 (0.0016) [2024-06-15 16:12:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 746061824. Throughput: 0: 11241.3. Samples: 186580992. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:12:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:12:56,497][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000364320_746127360.pth... [2024-06-15 16:12:56,652][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000359168_735576064.pth [2024-06-15 16:12:59,587][1653645] Updated weights for policy 0, policy_version 364353 (0.0014) [2024-06-15 16:13:00,856][1653645] Updated weights for policy 0, policy_version 364414 (0.0043) [2024-06-15 16:13:00,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 746323968. Throughput: 0: 11423.3. Samples: 186653696. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:13:02,236][1653645] Updated weights for policy 0, policy_version 364473 (0.0021) [2024-06-15 16:13:04,596][1653645] Updated weights for policy 0, policy_version 364518 (0.0013) [2024-06-15 16:13:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 746586112. Throughput: 0: 11309.5. Samples: 186684928. 
Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:13:08,843][1653645] Updated weights for policy 0, policy_version 364592 (0.0012) [2024-06-15 16:13:10,959][1648982] Fps is (10 sec: 39317.4, 60 sec: 44782.1, 300 sec: 43653.5). Total num frames: 746717184. Throughput: 0: 11172.7. Samples: 186753024. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:10,962][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:13:11,879][1653645] Updated weights for policy 0, policy_version 364643 (0.0014) [2024-06-15 16:13:13,304][1653645] Updated weights for policy 0, policy_version 364688 (0.0012) [2024-06-15 16:13:15,480][1653645] Updated weights for policy 0, policy_version 364739 (0.0012) [2024-06-15 16:13:15,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 747012096. Throughput: 0: 11332.2. Samples: 186820096. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:13:16,662][1653645] Updated weights for policy 0, policy_version 364790 (0.0012) [2024-06-15 16:13:19,912][1653645] Updated weights for policy 0, policy_version 364833 (0.0013) [2024-06-15 16:13:20,958][1648982] Fps is (10 sec: 52435.0, 60 sec: 45875.2, 300 sec: 44209.4). Total num frames: 747241472. Throughput: 0: 11252.6. Samples: 186854912. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:13:22,772][1653645] Updated weights for policy 0, policy_version 364880 (0.0012) [2024-06-15 16:13:23,839][1653645] Updated weights for policy 0, policy_version 364928 (0.0012) [2024-06-15 16:13:25,087][1651596] Signal inference workers to stop experience collection... 
(18850 times) [2024-06-15 16:13:25,131][1653645] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-06-15 16:13:25,363][1651596] Signal inference workers to resume experience collection... (18850 times) [2024-06-15 16:13:25,364][1653645] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-06-15 16:13:25,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 45875.1, 300 sec: 44209.0). Total num frames: 747503616. Throughput: 0: 11548.4. Samples: 186926080. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:25,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:13:27,298][1653645] Updated weights for policy 0, policy_version 365024 (0.0110) [2024-06-15 16:13:30,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 747634688. Throughput: 0: 11411.9. Samples: 186990080. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:13:31,578][1653645] Updated weights for policy 0, policy_version 365076 (0.0087) [2024-06-15 16:13:32,676][1653645] Updated weights for policy 0, policy_version 365118 (0.0011) [2024-06-15 16:13:35,180][1653645] Updated weights for policy 0, policy_version 365168 (0.0015) [2024-06-15 16:13:35,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 747896832. Throughput: 0: 11195.7. Samples: 187021824. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:13:37,041][1653645] Updated weights for policy 0, policy_version 365207 (0.0013) [2024-06-15 16:13:38,062][1653645] Updated weights for policy 0, policy_version 365246 (0.0011) [2024-06-15 16:13:39,486][1653645] Updated weights for policy 0, policy_version 365302 (0.0069) [2024-06-15 16:13:40,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 748158976. 
Throughput: 0: 11150.2. Samples: 187082752. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:40,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:13:44,452][1653645] Updated weights for policy 0, policy_version 365370 (0.0014) [2024-06-15 16:13:45,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 748290048. Throughput: 0: 11161.6. Samples: 187155968. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:13:47,602][1653645] Updated weights for policy 0, policy_version 365429 (0.0014) [2024-06-15 16:13:49,447][1653645] Updated weights for policy 0, policy_version 365472 (0.0037) [2024-06-15 16:13:50,934][1653645] Updated weights for policy 0, policy_version 365525 (0.0013) [2024-06-15 16:13:50,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44782.9, 300 sec: 44098.0). Total num frames: 748584960. Throughput: 0: 11127.5. Samples: 187185664. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:13:55,687][1653645] Updated weights for policy 0, policy_version 365600 (0.0015) [2024-06-15 16:13:55,963][1648982] Fps is (10 sec: 45851.0, 60 sec: 44778.9, 300 sec: 44319.3). Total num frames: 748748800. Throughput: 0: 11194.7. Samples: 187256832. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:13:55,965][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:13:58,244][1653645] Updated weights for policy 0, policy_version 365664 (0.0014) [2024-06-15 16:14:00,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 748945408. Throughput: 0: 11093.3. Samples: 187319296. 
Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:14:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:14:01,359][1653645] Updated weights for policy 0, policy_version 365716 (0.0011) [2024-06-15 16:14:03,012][1653645] Updated weights for policy 0, policy_version 365808 (0.0014) [2024-06-15 16:14:05,958][1648982] Fps is (10 sec: 45899.4, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 749207552. Throughput: 0: 10990.9. Samples: 187349504. Policy #0 lag: (min: 22.0, avg: 117.1, max: 278.0) [2024-06-15 16:14:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:14:07,837][1653645] Updated weights for policy 0, policy_version 365884 (0.0012) [2024-06-15 16:14:10,277][1653645] Updated weights for policy 0, policy_version 365942 (0.0012) [2024-06-15 16:14:10,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45876.1, 300 sec: 43987.0). Total num frames: 749469696. Throughput: 0: 11059.2. Samples: 187423744. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:14:13,095][1653645] Updated weights for policy 0, policy_version 365968 (0.0017) [2024-06-15 16:14:13,207][1651596] Signal inference workers to stop experience collection... (18900 times) [2024-06-15 16:14:13,236][1653645] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-06-15 16:14:13,370][1651596] Signal inference workers to resume experience collection... (18900 times) [2024-06-15 16:14:13,371][1653645] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-06-15 16:14:14,647][1653645] Updated weights for policy 0, policy_version 366037 (0.0011) [2024-06-15 16:14:15,756][1653645] Updated weights for policy 0, policy_version 366078 (0.0012) [2024-06-15 16:14:15,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45328.9, 300 sec: 44542.3). Total num frames: 749731840. Throughput: 0: 11025.0. Samples: 187486208. 
Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:14:19,380][1653645] Updated weights for policy 0, policy_version 366134 (0.0013) [2024-06-15 16:14:20,650][1653645] Updated weights for policy 0, policy_version 366164 (0.0012) [2024-06-15 16:14:20,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 44782.7, 300 sec: 44431.1). Total num frames: 749928448. Throughput: 0: 11309.4. Samples: 187530752. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:20,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:14:24,461][1653645] Updated weights for policy 0, policy_version 366224 (0.0032) [2024-06-15 16:14:25,566][1653645] Updated weights for policy 0, policy_version 366273 (0.0018) [2024-06-15 16:14:25,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 750157824. Throughput: 0: 11366.4. Samples: 187594240. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:14:29,793][1653645] Updated weights for policy 0, policy_version 366352 (0.0012) [2024-06-15 16:14:30,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 45875.1, 300 sec: 44653.4). Total num frames: 750387200. Throughput: 0: 11332.3. Samples: 187665920. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:14:31,804][1653645] Updated weights for policy 0, policy_version 366416 (0.0012) [2024-06-15 16:14:35,945][1653645] Updated weights for policy 0, policy_version 366467 (0.0042) [2024-06-15 16:14:35,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 750518272. Throughput: 0: 11286.8. Samples: 187693568. 
Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:14:37,456][1653645] Updated weights for policy 0, policy_version 366544 (0.0012) [2024-06-15 16:14:40,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.4, 300 sec: 44431.2). Total num frames: 750780416. Throughput: 0: 11151.5. Samples: 187758592. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:40,959][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 16:14:41,669][1653645] Updated weights for policy 0, policy_version 366608 (0.0011) [2024-06-15 16:14:42,909][1653645] Updated weights for policy 0, policy_version 366656 (0.0013) [2024-06-15 16:14:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 751042560. Throughput: 0: 11241.2. Samples: 187825152. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:14:48,302][1653645] Updated weights for policy 0, policy_version 366737 (0.0016) [2024-06-15 16:14:49,896][1653645] Updated weights for policy 0, policy_version 366803 (0.0096) [2024-06-15 16:14:50,699][1653645] Updated weights for policy 0, policy_version 366841 (0.0012) [2024-06-15 16:14:50,958][1648982] Fps is (10 sec: 52430.3, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 751304704. Throughput: 0: 11309.5. Samples: 187858432. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:14:53,750][1653645] Updated weights for policy 0, policy_version 366896 (0.0014) [2024-06-15 16:14:55,730][1651596] Signal inference workers to stop experience collection... (18950 times) [2024-06-15 16:14:55,810][1653645] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-06-15 16:14:55,916][1651596] Signal inference workers to resume experience collection... 
(18950 times) [2024-06-15 16:14:55,917][1653645] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-06-15 16:14:55,958][1648982] Fps is (10 sec: 49150.0, 60 sec: 46425.2, 300 sec: 44653.3). Total num frames: 751534080. Throughput: 0: 11241.1. Samples: 187929600. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:14:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:14:56,013][1653645] Updated weights for policy 0, policy_version 366966 (0.0012) [2024-06-15 16:14:56,196][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000366976_751566848.pth... [2024-06-15 16:14:56,251][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000361712_740786176.pth [2024-06-15 16:15:00,466][1653645] Updated weights for policy 0, policy_version 367024 (0.0013) [2024-06-15 16:15:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 751697920. Throughput: 0: 11355.1. Samples: 187997184. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:15:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:15:01,768][1653645] Updated weights for policy 0, policy_version 367072 (0.0012) [2024-06-15 16:15:04,654][1653645] Updated weights for policy 0, policy_version 367106 (0.0013) [2024-06-15 16:15:05,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 45329.1, 300 sec: 44653.3). Total num frames: 751927296. Throughput: 0: 11025.1. Samples: 188026880. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:15:05,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:15:06,143][1653645] Updated weights for policy 0, policy_version 367166 (0.0011) [2024-06-15 16:15:07,955][1653645] Updated weights for policy 0, policy_version 367229 (0.0024) [2024-06-15 16:15:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 752091136. Throughput: 0: 10990.9. Samples: 188088832. 
Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:15:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:15:14,040][1653645] Updated weights for policy 0, policy_version 367297 (0.0037) [2024-06-15 16:15:15,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 752353280. Throughput: 0: 10911.2. Samples: 188156928. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:15:15,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 16:15:16,429][1653645] Updated weights for policy 0, policy_version 367361 (0.0014) [2024-06-15 16:15:17,716][1653645] Updated weights for policy 0, policy_version 367419 (0.0012) [2024-06-15 16:15:19,293][1653645] Updated weights for policy 0, policy_version 367472 (0.0014) [2024-06-15 16:15:20,958][1648982] Fps is (10 sec: 52426.9, 60 sec: 44782.9, 300 sec: 44431.1). Total num frames: 752615424. Throughput: 0: 11093.2. Samples: 188192768. Policy #0 lag: (min: 15.0, avg: 109.5, max: 271.0) [2024-06-15 16:15:20,961][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:15:25,207][1653645] Updated weights for policy 0, policy_version 367536 (0.0082) [2024-06-15 16:15:25,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 752779264. Throughput: 0: 11264.1. Samples: 188265472. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:15:25,960][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:15:26,336][1653645] Updated weights for policy 0, policy_version 367600 (0.0013) [2024-06-15 16:15:28,278][1653645] Updated weights for policy 0, policy_version 367655 (0.0013) [2024-06-15 16:15:30,214][1653645] Updated weights for policy 0, policy_version 367712 (0.0088) [2024-06-15 16:15:30,892][1653645] Updated weights for policy 0, policy_version 367744 (0.0017) [2024-06-15 16:15:30,958][1648982] Fps is (10 sec: 52430.8, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 753139712. 
Throughput: 0: 11127.5. Samples: 188325888. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:15:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:15:35,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 753172480. Throughput: 0: 11207.1. Samples: 188362752. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:15:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:15:37,060][1653645] Updated weights for policy 0, policy_version 367824 (0.0103) [2024-06-15 16:15:38,849][1653645] Updated weights for policy 0, policy_version 367892 (0.0015) [2024-06-15 16:15:39,127][1651596] Signal inference workers to stop experience collection... (19000 times) [2024-06-15 16:15:39,231][1653645] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-06-15 16:15:39,448][1651596] Signal inference workers to resume experience collection... (19000 times) [2024-06-15 16:15:39,450][1653645] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-06-15 16:15:39,895][1653645] Updated weights for policy 0, policy_version 367936 (0.0090) [2024-06-15 16:15:40,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 753532928. Throughput: 0: 11150.3. Samples: 188431360. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:15:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:15:45,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 753664000. Throughput: 0: 11207.1. Samples: 188501504. 
Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:15:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:15:47,131][1653645] Updated weights for policy 0, policy_version 368004 (0.0023) [2024-06-15 16:15:48,815][1653645] Updated weights for policy 0, policy_version 368065 (0.0017) [2024-06-15 16:15:50,227][1653645] Updated weights for policy 0, policy_version 368130 (0.0012) [2024-06-15 16:15:50,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 753991680. Throughput: 0: 11332.3. Samples: 188536832. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:15:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:15:53,557][1653645] Updated weights for policy 0, policy_version 368195 (0.0012) [2024-06-15 16:15:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44237.0, 300 sec: 44653.3). Total num frames: 754188288. Throughput: 0: 11286.7. Samples: 188596736. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:15:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:15:58,711][1653645] Updated weights for policy 0, policy_version 368260 (0.0014) [2024-06-15 16:16:00,078][1653645] Updated weights for policy 0, policy_version 368316 (0.0011) [2024-06-15 16:16:00,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 754319360. Throughput: 0: 11275.4. Samples: 188664320. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:16:02,797][1653645] Updated weights for policy 0, policy_version 368400 (0.0012) [2024-06-15 16:16:03,579][1653645] Updated weights for policy 0, policy_version 368446 (0.0013) [2024-06-15 16:16:05,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 754581504. Throughput: 0: 11104.8. Samples: 188692480. 
Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:16:06,790][1653645] Updated weights for policy 0, policy_version 368496 (0.0013) [2024-06-15 16:16:10,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 754745344. Throughput: 0: 11070.6. Samples: 188763648. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:16:11,260][1653645] Updated weights for policy 0, policy_version 368544 (0.0011) [2024-06-15 16:16:13,681][1653645] Updated weights for policy 0, policy_version 368609 (0.0012) [2024-06-15 16:16:14,740][1653645] Updated weights for policy 0, policy_version 368662 (0.0012) [2024-06-15 16:16:15,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45875.0, 300 sec: 44764.4). Total num frames: 755105792. Throughput: 0: 11116.0. Samples: 188826112. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:15,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:16:17,866][1653645] Updated weights for policy 0, policy_version 368707 (0.0014) [2024-06-15 16:16:18,956][1653645] Updated weights for policy 0, policy_version 368768 (0.0042) [2024-06-15 16:16:20,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 755236864. Throughput: 0: 11218.5. Samples: 188867584. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:16:23,379][1653645] Updated weights for policy 0, policy_version 368831 (0.0140) [2024-06-15 16:16:25,153][1651596] Signal inference workers to stop experience collection... 
(19050 times) [2024-06-15 16:16:25,219][1653645] Updated weights for policy 0, policy_version 368897 (0.0013) [2024-06-15 16:16:25,257][1653645] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-06-15 16:16:25,414][1651596] Signal inference workers to resume experience collection... (19050 times) [2024-06-15 16:16:25,415][1653645] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-06-15 16:16:25,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 46421.1, 300 sec: 44986.5). Total num frames: 755564544. Throughput: 0: 11195.7. Samples: 188935168. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:25,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 16:16:26,229][1653645] Updated weights for policy 0, policy_version 368956 (0.0012) [2024-06-15 16:16:29,976][1653645] Updated weights for policy 0, policy_version 369017 (0.0013) [2024-06-15 16:16:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 755761152. Throughput: 0: 11104.7. Samples: 189001216. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:16:34,220][1653645] Updated weights for policy 0, policy_version 369059 (0.0011) [2024-06-15 16:16:35,389][1653645] Updated weights for policy 0, policy_version 369110 (0.0012) [2024-06-15 16:16:35,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 46967.5, 300 sec: 44764.5). Total num frames: 755990528. Throughput: 0: 11275.4. Samples: 189044224. Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:16:37,749][1653645] Updated weights for policy 0, policy_version 369208 (0.0025) [2024-06-15 16:16:40,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 756187136. Throughput: 0: 11298.1. Samples: 189105152. 
Policy #0 lag: (min: 31.0, avg: 110.5, max: 287.0) [2024-06-15 16:16:40,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:16:41,056][1653645] Updated weights for policy 0, policy_version 369248 (0.0013) [2024-06-15 16:16:45,162][1653645] Updated weights for policy 0, policy_version 369281 (0.0011) [2024-06-15 16:16:45,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 756350976. Throughput: 0: 11434.6. Samples: 189178880. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:16:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:16:46,511][1653645] Updated weights for policy 0, policy_version 369344 (0.0013) [2024-06-15 16:16:48,333][1653645] Updated weights for policy 0, policy_version 369424 (0.0014) [2024-06-15 16:16:49,290][1653645] Updated weights for policy 0, policy_version 369471 (0.0013) [2024-06-15 16:16:50,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 756678656. Throughput: 0: 11423.3. Samples: 189206528. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:16:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:16:53,246][1653645] Updated weights for policy 0, policy_version 369530 (0.0017) [2024-06-15 16:16:55,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 756809728. Throughput: 0: 11468.8. Samples: 189279744. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:16:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:16:55,968][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000369536_756809728.pth... 
[2024-06-15 16:16:56,118][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000364320_746127360.pth [2024-06-15 16:16:58,228][1653645] Updated weights for policy 0, policy_version 369616 (0.0013) [2024-06-15 16:17:00,188][1653645] Updated weights for policy 0, policy_version 369680 (0.0013) [2024-06-15 16:17:00,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 47513.5, 300 sec: 45208.7). Total num frames: 757170176. Throughput: 0: 11571.3. Samples: 189346816. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:00,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:17:01,160][1653645] Updated weights for policy 0, policy_version 369723 (0.0019) [2024-06-15 16:17:03,991][1653645] Updated weights for policy 0, policy_version 369791 (0.0017) [2024-06-15 16:17:05,959][1648982] Fps is (10 sec: 52429.9, 60 sec: 45875.3, 300 sec: 45097.7). Total num frames: 757334016. Throughput: 0: 11446.0. Samples: 189382656. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:05,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:17:10,061][1653645] Updated weights for policy 0, policy_version 369872 (0.0114) [2024-06-15 16:17:10,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 47513.6, 300 sec: 44875.5). Total num frames: 757596160. Throughput: 0: 11559.9. Samples: 189455360. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:10,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 16:17:10,969][1651596] Signal inference workers to stop experience collection... (19100 times) [2024-06-15 16:17:10,978][1653645] Updated weights for policy 0, policy_version 369921 (0.0013) [2024-06-15 16:17:11,048][1653645] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-06-15 16:17:11,140][1651596] Signal inference workers to resume experience collection... 
(19100 times) [2024-06-15 16:17:11,141][1653645] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-06-15 16:17:12,185][1653645] Updated weights for policy 0, policy_version 369982 (0.0011) [2024-06-15 16:17:15,605][1653645] Updated weights for policy 0, policy_version 370048 (0.0013) [2024-06-15 16:17:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.5, 300 sec: 45319.8). Total num frames: 757858304. Throughput: 0: 11491.6. Samples: 189518336. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:17:20,915][1653645] Updated weights for policy 0, policy_version 370128 (0.0013) [2024-06-15 16:17:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 44986.6). Total num frames: 758022144. Throughput: 0: 11571.2. Samples: 189564928. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:17:22,677][1653645] Updated weights for policy 0, policy_version 370195 (0.0015) [2024-06-15 16:17:23,575][1653645] Updated weights for policy 0, policy_version 370239 (0.0016) [2024-06-15 16:17:25,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44783.1, 300 sec: 44875.5). Total num frames: 758251520. Throughput: 0: 11503.0. Samples: 189622784. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:17:27,072][1653645] Updated weights for policy 0, policy_version 370288 (0.0013) [2024-06-15 16:17:30,788][1653645] Updated weights for policy 0, policy_version 370320 (0.0011) [2024-06-15 16:17:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 758415360. Throughput: 0: 11502.9. Samples: 189696512. 
Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:30,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 16:17:32,716][1653645] Updated weights for policy 0, policy_version 370374 (0.0014) [2024-06-15 16:17:34,726][1653645] Updated weights for policy 0, policy_version 370454 (0.0017) [2024-06-15 16:17:35,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 758775808. Throughput: 0: 11616.7. Samples: 189729280. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:17:37,561][1653645] Updated weights for policy 0, policy_version 370497 (0.0011) [2024-06-15 16:17:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.2, 300 sec: 44986.6). Total num frames: 758906880. Throughput: 0: 11229.9. Samples: 189785088. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:17:42,829][1653645] Updated weights for policy 0, policy_version 370576 (0.0012) [2024-06-15 16:17:43,720][1653645] Updated weights for policy 0, policy_version 370624 (0.0013) [2024-06-15 16:17:45,614][1653645] Updated weights for policy 0, policy_version 370691 (0.0121) [2024-06-15 16:17:45,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 47513.6, 300 sec: 45097.6). Total num frames: 759201792. Throughput: 0: 11434.7. Samples: 189861376. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:17:46,990][1653645] Updated weights for policy 0, policy_version 370746 (0.0013) [2024-06-15 16:17:50,640][1653645] Updated weights for policy 0, policy_version 370816 (0.0082) [2024-06-15 16:17:50,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 45874.9, 300 sec: 45319.8). Total num frames: 759431168. Throughput: 0: 11275.3. Samples: 189890048. 
Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:17:55,617][1653645] Updated weights for policy 0, policy_version 370872 (0.0014) [2024-06-15 16:17:55,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 759562240. Throughput: 0: 11229.8. Samples: 189960704. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:17:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:17:57,160][1651596] Signal inference workers to stop experience collection... (19150 times) [2024-06-15 16:17:57,279][1653645] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-06-15 16:17:57,470][1651596] Signal inference workers to resume experience collection... (19150 times) [2024-06-15 16:17:57,471][1653645] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-06-15 16:17:57,689][1653645] Updated weights for policy 0, policy_version 370930 (0.0011) [2024-06-15 16:17:59,523][1653645] Updated weights for policy 0, policy_version 370999 (0.0016) [2024-06-15 16:18:00,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 759824384. Throughput: 0: 11161.5. Samples: 190020608. Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:18:00,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:18:02,217][1653645] Updated weights for policy 0, policy_version 371044 (0.0012) [2024-06-15 16:18:05,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.5, 300 sec: 44875.6). Total num frames: 759955456. Throughput: 0: 10843.0. Samples: 190052864. 
Policy #0 lag: (min: 13.0, avg: 85.5, max: 269.0) [2024-06-15 16:18:05,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 16:18:06,904][1653645] Updated weights for policy 0, policy_version 371088 (0.0012) [2024-06-15 16:18:08,138][1653645] Updated weights for policy 0, policy_version 371135 (0.0010) [2024-06-15 16:18:10,180][1653645] Updated weights for policy 0, policy_version 371200 (0.0012) [2024-06-15 16:18:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 760250368. Throughput: 0: 11081.9. Samples: 190121472. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:18:11,671][1653645] Updated weights for policy 0, policy_version 371257 (0.0099) [2024-06-15 16:18:14,720][1653645] Updated weights for policy 0, policy_version 371321 (0.0159) [2024-06-15 16:18:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 760479744. Throughput: 0: 10683.7. Samples: 190177280. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:18:20,953][1653645] Updated weights for policy 0, policy_version 371377 (0.0013) [2024-06-15 16:18:20,958][1648982] Fps is (10 sec: 32768.8, 60 sec: 42598.5, 300 sec: 44320.1). Total num frames: 760578048. Throughput: 0: 10774.8. Samples: 190214144. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:18:22,182][1653645] Updated weights for policy 0, policy_version 371429 (0.0013) [2024-06-15 16:18:23,887][1653645] Updated weights for policy 0, policy_version 371490 (0.0011) [2024-06-15 16:18:25,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 43690.3, 300 sec: 44875.4). Total num frames: 760872960. Throughput: 0: 10888.4. Samples: 190275072. 
Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:18:26,758][1653645] Updated weights for policy 0, policy_version 371568 (0.0030) [2024-06-15 16:18:30,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 761004032. Throughput: 0: 10683.7. Samples: 190342144. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:18:33,375][1653645] Updated weights for policy 0, policy_version 371634 (0.0012) [2024-06-15 16:18:34,990][1653645] Updated weights for policy 0, policy_version 371696 (0.0013) [2024-06-15 16:18:35,958][1648982] Fps is (10 sec: 42601.1, 60 sec: 42052.2, 300 sec: 44542.3). Total num frames: 761298944. Throughput: 0: 10786.2. Samples: 190375424. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:18:36,591][1653645] Updated weights for policy 0, policy_version 371760 (0.0012) [2024-06-15 16:18:38,566][1653645] Updated weights for policy 0, policy_version 371781 (0.0012) [2024-06-15 16:18:40,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 761528320. Throughput: 0: 10558.7. Samples: 190435840. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:18:44,380][1651596] Signal inference workers to stop experience collection... (19200 times) [2024-06-15 16:18:44,428][1653645] Updated weights for policy 0, policy_version 371843 (0.0013) [2024-06-15 16:18:44,449][1653645] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-06-15 16:18:44,669][1651596] Signal inference workers to resume experience collection... 
(19200 times) [2024-06-15 16:18:44,670][1653645] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-06-15 16:18:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 40960.0, 300 sec: 44320.1). Total num frames: 761659392. Throughput: 0: 10752.1. Samples: 190504448. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:18:46,102][1653645] Updated weights for policy 0, policy_version 371907 (0.0097) [2024-06-15 16:18:47,637][1653645] Updated weights for policy 0, policy_version 371968 (0.0014) [2024-06-15 16:18:50,959][1648982] Fps is (10 sec: 39318.4, 60 sec: 41505.8, 300 sec: 44654.0). Total num frames: 761921536. Throughput: 0: 10558.4. Samples: 190528000. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:18:51,221][1653645] Updated weights for policy 0, policy_version 372048 (0.0013) [2024-06-15 16:18:52,385][1653645] Updated weights for policy 0, policy_version 372096 (0.0012) [2024-06-15 16:18:55,958][1648982] Fps is (10 sec: 39319.8, 60 sec: 41506.1, 300 sec: 44431.1). Total num frames: 762052608. Throughput: 0: 10615.4. Samples: 190599168. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:18:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:18:55,973][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000372096_762052608.pth... 
[2024-06-15 16:18:56,212][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000366976_751566848.pth [2024-06-15 16:18:57,807][1653645] Updated weights for policy 0, policy_version 372150 (0.0027) [2024-06-15 16:18:59,322][1653645] Updated weights for policy 0, policy_version 372211 (0.0029) [2024-06-15 16:19:00,459][1653645] Updated weights for policy 0, policy_version 372272 (0.0014) [2024-06-15 16:19:00,958][1648982] Fps is (10 sec: 52432.7, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 762445824. Throughput: 0: 10695.1. Samples: 190658560. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:19:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:19:05,041][1653645] Updated weights for policy 0, policy_version 372348 (0.0186) [2024-06-15 16:19:05,958][1648982] Fps is (10 sec: 52431.2, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 762576896. Throughput: 0: 10695.1. Samples: 190695424. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:19:05,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 16:19:09,381][1653645] Updated weights for policy 0, policy_version 372388 (0.0063) [2024-06-15 16:19:10,706][1653645] Updated weights for policy 0, policy_version 372448 (0.0013) [2024-06-15 16:19:10,960][1648982] Fps is (10 sec: 32768.0, 60 sec: 42052.3, 300 sec: 44209.1). Total num frames: 762773504. Throughput: 0: 10763.5. Samples: 190759424. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:19:10,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:19:12,569][1653645] Updated weights for policy 0, policy_version 372539 (0.0012) [2024-06-15 16:19:15,831][1653645] Updated weights for policy 0, policy_version 372578 (0.0013) [2024-06-15 16:19:15,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 42598.5, 300 sec: 44431.2). Total num frames: 763035648. Throughput: 0: 10786.1. Samples: 190827520. 
Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:19:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:19:20,380][1653645] Updated weights for policy 0, policy_version 372614 (0.0013) [2024-06-15 16:19:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 763166720. Throughput: 0: 10808.9. Samples: 190861824. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:19:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:19:21,811][1653645] Updated weights for policy 0, policy_version 372673 (0.0018) [2024-06-15 16:19:22,831][1653645] Updated weights for policy 0, policy_version 372724 (0.0017) [2024-06-15 16:19:23,092][1651596] Signal inference workers to stop experience collection... (19250 times) [2024-06-15 16:19:23,127][1653645] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-06-15 16:19:23,316][1651596] Signal inference workers to resume experience collection... (19250 times) [2024-06-15 16:19:23,317][1653645] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-06-15 16:19:24,383][1653645] Updated weights for policy 0, policy_version 372800 (0.0012) [2024-06-15 16:19:25,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 43690.9, 300 sec: 44431.1). Total num frames: 763494400. Throughput: 0: 10922.6. Samples: 190927360. Policy #0 lag: (min: 7.0, avg: 83.5, max: 263.0) [2024-06-15 16:19:25,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:19:27,809][1653645] Updated weights for policy 0, policy_version 372857 (0.0013) [2024-06-15 16:19:30,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 763625472. Throughput: 0: 10934.0. Samples: 190996480. 
Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:19:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:19:33,553][1653645] Updated weights for policy 0, policy_version 372945 (0.0012) [2024-06-15 16:19:34,353][1653645] Updated weights for policy 0, policy_version 372986 (0.0019) [2024-06-15 16:19:35,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 44782.7, 300 sec: 44764.4). Total num frames: 763985920. Throughput: 0: 11070.7. Samples: 191026176. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:19:35,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:19:38,489][1653645] Updated weights for policy 0, policy_version 373072 (0.0023) [2024-06-15 16:19:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 764149760. Throughput: 0: 10843.1. Samples: 191087104. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:19:40,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:19:44,385][1653645] Updated weights for policy 0, policy_version 373121 (0.0014) [2024-06-15 16:19:45,706][1653645] Updated weights for policy 0, policy_version 373184 (0.0013) [2024-06-15 16:19:45,958][1648982] Fps is (10 sec: 29491.3, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 764280832. Throughput: 0: 11138.8. Samples: 191159808. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:19:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:19:47,611][1653645] Updated weights for policy 0, policy_version 373264 (0.0015) [2024-06-15 16:19:48,900][1653645] Updated weights for policy 0, policy_version 373312 (0.0050) [2024-06-15 16:19:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44237.4, 300 sec: 44209.1). Total num frames: 764575744. Throughput: 0: 10831.6. Samples: 191182848. 
Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:19:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:19:55,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.8, 300 sec: 43986.8). Total num frames: 764674048. Throughput: 0: 10990.9. Samples: 191254016. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:19:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:19:57,062][1653645] Updated weights for policy 0, policy_version 373408 (0.0015) [2024-06-15 16:19:58,531][1653645] Updated weights for policy 0, policy_version 373477 (0.0013) [2024-06-15 16:19:59,875][1653645] Updated weights for policy 0, policy_version 373527 (0.0013) [2024-06-15 16:20:00,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 765067264. Throughput: 0: 10877.2. Samples: 191316992. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:20:02,765][1653645] Updated weights for policy 0, policy_version 373600 (0.0014) [2024-06-15 16:20:05,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 765198336. Throughput: 0: 10877.2. Samples: 191351296. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:20:08,694][1653645] Updated weights for policy 0, policy_version 373648 (0.0014) [2024-06-15 16:20:09,656][1651596] Signal inference workers to stop experience collection... (19300 times) [2024-06-15 16:20:09,708][1653645] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-06-15 16:20:09,866][1651596] Signal inference workers to resume experience collection... 
(19300 times) [2024-06-15 16:20:09,867][1653645] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-06-15 16:20:10,595][1653645] Updated weights for policy 0, policy_version 373728 (0.0134) [2024-06-15 16:20:10,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 765394944. Throughput: 0: 11093.4. Samples: 191426560. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:20:12,395][1653645] Updated weights for policy 0, policy_version 373796 (0.0014) [2024-06-15 16:20:14,251][1653645] Updated weights for policy 0, policy_version 373825 (0.0012) [2024-06-15 16:20:15,448][1653645] Updated weights for policy 0, policy_version 373880 (0.0012) [2024-06-15 16:20:15,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 765722624. Throughput: 0: 10763.4. Samples: 191480832. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:20:20,947][1653645] Updated weights for policy 0, policy_version 373920 (0.0013) [2024-06-15 16:20:20,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 765788160. Throughput: 0: 10968.2. Samples: 191519744. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:20:22,669][1653645] Updated weights for policy 0, policy_version 373987 (0.0013) [2024-06-15 16:20:24,094][1653645] Updated weights for policy 0, policy_version 374050 (0.0149) [2024-06-15 16:20:25,902][1653645] Updated weights for policy 0, policy_version 374096 (0.0013) [2024-06-15 16:20:25,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 766148608. Throughput: 0: 11047.8. Samples: 191584256. 
Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:20:26,963][1653645] Updated weights for policy 0, policy_version 374144 (0.0026) [2024-06-15 16:20:30,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 766246912. Throughput: 0: 10945.5. Samples: 191652352. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:20:34,241][1653645] Updated weights for policy 0, policy_version 374229 (0.0013) [2024-06-15 16:20:35,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 42598.6, 300 sec: 44098.0). Total num frames: 766541824. Throughput: 0: 11229.9. Samples: 191688192. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:35,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:20:36,110][1653645] Updated weights for policy 0, policy_version 374304 (0.0016) [2024-06-15 16:20:38,537][1653645] Updated weights for policy 0, policy_version 374362 (0.0026) [2024-06-15 16:20:39,351][1653645] Updated weights for policy 0, policy_version 374397 (0.0049) [2024-06-15 16:20:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 766771200. Throughput: 0: 10854.4. Samples: 191742464. Policy #0 lag: (min: 5.0, avg: 138.1, max: 261.0) [2024-06-15 16:20:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:20:45,797][1653645] Updated weights for policy 0, policy_version 374450 (0.0013) [2024-06-15 16:20:45,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 43144.7, 300 sec: 43653.6). Total num frames: 766869504. Throughput: 0: 11059.2. Samples: 191814656. 
Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:20:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:20:47,649][1653645] Updated weights for policy 0, policy_version 374528 (0.0012) [2024-06-15 16:20:50,883][1651596] Signal inference workers to stop experience collection... (19350 times) [2024-06-15 16:20:50,914][1653645] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-06-15 16:20:50,957][1648982] Fps is (10 sec: 39322.4, 60 sec: 43144.7, 300 sec: 43986.9). Total num frames: 767164416. Throughput: 0: 10831.7. Samples: 191838720. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:20:50,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 16:20:51,238][1651596] Signal inference workers to resume experience collection... (19350 times) [2024-06-15 16:20:51,239][1653645] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-06-15 16:20:51,243][1653645] Updated weights for policy 0, policy_version 374608 (0.0019) [2024-06-15 16:20:52,470][1653645] Updated weights for policy 0, policy_version 374656 (0.0036) [2024-06-15 16:20:55,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 767295488. Throughput: 0: 10581.3. Samples: 191902720. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:20:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:20:55,976][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000374656_767295488.pth... [2024-06-15 16:20:56,040][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000369536_756809728.pth [2024-06-15 16:20:58,480][1653645] Updated weights for policy 0, policy_version 374720 (0.0012) [2024-06-15 16:21:00,036][1653645] Updated weights for policy 0, policy_version 374786 (0.0013) [2024-06-15 16:21:00,958][1648982] Fps is (10 sec: 49149.7, 60 sec: 43144.3, 300 sec: 44320.1). Total num frames: 767655936. 
Throughput: 0: 10740.6. Samples: 191964160. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:21:03,974][1653645] Updated weights for policy 0, policy_version 374896 (0.0015) [2024-06-15 16:21:05,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 767819776. Throughput: 0: 10615.5. Samples: 191997440. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:21:10,467][1653645] Updated weights for policy 0, policy_version 374971 (0.0107) [2024-06-15 16:21:10,960][1648982] Fps is (10 sec: 29491.7, 60 sec: 42598.3, 300 sec: 43542.6). Total num frames: 767950848. Throughput: 0: 10706.5. Samples: 192066048. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:10,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:21:11,504][1653645] Updated weights for policy 0, policy_version 375012 (0.0012) [2024-06-15 16:21:13,477][1653645] Updated weights for policy 0, policy_version 375092 (0.0030) [2024-06-15 16:21:15,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 768311296. Throughput: 0: 10524.4. Samples: 192125952. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:21:15,987][1653645] Updated weights for policy 0, policy_version 375162 (0.0013) [2024-06-15 16:21:20,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 768344064. Throughput: 0: 10478.9. Samples: 192159744. 
Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:21:23,317][1653645] Updated weights for policy 0, policy_version 375235 (0.0027) [2024-06-15 16:21:24,674][1653645] Updated weights for policy 0, policy_version 375286 (0.0013) [2024-06-15 16:21:25,958][1648982] Fps is (10 sec: 36043.7, 60 sec: 42052.2, 300 sec: 43764.7). Total num frames: 768671744. Throughput: 0: 10842.9. Samples: 192230400. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:21:26,822][1653645] Updated weights for policy 0, policy_version 375392 (0.0084) [2024-06-15 16:21:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 768868352. Throughput: 0: 10592.7. Samples: 192291328. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:21:33,267][1653645] Updated weights for policy 0, policy_version 375456 (0.0182) [2024-06-15 16:21:35,958][1648982] Fps is (10 sec: 36046.0, 60 sec: 41506.1, 300 sec: 43542.6). Total num frames: 769032192. Throughput: 0: 10899.9. Samples: 192329216. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:21:36,774][1651596] Signal inference workers to stop experience collection... (19400 times) [2024-06-15 16:21:36,833][1653645] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-06-15 16:21:37,016][1651596] Signal inference workers to resume experience collection... 
(19400 times) [2024-06-15 16:21:37,016][1653645] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-06-15 16:21:37,018][1653645] Updated weights for policy 0, policy_version 375552 (0.0131) [2024-06-15 16:21:38,783][1653645] Updated weights for policy 0, policy_version 375632 (0.0062) [2024-06-15 16:21:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 769392640. Throughput: 0: 10661.0. Samples: 192382464. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:21:44,929][1653645] Updated weights for policy 0, policy_version 375696 (0.0011) [2024-06-15 16:21:45,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.8, 300 sec: 43431.5). Total num frames: 769490944. Throughput: 0: 10934.1. Samples: 192456192. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:21:47,797][1653645] Updated weights for policy 0, policy_version 375747 (0.0044) [2024-06-15 16:21:48,943][1653645] Updated weights for policy 0, policy_version 375793 (0.0038) [2024-06-15 16:21:50,548][1653645] Updated weights for policy 0, policy_version 375868 (0.0012) [2024-06-15 16:21:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 769785856. Throughput: 0: 10934.1. Samples: 192489472. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:21:52,318][1653645] Updated weights for policy 0, policy_version 375936 (0.0123) [2024-06-15 16:21:55,958][1648982] Fps is (10 sec: 42596.5, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 769916928. Throughput: 0: 10774.7. Samples: 192550912. 
Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:21:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:21:59,246][1653645] Updated weights for policy 0, policy_version 376008 (0.0013) [2024-06-15 16:22:00,687][1653645] Updated weights for policy 0, policy_version 376080 (0.0014) [2024-06-15 16:22:00,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42598.6, 300 sec: 43653.6). Total num frames: 770211840. Throughput: 0: 11104.7. Samples: 192625664. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:22:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:22:02,386][1653645] Updated weights for policy 0, policy_version 376144 (0.0012) [2024-06-15 16:22:05,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 770441216. Throughput: 0: 10945.4. Samples: 192652288. Policy #0 lag: (min: 24.0, avg: 84.5, max: 280.0) [2024-06-15 16:22:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:22:08,424][1653645] Updated weights for policy 0, policy_version 376210 (0.0016) [2024-06-15 16:22:10,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 770572288. Throughput: 0: 10911.3. Samples: 192721408. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:10,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:22:11,856][1653645] Updated weights for policy 0, policy_version 376272 (0.0014) [2024-06-15 16:22:13,977][1653645] Updated weights for policy 0, policy_version 376357 (0.0123) [2024-06-15 16:22:15,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 770932736. Throughput: 0: 10843.0. Samples: 192779264. 
Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:15,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 16:22:16,043][1653645] Updated weights for policy 0, policy_version 376442 (0.0012) [2024-06-15 16:22:20,557][1651596] Signal inference workers to stop experience collection... (19450 times) [2024-06-15 16:22:20,653][1653645] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-06-15 16:22:20,863][1651596] Signal inference workers to resume experience collection... (19450 times) [2024-06-15 16:22:20,864][1653645] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-06-15 16:22:20,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 44236.8, 300 sec: 43209.3). Total num frames: 770998272. Throughput: 0: 10695.1. Samples: 192810496. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:22:21,466][1653645] Updated weights for policy 0, policy_version 376485 (0.0013) [2024-06-15 16:22:25,326][1653645] Updated weights for policy 0, policy_version 376545 (0.0013) [2024-06-15 16:22:25,958][1648982] Fps is (10 sec: 26214.6, 60 sec: 42052.6, 300 sec: 43320.4). Total num frames: 771194880. Throughput: 0: 11104.7. Samples: 192882176. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:22:27,281][1653645] Updated weights for policy 0, policy_version 376627 (0.0012) [2024-06-15 16:22:29,067][1653645] Updated weights for policy 0, policy_version 376703 (0.0122) [2024-06-15 16:22:30,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 771489792. Throughput: 0: 10547.2. Samples: 192930816. 
Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:22:34,434][1653645] Updated weights for policy 0, policy_version 376759 (0.0018) [2024-06-15 16:22:35,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 771620864. Throughput: 0: 10615.5. Samples: 192967168. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:22:38,194][1653645] Updated weights for policy 0, policy_version 376819 (0.0021) [2024-06-15 16:22:40,074][1653645] Updated weights for policy 0, policy_version 376897 (0.0012) [2024-06-15 16:22:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 43209.3). Total num frames: 771948544. Throughput: 0: 10661.0. Samples: 193030656. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:40,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:22:45,493][1653645] Updated weights for policy 0, policy_version 376961 (0.0014) [2024-06-15 16:22:45,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 42598.3, 300 sec: 42765.1). Total num frames: 772046848. Throughput: 0: 10444.8. Samples: 193095680. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:45,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:22:50,023][1653645] Updated weights for policy 0, policy_version 377040 (0.0028) [2024-06-15 16:22:50,958][1648982] Fps is (10 sec: 29491.3, 60 sec: 40960.0, 300 sec: 42987.2). Total num frames: 772243456. Throughput: 0: 10581.3. Samples: 193128448. 
Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:22:52,605][1653645] Updated weights for policy 0, policy_version 377136 (0.0011) [2024-06-15 16:22:54,092][1653645] Updated weights for policy 0, policy_version 377210 (0.0014) [2024-06-15 16:22:55,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43690.8, 300 sec: 43098.2). Total num frames: 772538368. Throughput: 0: 10251.4. Samples: 193182720. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:22:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:22:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000377216_772538368.pth... [2024-06-15 16:22:56,026][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000372096_762052608.pth [2024-06-15 16:22:58,155][1653645] Updated weights for policy 0, policy_version 377250 (0.0027) [2024-06-15 16:23:00,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 43098.3). Total num frames: 772669440. Throughput: 0: 10615.5. Samples: 193256960. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:23:00,958][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 16:23:02,688][1653645] Updated weights for policy 0, policy_version 377304 (0.0027) [2024-06-15 16:23:04,157][1651596] Signal inference workers to stop experience collection... (19500 times) [2024-06-15 16:23:04,177][1653645] Updated weights for policy 0, policy_version 377361 (0.0014) [2024-06-15 16:23:04,216][1653645] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-06-15 16:23:04,423][1651596] Signal inference workers to resume experience collection... 
(19500 times) [2024-06-15 16:23:04,423][1653645] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-06-15 16:23:05,268][1653645] Updated weights for policy 0, policy_version 377411 (0.0013) [2024-06-15 16:23:05,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 42598.5, 300 sec: 43209.4). Total num frames: 772997120. Throughput: 0: 10717.9. Samples: 193292800. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:23:05,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 16:23:08,572][1653645] Updated weights for policy 0, policy_version 377488 (0.0012) [2024-06-15 16:23:10,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 773193728. Throughput: 0: 10569.9. Samples: 193357824. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:23:10,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 16:23:13,968][1653645] Updated weights for policy 0, policy_version 377552 (0.0011) [2024-06-15 16:23:15,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 40413.8, 300 sec: 43320.4). Total num frames: 773357568. Throughput: 0: 11047.8. Samples: 193427968. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:23:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:23:16,146][1653645] Updated weights for policy 0, policy_version 377633 (0.0012) [2024-06-15 16:23:17,988][1653645] Updated weights for policy 0, policy_version 377714 (0.0012) [2024-06-15 16:23:20,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 44236.8, 300 sec: 43320.5). Total num frames: 773652480. Throughput: 0: 10752.0. Samples: 193451008. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:23:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:23:21,234][1653645] Updated weights for policy 0, policy_version 377786 (0.0105) [2024-06-15 16:23:25,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 43320.4). Total num frames: 773783552. 
Throughput: 0: 11127.5. Samples: 193531392. Policy #0 lag: (min: 14.0, avg: 98.4, max: 270.0) [2024-06-15 16:23:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:23:26,554][1653645] Updated weights for policy 0, policy_version 377842 (0.0014) [2024-06-15 16:23:28,379][1653645] Updated weights for policy 0, policy_version 377920 (0.0012) [2024-06-15 16:23:29,561][1653645] Updated weights for policy 0, policy_version 377975 (0.0025) [2024-06-15 16:23:30,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 43690.5, 300 sec: 43431.4). Total num frames: 774111232. Throughput: 0: 10968.1. Samples: 193589248. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:23:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:23:32,496][1653645] Updated weights for policy 0, policy_version 378021 (0.0020) [2024-06-15 16:23:35,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.6, 300 sec: 43098.3). Total num frames: 774242304. Throughput: 0: 10945.4. Samples: 193620992. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:23:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:23:37,976][1653645] Updated weights for policy 0, policy_version 378067 (0.0013) [2024-06-15 16:23:39,779][1653645] Updated weights for policy 0, policy_version 378144 (0.0014) [2024-06-15 16:23:40,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 43653.6). Total num frames: 774537216. Throughput: 0: 11195.8. Samples: 193686528. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:23:40,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:23:41,828][1653645] Updated weights for policy 0, policy_version 378239 (0.0143) [2024-06-15 16:23:44,790][1651596] Signal inference workers to stop experience collection... 
(19550 times) [2024-06-15 16:23:44,861][1653645] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-06-15 16:23:45,004][1651596] Signal inference workers to resume experience collection... (19550 times) [2024-06-15 16:23:45,005][1653645] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-06-15 16:23:45,061][1653645] Updated weights for policy 0, policy_version 378290 (0.0012) [2024-06-15 16:23:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.1, 300 sec: 43542.7). Total num frames: 774766592. Throughput: 0: 10956.8. Samples: 193750016. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:23:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:23:50,268][1653645] Updated weights for policy 0, policy_version 378352 (0.0208) [2024-06-15 16:23:50,959][1648982] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 774897664. Throughput: 0: 11025.0. Samples: 193788928. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:23:50,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:23:51,628][1653645] Updated weights for policy 0, policy_version 378416 (0.0013) [2024-06-15 16:23:53,623][1653645] Updated weights for policy 0, policy_version 378494 (0.0068) [2024-06-15 16:23:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 775159808. Throughput: 0: 10797.5. Samples: 193843712. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:23:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:23:57,378][1653645] Updated weights for policy 0, policy_version 378551 (0.0013) [2024-06-15 16:24:00,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 775290880. Throughput: 0: 10911.3. Samples: 193918976. 
Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:24:02,870][1653645] Updated weights for policy 0, policy_version 378609 (0.0015) [2024-06-15 16:24:04,431][1653645] Updated weights for policy 0, policy_version 378673 (0.0126) [2024-06-15 16:24:05,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 775618560. Throughput: 0: 11047.8. Samples: 193948160. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:05,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:24:06,166][1653645] Updated weights for policy 0, policy_version 378740 (0.0018) [2024-06-15 16:24:09,169][1653645] Updated weights for policy 0, policy_version 378771 (0.0013) [2024-06-15 16:24:10,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 775815168. Throughput: 0: 10569.9. Samples: 194007040. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:24:15,245][1653645] Updated weights for policy 0, policy_version 378848 (0.0051) [2024-06-15 16:24:15,958][1648982] Fps is (10 sec: 29491.8, 60 sec: 42598.4, 300 sec: 43209.3). Total num frames: 775913472. Throughput: 0: 10786.2. Samples: 194074624. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:24:17,117][1653645] Updated weights for policy 0, policy_version 378912 (0.0012) [2024-06-15 16:24:19,186][1653645] Updated weights for policy 0, policy_version 379008 (0.0012) [2024-06-15 16:24:20,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 43098.3). Total num frames: 776208384. Throughput: 0: 10581.4. Samples: 194097152. 
Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:24:22,958][1653645] Updated weights for policy 0, policy_version 379067 (0.0063) [2024-06-15 16:24:25,962][1648982] Fps is (10 sec: 42580.8, 60 sec: 42595.5, 300 sec: 43097.7). Total num frames: 776339456. Throughput: 0: 10682.8. Samples: 194167296. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:25,962][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:24:28,632][1653645] Updated weights for policy 0, policy_version 379124 (0.0086) [2024-06-15 16:24:30,037][1651596] Signal inference workers to stop experience collection... (19600 times) [2024-06-15 16:24:30,079][1653645] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-06-15 16:24:30,221][1651596] Signal inference workers to resume experience collection... (19600 times) [2024-06-15 16:24:30,222][1653645] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-06-15 16:24:30,752][1653645] Updated weights for policy 0, policy_version 379224 (0.0014) [2024-06-15 16:24:30,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 42598.6, 300 sec: 42987.2). Total num frames: 776667136. Throughput: 0: 10547.2. Samples: 194224640. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:24:34,024][1653645] Updated weights for policy 0, policy_version 379283 (0.0020) [2024-06-15 16:24:35,958][1648982] Fps is (10 sec: 52449.9, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 776863744. Throughput: 0: 10478.9. Samples: 194260480. 
Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:35,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:24:39,104][1653645] Updated weights for policy 0, policy_version 379329 (0.0013) [2024-06-15 16:24:40,433][1653645] Updated weights for policy 0, policy_version 379383 (0.0012) [2024-06-15 16:24:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 41506.1, 300 sec: 43209.4). Total num frames: 777027584. Throughput: 0: 10786.1. Samples: 194329088. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:40,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 16:24:42,003][1653645] Updated weights for policy 0, policy_version 379456 (0.0127) [2024-06-15 16:24:45,597][1653645] Updated weights for policy 0, policy_version 379525 (0.0016) [2024-06-15 16:24:45,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 42052.1, 300 sec: 43098.2). Total num frames: 777289728. Throughput: 0: 10399.2. Samples: 194386944. Policy #0 lag: (min: 58.0, avg: 134.3, max: 330.0) [2024-06-15 16:24:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:24:46,709][1653645] Updated weights for policy 0, policy_version 379580 (0.0014) [2024-06-15 16:24:50,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 41506.2, 300 sec: 43098.3). Total num frames: 777388032. Throughput: 0: 10490.4. Samples: 194420224. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:24:50,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 16:24:52,474][1653645] Updated weights for policy 0, policy_version 379654 (0.0093) [2024-06-15 16:24:54,127][1653645] Updated weights for policy 0, policy_version 379725 (0.0012) [2024-06-15 16:24:55,508][1653645] Updated weights for policy 0, policy_version 379772 (0.0012) [2024-06-15 16:24:55,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 777781248. Throughput: 0: 10604.1. Samples: 194484224. 
Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:24:55,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:24:55,967][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000379776_777781248.pth... [2024-06-15 16:24:56,054][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000374656_767295488.pth [2024-06-15 16:24:59,236][1653645] Updated weights for policy 0, policy_version 379824 (0.0012) [2024-06-15 16:25:00,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 777912320. Throughput: 0: 10524.4. Samples: 194548224. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:25:04,586][1653645] Updated weights for policy 0, policy_version 379889 (0.0012) [2024-06-15 16:25:05,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 41506.2, 300 sec: 43098.2). Total num frames: 778108928. Throughput: 0: 10990.9. Samples: 194591744. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:25:07,328][1653645] Updated weights for policy 0, policy_version 380000 (0.0016) [2024-06-15 16:25:10,555][1653645] Updated weights for policy 0, policy_version 380067 (0.0015) [2024-06-15 16:25:10,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 778403840. Throughput: 0: 10582.3. Samples: 194643456. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:25:15,970][1648982] Fps is (10 sec: 32767.9, 60 sec: 42052.2, 300 sec: 42876.1). Total num frames: 778436608. Throughput: 0: 10899.9. Samples: 194715136. 
Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:15,971][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:25:16,686][1651596] Signal inference workers to stop experience collection... (19650 times) [2024-06-15 16:25:16,765][1653645] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-06-15 16:25:17,030][1651596] Signal inference workers to resume experience collection... (19650 times) [2024-06-15 16:25:17,032][1653645] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-06-15 16:25:17,117][1653645] Updated weights for policy 0, policy_version 380128 (0.0095) [2024-06-15 16:25:19,139][1653645] Updated weights for policy 0, policy_version 380224 (0.0015) [2024-06-15 16:25:20,377][1653645] Updated weights for policy 0, policy_version 380286 (0.0013) [2024-06-15 16:25:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 778829824. Throughput: 0: 10672.4. Samples: 194740736. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:25:22,949][1653645] Updated weights for policy 0, policy_version 380322 (0.0061) [2024-06-15 16:25:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43693.6, 300 sec: 43098.2). Total num frames: 778960896. Throughput: 0: 10638.2. Samples: 194807808. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:25:28,480][1653645] Updated weights for policy 0, policy_version 380368 (0.0043) [2024-06-15 16:25:29,808][1653645] Updated weights for policy 0, policy_version 380432 (0.0014) [2024-06-15 16:25:30,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 779223040. Throughput: 0: 10831.7. Samples: 194874368. 
Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:25:31,743][1653645] Updated weights for policy 0, policy_version 380515 (0.0013) [2024-06-15 16:25:34,798][1653645] Updated weights for policy 0, policy_version 380578 (0.0013) [2024-06-15 16:25:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 779485184. Throughput: 0: 10786.1. Samples: 194905600. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:25:40,068][1653645] Updated weights for policy 0, policy_version 380640 (0.0014) [2024-06-15 16:25:40,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 43690.4, 300 sec: 43320.4). Total num frames: 779649024. Throughput: 0: 11013.7. Samples: 194979840. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:40,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 16:25:41,687][1653645] Updated weights for policy 0, policy_version 380720 (0.0011) [2024-06-15 16:25:43,335][1653645] Updated weights for policy 0, policy_version 380784 (0.0022) [2024-06-15 16:25:45,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.8, 300 sec: 43209.3). Total num frames: 779911168. Throughput: 0: 10956.8. Samples: 195041280. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:25:46,013][1653645] Updated weights for policy 0, policy_version 380817 (0.0013) [2024-06-15 16:25:46,877][1653645] Updated weights for policy 0, policy_version 380864 (0.0011) [2024-06-15 16:25:50,958][1648982] Fps is (10 sec: 36046.2, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 780009472. Throughput: 0: 10808.9. Samples: 195078144. 
Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:25:52,900][1653645] Updated weights for policy 0, policy_version 380944 (0.0041) [2024-06-15 16:25:54,870][1653645] Updated weights for policy 0, policy_version 381008 (0.0113) [2024-06-15 16:25:54,891][1651596] Signal inference workers to stop experience collection... (19700 times) [2024-06-15 16:25:55,023][1653645] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-06-15 16:25:55,113][1651596] Signal inference workers to resume experience collection... (19700 times) [2024-06-15 16:25:55,115][1653645] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-06-15 16:25:55,742][1653645] Updated weights for policy 0, policy_version 381056 (0.0018) [2024-06-15 16:25:55,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.8, 300 sec: 43209.4). Total num frames: 780402688. Throughput: 0: 11070.6. Samples: 195141632. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:25:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:25:58,809][1653645] Updated weights for policy 0, policy_version 381120 (0.0012) [2024-06-15 16:26:00,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 780533760. Throughput: 0: 10956.8. Samples: 195208192. Policy #0 lag: (min: 31.0, avg: 169.5, max: 287.0) [2024-06-15 16:26:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:26:04,323][1653645] Updated weights for policy 0, policy_version 381187 (0.0012) [2024-06-15 16:26:05,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 43431.5). Total num frames: 780763136. Throughput: 0: 11275.4. Samples: 195248128. 
Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:05,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:26:06,578][1653645] Updated weights for policy 0, policy_version 381264 (0.0013) [2024-06-15 16:26:10,524][1653645] Updated weights for policy 0, policy_version 381329 (0.0017) [2024-06-15 16:26:10,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 780992512. Throughput: 0: 10979.6. Samples: 195301888. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:26:11,475][1653645] Updated weights for policy 0, policy_version 381376 (0.0026) [2024-06-15 16:26:15,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44236.9, 300 sec: 43209.3). Total num frames: 781090816. Throughput: 0: 11082.0. Samples: 195373056. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:26:18,062][1653645] Updated weights for policy 0, policy_version 381478 (0.0019) [2024-06-15 16:26:19,163][1653645] Updated weights for policy 0, policy_version 381520 (0.0013) [2024-06-15 16:26:20,230][1653645] Updated weights for policy 0, policy_version 381562 (0.0012) [2024-06-15 16:26:20,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 43320.5). Total num frames: 781451264. Throughput: 0: 10808.9. Samples: 195392000. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:26:23,361][1653645] Updated weights for policy 0, policy_version 381601 (0.0012) [2024-06-15 16:26:23,968][1653645] Updated weights for policy 0, policy_version 381632 (0.0024) [2024-06-15 16:26:25,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 781582336. Throughput: 0: 10786.2. Samples: 195465216. 
Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:26:28,910][1653645] Updated weights for policy 0, policy_version 381715 (0.0113) [2024-06-15 16:26:30,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44236.7, 300 sec: 43542.5). Total num frames: 781877248. Throughput: 0: 10820.2. Samples: 195528192. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:30,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:26:31,711][1653645] Updated weights for policy 0, policy_version 381808 (0.0013) [2024-06-15 16:26:35,430][1653645] Updated weights for policy 0, policy_version 381857 (0.0017) [2024-06-15 16:26:35,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 782073856. Throughput: 0: 10672.3. Samples: 195558400. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:35,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:26:38,952][1653645] Updated weights for policy 0, policy_version 381904 (0.0012) [2024-06-15 16:26:40,535][1651596] Signal inference workers to stop experience collection... (19750 times) [2024-06-15 16:26:40,605][1653645] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-06-15 16:26:40,618][1653645] Updated weights for policy 0, policy_version 381971 (0.0022) [2024-06-15 16:26:40,802][1651596] Signal inference workers to resume experience collection... (19750 times) [2024-06-15 16:26:40,802][1653645] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-06-15 16:26:40,959][1648982] Fps is (10 sec: 42599.3, 60 sec: 44237.0, 300 sec: 43431.5). Total num frames: 782303232. Throughput: 0: 10979.5. Samples: 195635712. 
Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:40,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:26:42,279][1653645] Updated weights for policy 0, policy_version 382032 (0.0040) [2024-06-15 16:26:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 782532608. Throughput: 0: 10956.8. Samples: 195701248. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:26:46,120][1653645] Updated weights for policy 0, policy_version 382102 (0.0012) [2024-06-15 16:26:50,613][1653645] Updated weights for policy 0, policy_version 382177 (0.0015) [2024-06-15 16:26:50,959][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 43431.5). Total num frames: 782729216. Throughput: 0: 10911.3. Samples: 195739136. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:50,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:26:51,780][1653645] Updated weights for policy 0, policy_version 382224 (0.0012) [2024-06-15 16:26:53,111][1653645] Updated weights for policy 0, policy_version 382271 (0.0109) [2024-06-15 16:26:54,600][1653645] Updated weights for policy 0, policy_version 382320 (0.0024) [2024-06-15 16:26:55,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 783024128. Throughput: 0: 11127.4. Samples: 195802624. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:26:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:26:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000382336_783024128.pth... 
[2024-06-15 16:26:56,059][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000377216_772538368.pth [2024-06-15 16:26:57,640][1653645] Updated weights for policy 0, policy_version 382368 (0.0012) [2024-06-15 16:27:00,970][1648982] Fps is (10 sec: 42544.6, 60 sec: 43681.5, 300 sec: 43096.4). Total num frames: 783155200. Throughput: 0: 11101.6. Samples: 195872768. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:27:00,971][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:27:01,611][1653645] Updated weights for policy 0, policy_version 382412 (0.0014) [2024-06-15 16:27:02,731][1653645] Updated weights for policy 0, policy_version 382458 (0.0016) [2024-06-15 16:27:05,177][1653645] Updated weights for policy 0, policy_version 382527 (0.0021) [2024-06-15 16:27:05,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 783417344. Throughput: 0: 11411.9. Samples: 195905536. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:27:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:27:06,949][1653645] Updated weights for policy 0, policy_version 382586 (0.0013) [2024-06-15 16:27:10,272][1653645] Updated weights for policy 0, policy_version 382647 (0.0086) [2024-06-15 16:27:10,958][1648982] Fps is (10 sec: 52495.3, 60 sec: 44782.9, 300 sec: 43209.3). Total num frames: 783679488. Throughput: 0: 11161.6. Samples: 195967488. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:27:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:27:14,063][1653645] Updated weights for policy 0, policy_version 382688 (0.0011) [2024-06-15 16:27:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45329.1, 300 sec: 43431.5). Total num frames: 783810560. Throughput: 0: 11229.9. Samples: 196033536. 
Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:27:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:27:17,562][1653645] Updated weights for policy 0, policy_version 382768 (0.0133) [2024-06-15 16:27:19,008][1653645] Updated weights for policy 0, policy_version 382832 (0.0030) [2024-06-15 16:27:20,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.7, 300 sec: 43764.7). Total num frames: 784105472. Throughput: 0: 11218.5. Samples: 196063232. Policy #0 lag: (min: 95.0, avg: 175.1, max: 367.0) [2024-06-15 16:27:20,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:27:21,297][1653645] Updated weights for policy 0, policy_version 382880 (0.0015) [2024-06-15 16:27:25,128][1653645] Updated weights for policy 0, policy_version 382917 (0.0012) [2024-06-15 16:27:25,958][1648982] Fps is (10 sec: 45873.3, 60 sec: 44782.7, 300 sec: 43320.4). Total num frames: 784269312. Throughput: 0: 11093.3. Samples: 196134912. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:27:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:27:29,110][1653645] Updated weights for policy 0, policy_version 382993 (0.0223) [2024-06-15 16:27:29,921][1651596] Signal inference workers to stop experience collection... (19800 times) [2024-06-15 16:27:29,981][1653645] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-06-15 16:27:30,216][1651596] Signal inference workers to resume experience collection... (19800 times) [2024-06-15 16:27:30,216][1653645] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-06-15 16:27:30,394][1653645] Updated weights for policy 0, policy_version 383043 (0.0013) [2024-06-15 16:27:30,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 784498688. Throughput: 0: 11002.3. Samples: 196196352. 
Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:27:30,960][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:27:33,177][1653645] Updated weights for policy 0, policy_version 383107 (0.0013) [2024-06-15 16:27:35,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 784728064. Throughput: 0: 10877.2. Samples: 196228608. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:27:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:27:37,559][1653645] Updated weights for policy 0, policy_version 383184 (0.0012) [2024-06-15 16:27:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.3, 300 sec: 43431.5). Total num frames: 784859136. Throughput: 0: 10956.8. Samples: 196295680. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:27:40,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:27:41,190][1653645] Updated weights for policy 0, policy_version 383248 (0.0013) [2024-06-15 16:27:43,457][1653645] Updated weights for policy 0, policy_version 383331 (0.0012) [2024-06-15 16:27:45,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 785186816. Throughput: 0: 10755.0. Samples: 196356608. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:27:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:27:46,574][1653645] Updated weights for policy 0, policy_version 383423 (0.0012) [2024-06-15 16:27:50,482][1653645] Updated weights for policy 0, policy_version 383472 (0.0016) [2024-06-15 16:27:50,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 785383424. Throughput: 0: 10763.4. Samples: 196389888. 
Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:27:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:27:53,747][1653645] Updated weights for policy 0, policy_version 383525 (0.0077) [2024-06-15 16:27:55,885][1653645] Updated weights for policy 0, policy_version 383611 (0.0104) [2024-06-15 16:27:55,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 43144.4, 300 sec: 43875.8). Total num frames: 785612800. Throughput: 0: 10820.2. Samples: 196454400. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:27:55,961][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 16:27:58,117][1653645] Updated weights for policy 0, policy_version 383664 (0.0015) [2024-06-15 16:28:00,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44246.0, 300 sec: 43431.5). Total num frames: 785809408. Throughput: 0: 10934.0. Samples: 196525568. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:28:01,305][1653645] Updated weights for policy 0, policy_version 383714 (0.0014) [2024-06-15 16:28:05,471][1653645] Updated weights for policy 0, policy_version 383779 (0.0013) [2024-06-15 16:28:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43144.3, 300 sec: 43431.5). Total num frames: 786006016. Throughput: 0: 11013.7. Samples: 196558848. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:05,959][1648982] Avg episode reward: [(0, '37.010')] [2024-06-15 16:28:06,414][1653645] Updated weights for policy 0, policy_version 383812 (0.0026) [2024-06-15 16:28:08,716][1653645] Updated weights for policy 0, policy_version 383875 (0.0024) [2024-06-15 16:28:10,206][1653645] Updated weights for policy 0, policy_version 383933 (0.0012) [2024-06-15 16:28:10,958][1648982] Fps is (10 sec: 49153.0, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 786300928. Throughput: 0: 10843.1. Samples: 196622848. 
Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:28:13,613][1653645] Updated weights for policy 0, policy_version 383986 (0.0012) [2024-06-15 16:28:15,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 786432000. Throughput: 0: 11036.5. Samples: 196692992. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:28:17,008][1653645] Updated weights for policy 0, policy_version 384006 (0.0015) [2024-06-15 16:28:18,025][1651596] Signal inference workers to stop experience collection... (19850 times) [2024-06-15 16:28:18,081][1653645] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-06-15 16:28:18,278][1651596] Signal inference workers to resume experience collection... (19850 times) [2024-06-15 16:28:18,278][1653645] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-06-15 16:28:18,687][1653645] Updated weights for policy 0, policy_version 384080 (0.0012) [2024-06-15 16:28:20,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 43764.7). Total num frames: 786694144. Throughput: 0: 10990.9. Samples: 196723200. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:28:21,020][1653645] Updated weights for policy 0, policy_version 384131 (0.0042) [2024-06-15 16:28:22,325][1653645] Updated weights for policy 0, policy_version 384183 (0.0013) [2024-06-15 16:28:25,887][1653645] Updated weights for policy 0, policy_version 384244 (0.0013) [2024-06-15 16:28:25,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 44236.8, 300 sec: 43431.5). Total num frames: 786923520. Throughput: 0: 11013.7. Samples: 196791296. 
Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:25,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:28:29,408][1653645] Updated weights for policy 0, policy_version 384275 (0.0041) [2024-06-15 16:28:30,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.9, 300 sec: 43653.6). Total num frames: 787120128. Throughput: 0: 11036.5. Samples: 196853248. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:28:31,237][1653645] Updated weights for policy 0, policy_version 384355 (0.0139) [2024-06-15 16:28:32,653][1653645] Updated weights for policy 0, policy_version 384388 (0.0011) [2024-06-15 16:28:35,958][1648982] Fps is (10 sec: 42599.8, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 787349504. Throughput: 0: 10911.3. Samples: 196880896. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:28:36,821][1653645] Updated weights for policy 0, policy_version 384464 (0.0011) [2024-06-15 16:28:40,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 787480576. Throughput: 0: 11047.9. Samples: 196951552. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:28:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:28:41,194][1653645] Updated weights for policy 0, policy_version 384529 (0.0130) [2024-06-15 16:28:43,793][1653645] Updated weights for policy 0, policy_version 384631 (0.0032) [2024-06-15 16:28:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 787808256. Throughput: 0: 10729.3. Samples: 197008384. 
Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:28:45,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 16:28:46,189][1653645] Updated weights for policy 0, policy_version 384695 (0.0012) [2024-06-15 16:28:50,783][1653645] Updated weights for policy 0, policy_version 384737 (0.0013) [2024-06-15 16:28:50,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 42598.2, 300 sec: 43320.4). Total num frames: 787939328. Throughput: 0: 10740.6. Samples: 197042176. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:28:50,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:28:54,278][1653645] Updated weights for policy 0, policy_version 384816 (0.0013) [2024-06-15 16:28:55,974][1648982] Fps is (10 sec: 42528.5, 60 sec: 43678.9, 300 sec: 43873.3). Total num frames: 788234240. Throughput: 0: 10793.6. Samples: 197108736. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:28:55,975][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:28:56,140][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000384896_788267008.pth... [2024-06-15 16:28:56,206][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000379776_777781248.pth [2024-06-15 16:28:57,304][1653645] Updated weights for policy 0, policy_version 384905 (0.0014) [2024-06-15 16:29:00,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43144.3, 300 sec: 43320.4). Total num frames: 788398080. Throughput: 0: 10626.8. Samples: 197171200. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:00,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 16:29:02,234][1653645] Updated weights for policy 0, policy_version 384976 (0.0014) [2024-06-15 16:29:05,301][1653645] Updated weights for policy 0, policy_version 385026 (0.0012) [2024-06-15 16:29:05,739][1651596] Signal inference workers to stop experience collection... 
(19900 times) [2024-06-15 16:29:05,785][1653645] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-06-15 16:29:05,958][1648982] Fps is (10 sec: 32822.1, 60 sec: 42598.6, 300 sec: 43209.3). Total num frames: 788561920. Throughput: 0: 10740.6. Samples: 197206528. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:05,958][1648982] Avg episode reward: [(0, '36.900')] [2024-06-15 16:29:06,014][1651596] Signal inference workers to resume experience collection... (19900 times) [2024-06-15 16:29:06,015][1653645] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-06-15 16:29:06,920][1653645] Updated weights for policy 0, policy_version 385090 (0.0013) [2024-06-15 16:29:08,392][1653645] Updated weights for policy 0, policy_version 385152 (0.0043) [2024-06-15 16:29:10,631][1653645] Updated weights for policy 0, policy_version 385209 (0.0012) [2024-06-15 16:29:10,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.2, 300 sec: 44097.9). Total num frames: 788922368. Throughput: 0: 10581.3. Samples: 197267456. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:10,960][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 16:29:14,705][1653645] Updated weights for policy 0, policy_version 385249 (0.0012) [2024-06-15 16:29:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 789053440. Throughput: 0: 10729.2. Samples: 197336064. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:29:18,391][1653645] Updated weights for policy 0, policy_version 385299 (0.0037) [2024-06-15 16:29:20,295][1653645] Updated weights for policy 0, policy_version 385379 (0.0014) [2024-06-15 16:29:20,958][1648982] Fps is (10 sec: 39323.9, 60 sec: 43690.8, 300 sec: 43987.5). Total num frames: 789315584. Throughput: 0: 10922.7. Samples: 197372416. 
Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:29:21,529][1653645] Updated weights for policy 0, policy_version 385424 (0.0014) [2024-06-15 16:29:22,496][1653645] Updated weights for policy 0, policy_version 385472 (0.0055) [2024-06-15 16:29:25,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 42052.3, 300 sec: 43320.4). Total num frames: 789446656. Throughput: 0: 10626.8. Samples: 197429760. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:29:27,485][1653645] Updated weights for policy 0, policy_version 385530 (0.0011) [2024-06-15 16:29:30,824][1653645] Updated weights for policy 0, policy_version 385569 (0.0016) [2024-06-15 16:29:30,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 42052.3, 300 sec: 43320.4). Total num frames: 789643264. Throughput: 0: 10877.2. Samples: 197497856. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:29:32,753][1653645] Updated weights for policy 0, policy_version 385648 (0.0123) [2024-06-15 16:29:34,846][1653645] Updated weights for policy 0, policy_version 385719 (0.0128) [2024-06-15 16:29:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 789970944. Throughput: 0: 10638.2. Samples: 197520896. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:29:39,343][1653645] Updated weights for policy 0, policy_version 385760 (0.0025) [2024-06-15 16:29:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.8, 300 sec: 43431.5). Total num frames: 790102016. Throughput: 0: 10847.0. Samples: 197596672. 
Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:40,958][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 16:29:41,566][1653645] Updated weights for policy 0, policy_version 385796 (0.0011) [2024-06-15 16:29:43,198][1653645] Updated weights for policy 0, policy_version 385860 (0.0013) [2024-06-15 16:29:45,819][1653645] Updated weights for policy 0, policy_version 385938 (0.0173) [2024-06-15 16:29:45,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 790396928. Throughput: 0: 10820.3. Samples: 197658112. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:45,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:29:46,894][1653645] Updated weights for policy 0, policy_version 385984 (0.0018) [2024-06-15 16:29:50,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 790560768. Throughput: 0: 10854.4. Samples: 197694976. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:29:51,333][1653645] Updated weights for policy 0, policy_version 386040 (0.0014) [2024-06-15 16:29:52,699][1651596] Signal inference workers to stop experience collection... (19950 times) [2024-06-15 16:29:52,753][1653645] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-06-15 16:29:53,020][1651596] Signal inference workers to resume experience collection... (19950 times) [2024-06-15 16:29:53,020][1653645] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-06-15 16:29:54,188][1653645] Updated weights for policy 0, policy_version 386112 (0.0015) [2024-06-15 16:29:55,962][1648982] Fps is (10 sec: 45855.4, 60 sec: 43699.4, 300 sec: 43875.1). Total num frames: 790855680. Throughput: 0: 11035.5. Samples: 197764096. 
Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:29:55,963][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:29:56,013][1653645] Updated weights for policy 0, policy_version 386176 (0.0013) [2024-06-15 16:29:57,849][1653645] Updated weights for policy 0, policy_version 386238 (0.0086) [2024-06-15 16:30:00,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 43691.0, 300 sec: 43764.7). Total num frames: 791019520. Throughput: 0: 11013.7. Samples: 197831680. Policy #0 lag: (min: 15.0, avg: 156.5, max: 271.0) [2024-06-15 16:30:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:30:03,111][1653645] Updated weights for policy 0, policy_version 386292 (0.0015) [2024-06-15 16:30:05,216][1653645] Updated weights for policy 0, policy_version 386336 (0.0136) [2024-06-15 16:30:05,958][1648982] Fps is (10 sec: 42617.2, 60 sec: 45329.0, 300 sec: 43653.6). Total num frames: 791281664. Throughput: 0: 11025.1. Samples: 197868544. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:05,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:30:06,527][1653645] Updated weights for policy 0, policy_version 386375 (0.0013) [2024-06-15 16:30:07,991][1653645] Updated weights for policy 0, policy_version 386433 (0.0013) [2024-06-15 16:30:10,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43691.0, 300 sec: 44431.2). Total num frames: 791543808. Throughput: 0: 11070.6. Samples: 197927936. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:30:13,287][1653645] Updated weights for policy 0, policy_version 386497 (0.0012) [2024-06-15 16:30:14,619][1653645] Updated weights for policy 0, policy_version 386549 (0.0017) [2024-06-15 16:30:15,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 791674880. Throughput: 0: 11241.2. Samples: 198003712. 
Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:15,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:30:17,400][1653645] Updated weights for policy 0, policy_version 386608 (0.0014) [2024-06-15 16:30:19,773][1653645] Updated weights for policy 0, policy_version 386672 (0.0013) [2024-06-15 16:30:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 791969792. Throughput: 0: 11343.7. Samples: 198031360. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:30:21,603][1653645] Updated weights for policy 0, policy_version 386743 (0.0014) [2024-06-15 16:30:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 792068096. Throughput: 0: 11070.6. Samples: 198094848. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:30:26,579][1653645] Updated weights for policy 0, policy_version 386800 (0.0022) [2024-06-15 16:30:29,551][1653645] Updated weights for policy 0, policy_version 386877 (0.0015) [2024-06-15 16:30:30,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45328.9, 300 sec: 43653.6). Total num frames: 792363008. Throughput: 0: 11355.0. Samples: 198169088. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:30,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:30:31,690][1653645] Updated weights for policy 0, policy_version 386932 (0.0015) [2024-06-15 16:30:33,586][1653645] Updated weights for policy 0, policy_version 387003 (0.0013) [2024-06-15 16:30:35,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 792592384. Throughput: 0: 10968.2. Samples: 198188544. 
Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:30:37,849][1651596] Signal inference workers to stop experience collection... (20000 times) [2024-06-15 16:30:37,935][1653645] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-06-15 16:30:38,088][1651596] Signal inference workers to resume experience collection... (20000 times) [2024-06-15 16:30:38,089][1653645] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-06-15 16:30:38,265][1653645] Updated weights for policy 0, policy_version 387064 (0.0015) [2024-06-15 16:30:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 792723456. Throughput: 0: 11253.7. Samples: 198270464. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:30:41,605][1653645] Updated weights for policy 0, policy_version 387106 (0.0129) [2024-06-15 16:30:43,727][1653645] Updated weights for policy 0, policy_version 387200 (0.0078) [2024-06-15 16:30:45,225][1653645] Updated weights for policy 0, policy_version 387264 (0.0046) [2024-06-15 16:30:45,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 793116672. Throughput: 0: 10888.5. Samples: 198321664. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:30:49,908][1653645] Updated weights for policy 0, policy_version 387328 (0.0012) [2024-06-15 16:30:50,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 44783.0, 300 sec: 43542.6). Total num frames: 793247744. Throughput: 0: 11013.7. Samples: 198364160. 
Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:30:54,920][1653645] Updated weights for policy 0, policy_version 387397 (0.0152) [2024-06-15 16:30:55,960][1648982] Fps is (10 sec: 36036.6, 60 sec: 43692.2, 300 sec: 43875.5). Total num frames: 793477120. Throughput: 0: 11058.7. Samples: 198425600. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:30:55,960][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 16:30:56,615][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000387472_793542656.pth... [2024-06-15 16:30:56,754][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000382336_783024128.pth [2024-06-15 16:30:56,760][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000387472_793542656.pth [2024-06-15 16:30:57,319][1653645] Updated weights for policy 0, policy_version 387491 (0.0142) [2024-06-15 16:31:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 793640960. Throughput: 0: 10786.1. Samples: 198489088. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:31:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:31:02,248][1653645] Updated weights for policy 0, policy_version 387556 (0.0011) [2024-06-15 16:31:05,958][1648982] Fps is (10 sec: 32775.4, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 793804800. Throughput: 0: 10899.9. Samples: 198521856. 
Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:31:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:31:06,003][1653645] Updated weights for policy 0, policy_version 387616 (0.0013) [2024-06-15 16:31:07,844][1653645] Updated weights for policy 0, policy_version 387680 (0.0013) [2024-06-15 16:31:09,533][1653645] Updated weights for policy 0, policy_version 387744 (0.0044) [2024-06-15 16:31:10,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 794165248. Throughput: 0: 10808.8. Samples: 198581248. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:31:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:31:13,531][1653645] Updated weights for policy 0, policy_version 387778 (0.0014) [2024-06-15 16:31:14,390][1653645] Updated weights for policy 0, policy_version 387828 (0.0013) [2024-06-15 16:31:15,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 794296320. Throughput: 0: 10797.5. Samples: 198654976. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:31:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:31:18,171][1653645] Updated weights for policy 0, policy_version 387897 (0.0013) [2024-06-15 16:31:20,887][1653645] Updated weights for policy 0, policy_version 387984 (0.0092) [2024-06-15 16:31:20,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 794591232. Throughput: 0: 11082.0. Samples: 198687232. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:31:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:31:21,303][1651596] Signal inference workers to stop experience collection... (20050 times) [2024-06-15 16:31:21,373][1653645] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-06-15 16:31:21,552][1651596] Signal inference workers to resume experience collection... 
(20050 times) [2024-06-15 16:31:21,553][1653645] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-06-15 16:31:25,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.5, 300 sec: 43431.5). Total num frames: 794689536. Throughput: 0: 10592.7. Samples: 198747136. Policy #0 lag: (min: 15.0, avg: 110.8, max: 271.0) [2024-06-15 16:31:25,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:31:26,500][1653645] Updated weights for policy 0, policy_version 388064 (0.0023) [2024-06-15 16:31:30,453][1653645] Updated weights for policy 0, policy_version 388114 (0.0012) [2024-06-15 16:31:30,957][1648982] Fps is (10 sec: 29491.6, 60 sec: 42052.5, 300 sec: 43431.5). Total num frames: 794886144. Throughput: 0: 10831.7. Samples: 198809088. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:31:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:31:32,302][1653645] Updated weights for policy 0, policy_version 388181 (0.0039) [2024-06-15 16:31:33,928][1653645] Updated weights for policy 0, policy_version 388256 (0.0014) [2024-06-15 16:31:35,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 795213824. Throughput: 0: 10467.6. Samples: 198835200. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:31:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:31:39,619][1653645] Updated weights for policy 0, policy_version 388321 (0.0013) [2024-06-15 16:31:40,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.8, 300 sec: 43431.5). Total num frames: 795344896. Throughput: 0: 10604.6. Samples: 198902784. 
Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:31:40,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 16:31:43,639][1653645] Updated weights for policy 0, policy_version 388384 (0.0014) [2024-06-15 16:31:45,707][1653645] Updated weights for policy 0, policy_version 388464 (0.0044) [2024-06-15 16:31:45,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 40959.9, 300 sec: 43542.6). Total num frames: 795574272. Throughput: 0: 10626.8. Samples: 198967296. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:31:45,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:31:47,635][1653645] Updated weights for policy 0, policy_version 388538 (0.0014) [2024-06-15 16:31:50,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 43098.3). Total num frames: 795738112. Throughput: 0: 10422.1. Samples: 198990848. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:31:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:31:52,662][1653645] Updated weights for policy 0, policy_version 388607 (0.0014) [2024-06-15 16:31:55,960][1648982] Fps is (10 sec: 32759.3, 60 sec: 40413.6, 300 sec: 43210.8). Total num frames: 795901952. Throughput: 0: 10717.3. Samples: 199063552. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:31:55,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:31:56,728][1653645] Updated weights for policy 0, policy_version 388664 (0.0013) [2024-06-15 16:31:58,365][1653645] Updated weights for policy 0, policy_version 388722 (0.0014) [2024-06-15 16:32:00,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 796262400. Throughput: 0: 10240.0. Samples: 199115776. 
Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:32:04,063][1653645] Updated weights for policy 0, policy_version 388817 (0.0013) [2024-06-15 16:32:05,182][1653645] Updated weights for policy 0, policy_version 388864 (0.0026) [2024-06-15 16:32:05,958][1648982] Fps is (10 sec: 49164.1, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 796393472. Throughput: 0: 10467.5. Samples: 199158272. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:32:09,752][1653645] Updated weights for policy 0, policy_version 388944 (0.0012) [2024-06-15 16:32:09,890][1651596] Signal inference workers to stop experience collection... (20100 times) [2024-06-15 16:32:09,937][1653645] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-06-15 16:32:10,071][1651596] Signal inference workers to resume experience collection... (20100 times) [2024-06-15 16:32:10,072][1653645] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-06-15 16:32:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 43542.5). Total num frames: 796655616. Throughput: 0: 10467.6. Samples: 199218176. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:10,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:32:11,171][1653645] Updated weights for policy 0, policy_version 389008 (0.0018) [2024-06-15 16:32:15,588][1653645] Updated weights for policy 0, policy_version 389059 (0.0014) [2024-06-15 16:32:15,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 43098.3). Total num frames: 796819456. Throughput: 0: 10604.0. Samples: 199286272. 
Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:15,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:32:17,091][1653645] Updated weights for policy 0, policy_version 389119 (0.0011) [2024-06-15 16:32:20,958][1648982] Fps is (10 sec: 32766.9, 60 sec: 39867.4, 300 sec: 43098.2). Total num frames: 796983296. Throughput: 0: 10740.5. Samples: 199318528. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:20,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:32:22,196][1653645] Updated weights for policy 0, policy_version 389204 (0.0013) [2024-06-15 16:32:24,440][1653645] Updated weights for policy 0, policy_version 389307 (0.0012) [2024-06-15 16:32:25,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.8, 300 sec: 43431.5). Total num frames: 797310976. Throughput: 0: 10353.8. Samples: 199368704. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:32:28,647][1653645] Updated weights for policy 0, policy_version 389373 (0.0085) [2024-06-15 16:32:30,958][1648982] Fps is (10 sec: 45877.1, 60 sec: 42598.3, 300 sec: 43098.3). Total num frames: 797442048. Throughput: 0: 10558.6. Samples: 199442432. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:32:33,822][1653645] Updated weights for policy 0, policy_version 389428 (0.0014) [2024-06-15 16:32:35,047][1653645] Updated weights for policy 0, policy_version 389477 (0.0091) [2024-06-15 16:32:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 43653.7). Total num frames: 797736960. Throughput: 0: 10888.5. Samples: 199480832. 
Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:32:36,818][1653645] Updated weights for policy 0, policy_version 389556 (0.0023) [2024-06-15 16:32:39,050][1653645] Updated weights for policy 0, policy_version 389600 (0.0014) [2024-06-15 16:32:40,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 43320.4). Total num frames: 797966336. Throughput: 0: 10695.7. Samples: 199544832. Policy #0 lag: (min: 14.0, avg: 111.9, max: 270.0) [2024-06-15 16:32:40,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:32:44,221][1653645] Updated weights for policy 0, policy_version 389648 (0.0011) [2024-06-15 16:32:45,971][1648982] Fps is (10 sec: 42540.6, 60 sec: 43134.8, 300 sec: 43318.4). Total num frames: 798162944. Throughput: 0: 11101.4. Samples: 199615488. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:32:45,973][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:32:46,396][1653645] Updated weights for policy 0, policy_version 389744 (0.0055) [2024-06-15 16:32:47,789][1653645] Updated weights for policy 0, policy_version 389795 (0.0013) [2024-06-15 16:32:50,573][1653645] Updated weights for policy 0, policy_version 389840 (0.0040) [2024-06-15 16:32:50,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.7, 300 sec: 43320.4). Total num frames: 798392320. Throughput: 0: 10786.2. Samples: 199643648. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:32:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:32:51,182][1651596] Signal inference workers to stop experience collection... (20150 times) [2024-06-15 16:32:51,239][1653645] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-06-15 16:32:51,431][1651596] Signal inference workers to resume experience collection... 
(20150 times) [2024-06-15 16:32:51,432][1653645] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-06-15 16:32:51,734][1653645] Updated weights for policy 0, policy_version 389886 (0.0011) [2024-06-15 16:32:55,958][1648982] Fps is (10 sec: 39374.9, 60 sec: 44238.7, 300 sec: 43209.3). Total num frames: 798556160. Throughput: 0: 11229.9. Samples: 199723520. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:32:55,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:32:56,511][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000389952_798621696.pth... [2024-06-15 16:32:56,645][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000384896_788267008.pth [2024-06-15 16:32:57,336][1653645] Updated weights for policy 0, policy_version 389984 (0.0014) [2024-06-15 16:32:58,930][1653645] Updated weights for policy 0, policy_version 390048 (0.0016) [2024-06-15 16:33:00,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 43690.6, 300 sec: 43653.7). Total num frames: 798883840. Throughput: 0: 10854.4. Samples: 199774720. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:00,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:33:02,782][1653645] Updated weights for policy 0, policy_version 390112 (0.0014) [2024-06-15 16:33:05,960][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.8, 300 sec: 43098.2). Total num frames: 799014912. Throughput: 0: 10934.1. Samples: 199810560. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:05,961][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:33:08,484][1653645] Updated weights for policy 0, policy_version 390180 (0.0013) [2024-06-15 16:33:10,369][1653645] Updated weights for policy 0, policy_version 390256 (0.0015) [2024-06-15 16:33:10,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 799277056. Throughput: 0: 11355.0. Samples: 199879680. 
Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:33:11,807][1653645] Updated weights for policy 0, policy_version 390323 (0.0013) [2024-06-15 16:33:14,996][1653645] Updated weights for policy 0, policy_version 390384 (0.0015) [2024-06-15 16:33:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.1, 300 sec: 43542.6). Total num frames: 799539200. Throughput: 0: 11036.4. Samples: 199939072. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:33:20,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 44237.0, 300 sec: 43098.3). Total num frames: 799637504. Throughput: 0: 11059.2. Samples: 199978496. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:20,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:33:21,149][1653645] Updated weights for policy 0, policy_version 390466 (0.0126) [2024-06-15 16:33:22,680][1653645] Updated weights for policy 0, policy_version 390528 (0.0129) [2024-06-15 16:33:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 799932416. Throughput: 0: 10843.1. Samples: 200032768. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:33:26,634][1653645] Updated weights for policy 0, policy_version 390593 (0.0013) [2024-06-15 16:33:30,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 800063488. Throughput: 0: 11005.6. Samples: 200110592. 
Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:33:31,415][1653645] Updated weights for policy 0, policy_version 390659 (0.0016) [2024-06-15 16:33:33,017][1653645] Updated weights for policy 0, policy_version 390736 (0.0013) [2024-06-15 16:33:35,303][1653645] Updated weights for policy 0, policy_version 390816 (0.0013) [2024-06-15 16:33:35,452][1651596] Signal inference workers to stop experience collection... (20200 times) [2024-06-15 16:33:35,497][1653645] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-06-15 16:33:35,671][1651596] Signal inference workers to resume experience collection... (20200 times) [2024-06-15 16:33:35,672][1653645] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-06-15 16:33:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 800456704. Throughput: 0: 11002.3. Samples: 200138752. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:33:39,006][1653645] Updated weights for policy 0, policy_version 390880 (0.0012) [2024-06-15 16:33:39,733][1653645] Updated weights for policy 0, policy_version 390912 (0.0013) [2024-06-15 16:33:40,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.8, 300 sec: 43320.4). Total num frames: 800587776. Throughput: 0: 10683.8. Samples: 200204288. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:33:45,346][1653645] Updated weights for policy 0, policy_version 391013 (0.0012) [2024-06-15 16:33:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44793.1, 300 sec: 43764.8). Total num frames: 800849920. Throughput: 0: 10979.6. Samples: 200268800. 
Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:33:46,847][1653645] Updated weights for policy 0, policy_version 391076 (0.0127) [2024-06-15 16:33:50,124][1653645] Updated weights for policy 0, policy_version 391124 (0.0011) [2024-06-15 16:33:50,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 44782.9, 300 sec: 43545.0). Total num frames: 801079296. Throughput: 0: 11059.2. Samples: 200308224. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:33:54,876][1653645] Updated weights for policy 0, policy_version 391189 (0.0017) [2024-06-15 16:33:55,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 44782.8, 300 sec: 43542.6). Total num frames: 801243136. Throughput: 0: 11241.2. Samples: 200385536. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:33:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:33:56,147][1653645] Updated weights for policy 0, policy_version 391248 (0.0012) [2024-06-15 16:33:57,524][1653645] Updated weights for policy 0, policy_version 391313 (0.0014) [2024-06-15 16:34:00,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 801505280. Throughput: 0: 11275.4. Samples: 200446464. Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:34:00,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:34:01,360][1653645] Updated weights for policy 0, policy_version 391392 (0.0014) [2024-06-15 16:34:05,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 801636352. Throughput: 0: 11116.1. Samples: 200478720. 
Policy #0 lag: (min: 93.0, avg: 159.2, max: 349.0) [2024-06-15 16:34:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:34:06,374][1653645] Updated weights for policy 0, policy_version 391427 (0.0014) [2024-06-15 16:34:07,842][1653645] Updated weights for policy 0, policy_version 391504 (0.0014) [2024-06-15 16:34:08,830][1653645] Updated weights for policy 0, policy_version 391552 (0.0012) [2024-06-15 16:34:10,191][1653645] Updated weights for policy 0, policy_version 391612 (0.0018) [2024-06-15 16:34:10,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 802029568. Throughput: 0: 11468.8. Samples: 200548864. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:34:14,461][1653645] Updated weights for policy 0, policy_version 391673 (0.0013) [2024-06-15 16:34:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 802160640. Throughput: 0: 11161.6. Samples: 200612864. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:15,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:34:18,184][1653645] Updated weights for policy 0, policy_version 391717 (0.0012) [2024-06-15 16:34:20,398][1651596] Signal inference workers to stop experience collection... (20250 times) [2024-06-15 16:34:20,431][1653645] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-06-15 16:34:20,630][1651596] Signal inference workers to resume experience collection... (20250 times) [2024-06-15 16:34:20,631][1653645] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-06-15 16:34:20,949][1653645] Updated weights for policy 0, policy_version 391808 (0.0141) [2024-06-15 16:34:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 46421.5, 300 sec: 43986.9). Total num frames: 802422784. Throughput: 0: 11468.8. Samples: 200654848. 
Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:34:22,339][1653645] Updated weights for policy 0, policy_version 391862 (0.0017) [2024-06-15 16:34:25,365][1653645] Updated weights for policy 0, policy_version 391920 (0.0012) [2024-06-15 16:34:25,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 44209.0). Total num frames: 802684928. Throughput: 0: 11355.0. Samples: 200715264. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:34:29,408][1653645] Updated weights for policy 0, policy_version 391969 (0.0016) [2024-06-15 16:34:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45875.2, 300 sec: 43542.6). Total num frames: 802816000. Throughput: 0: 11537.0. Samples: 200787968. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:34:32,856][1653645] Updated weights for policy 0, policy_version 392064 (0.0094) [2024-06-15 16:34:34,388][1653645] Updated weights for policy 0, policy_version 392128 (0.0129) [2024-06-15 16:34:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 803078144. Throughput: 0: 11264.0. Samples: 200815104. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:34:37,423][1653645] Updated weights for policy 0, policy_version 392189 (0.0015) [2024-06-15 16:34:40,962][1648982] Fps is (10 sec: 45854.0, 60 sec: 44779.4, 300 sec: 43653.0). Total num frames: 803274752. Throughput: 0: 11046.7. Samples: 200882688. 
Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:40,963][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:34:41,498][1653645] Updated weights for policy 0, policy_version 392246 (0.0013) [2024-06-15 16:34:43,385][1653645] Updated weights for policy 0, policy_version 392277 (0.0027) [2024-06-15 16:34:45,559][1653645] Updated weights for policy 0, policy_version 392368 (0.0221) [2024-06-15 16:34:45,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 44209.1). Total num frames: 803602432. Throughput: 0: 11195.7. Samples: 200950272. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:45,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:34:47,935][1653645] Updated weights for policy 0, policy_version 392416 (0.0016) [2024-06-15 16:34:50,958][1648982] Fps is (10 sec: 45895.8, 60 sec: 44236.7, 300 sec: 43654.3). Total num frames: 803733504. Throughput: 0: 11218.5. Samples: 200983552. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:34:52,209][1653645] Updated weights for policy 0, policy_version 392466 (0.0013) [2024-06-15 16:34:55,054][1653645] Updated weights for policy 0, policy_version 392528 (0.0015) [2024-06-15 16:34:55,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 45329.2, 300 sec: 43875.8). Total num frames: 803962880. Throughput: 0: 11332.3. Samples: 201058816. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:34:55,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 16:34:56,742][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000392592_804028416.pth... 
[2024-06-15 16:34:56,933][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000387472_793542656.pth [2024-06-15 16:34:57,787][1653645] Updated weights for policy 0, policy_version 392624 (0.0012) [2024-06-15 16:35:00,549][1653645] Updated weights for policy 0, policy_version 392693 (0.0014) [2024-06-15 16:35:00,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 804257792. Throughput: 0: 10968.2. Samples: 201106432. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:35:00,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 16:35:05,041][1653645] Updated weights for policy 0, policy_version 392742 (0.0031) [2024-06-15 16:35:05,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 45875.0, 300 sec: 43542.5). Total num frames: 804388864. Throughput: 0: 10922.6. Samples: 201146368. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:35:05,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:35:07,307][1651596] Signal inference workers to stop experience collection... (20300 times) [2024-06-15 16:35:07,372][1653645] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-06-15 16:35:07,387][1653645] Updated weights for policy 0, policy_version 392786 (0.0024) [2024-06-15 16:35:07,674][1651596] Signal inference workers to resume experience collection... (20300 times) [2024-06-15 16:35:07,675][1653645] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-06-15 16:35:09,245][1653645] Updated weights for policy 0, policy_version 392864 (0.0013) [2024-06-15 16:35:10,341][1653645] Updated weights for policy 0, policy_version 392899 (0.0013) [2024-06-15 16:35:10,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 804716544. Throughput: 0: 11218.5. Samples: 201220096. 
Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:35:10,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:35:15,817][1653645] Updated weights for policy 0, policy_version 392963 (0.0013) [2024-06-15 16:35:15,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 804782080. Throughput: 0: 11173.0. Samples: 201290752. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:35:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:35:18,463][1653645] Updated weights for policy 0, policy_version 393040 (0.0015) [2024-06-15 16:35:19,718][1653645] Updated weights for policy 0, policy_version 393085 (0.0018) [2024-06-15 16:35:20,734][1653645] Updated weights for policy 0, policy_version 393122 (0.0012) [2024-06-15 16:35:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 805109760. Throughput: 0: 11355.0. Samples: 201326080. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:35:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:35:21,537][1653645] Updated weights for policy 0, policy_version 393156 (0.0013) [2024-06-15 16:35:25,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 805306368. Throughput: 0: 11231.0. Samples: 201388032. Policy #0 lag: (min: 4.0, avg: 66.8, max: 260.0) [2024-06-15 16:35:25,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:35:27,834][1653645] Updated weights for policy 0, policy_version 393232 (0.0014) [2024-06-15 16:35:30,312][1653645] Updated weights for policy 0, policy_version 393312 (0.0016) [2024-06-15 16:35:30,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 45329.2, 300 sec: 43875.8). Total num frames: 805535744. Throughput: 0: 11332.3. Samples: 201460224. 
Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:35:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:35:31,961][1653645] Updated weights for policy 0, policy_version 393376 (0.0012) [2024-06-15 16:35:34,500][1653645] Updated weights for policy 0, policy_version 393462 (0.0122) [2024-06-15 16:35:35,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 805830656. Throughput: 0: 11252.7. Samples: 201489920. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:35:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:35:39,711][1653645] Updated weights for policy 0, policy_version 393506 (0.0014) [2024-06-15 16:35:40,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44786.5, 300 sec: 43542.6). Total num frames: 805961728. Throughput: 0: 11138.9. Samples: 201560064. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:35:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:35:42,484][1653645] Updated weights for policy 0, policy_version 393568 (0.0012) [2024-06-15 16:35:44,094][1653645] Updated weights for policy 0, policy_version 393639 (0.0013) [2024-06-15 16:35:45,109][1653645] Updated weights for policy 0, policy_version 393668 (0.0011) [2024-06-15 16:35:45,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 44782.7, 300 sec: 44209.0). Total num frames: 806289408. Throughput: 0: 11480.1. Samples: 201623040. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:35:45,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:35:46,374][1653645] Updated weights for policy 0, policy_version 393723 (0.0133) [2024-06-15 16:35:50,958][1648982] Fps is (10 sec: 39319.8, 60 sec: 43690.5, 300 sec: 43653.9). Total num frames: 806354944. Throughput: 0: 11355.0. Samples: 201657344. 
Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:35:50,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:35:51,316][1651596] Signal inference workers to stop experience collection... (20350 times) [2024-06-15 16:35:51,354][1653645] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-06-15 16:35:51,540][1651596] Signal inference workers to resume experience collection... (20350 times) [2024-06-15 16:35:51,541][1653645] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-06-15 16:35:52,167][1653645] Updated weights for policy 0, policy_version 393781 (0.0013) [2024-06-15 16:35:54,300][1653645] Updated weights for policy 0, policy_version 393825 (0.0013) [2024-06-15 16:35:55,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 806682624. Throughput: 0: 11309.5. Samples: 201729024. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:35:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:35:56,024][1653645] Updated weights for policy 0, policy_version 393904 (0.0160) [2024-06-15 16:35:57,322][1653645] Updated weights for policy 0, policy_version 393952 (0.0011) [2024-06-15 16:36:00,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 806879232. Throughput: 0: 11002.3. Samples: 201785856. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:36:03,927][1653645] Updated weights for policy 0, policy_version 394021 (0.0014) [2024-06-15 16:36:05,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.9, 300 sec: 43542.6). Total num frames: 807010304. Throughput: 0: 11059.2. Samples: 201823744. 
Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:36:07,106][1653645] Updated weights for policy 0, policy_version 394083 (0.0013) [2024-06-15 16:36:09,269][1653645] Updated weights for policy 0, policy_version 394176 (0.0012) [2024-06-15 16:36:10,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 807403520. Throughput: 0: 10888.6. Samples: 201878016. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:36:15,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.5, 300 sec: 43431.5). Total num frames: 807403520. Throughput: 0: 10786.1. Samples: 201945600. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:15,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:36:16,124][1653645] Updated weights for policy 0, policy_version 394256 (0.0015) [2024-06-15 16:36:17,154][1653645] Updated weights for policy 0, policy_version 394304 (0.0015) [2024-06-15 16:36:20,780][1653645] Updated weights for policy 0, policy_version 394400 (0.0013) [2024-06-15 16:36:20,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 807731200. Throughput: 0: 11002.3. Samples: 201985024. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:36:22,247][1653645] Updated weights for policy 0, policy_version 394467 (0.0017) [2024-06-15 16:36:25,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 807927808. Throughput: 0: 10763.4. Samples: 202044416. 
Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:36:28,020][1653645] Updated weights for policy 0, policy_version 394528 (0.0018) [2024-06-15 16:36:30,593][1653645] Updated weights for policy 0, policy_version 394564 (0.0013) [2024-06-15 16:36:30,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42598.3, 300 sec: 43653.6). Total num frames: 808091648. Throughput: 0: 10922.7. Samples: 202114560. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:30,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 16:36:32,255][1653645] Updated weights for policy 0, policy_version 394640 (0.0013) [2024-06-15 16:36:33,036][1651596] Signal inference workers to stop experience collection... (20400 times) [2024-06-15 16:36:33,068][1653645] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-06-15 16:36:33,250][1651596] Signal inference workers to resume experience collection... (20400 times) [2024-06-15 16:36:33,251][1653645] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-06-15 16:36:34,037][1653645] Updated weights for policy 0, policy_version 394720 (0.0012) [2024-06-15 16:36:35,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 808452096. Throughput: 0: 10831.7. Samples: 202144768. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:36:39,847][1653645] Updated weights for policy 0, policy_version 394768 (0.0130) [2024-06-15 16:36:40,933][1653645] Updated weights for policy 0, policy_version 394808 (0.0016) [2024-06-15 16:36:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 43986.9). Total num frames: 808550400. Throughput: 0: 10717.8. Samples: 202211328. 
Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:36:43,231][1653645] Updated weights for policy 0, policy_version 394852 (0.0027) [2024-06-15 16:36:45,505][1653645] Updated weights for policy 0, policy_version 394946 (0.0114) [2024-06-15 16:36:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43144.7, 300 sec: 44542.3). Total num frames: 808878080. Throughput: 0: 10774.8. Samples: 202270720. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:36:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.9, 300 sec: 44320.5). Total num frames: 808976384. Throughput: 0: 10592.7. Samples: 202300416. Policy #0 lag: (min: 3.0, avg: 83.4, max: 259.0) [2024-06-15 16:36:50,960][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:36:51,829][1653645] Updated weights for policy 0, policy_version 395024 (0.0012) [2024-06-15 16:36:54,972][1653645] Updated weights for policy 0, policy_version 395091 (0.0019) [2024-06-15 16:36:55,958][1648982] Fps is (10 sec: 36043.5, 60 sec: 42598.1, 300 sec: 43986.8). Total num frames: 809238528. Throughput: 0: 11002.2. Samples: 202373120. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:36:55,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:36:56,167][1653645] Updated weights for policy 0, policy_version 395152 (0.0013) [2024-06-15 16:36:56,588][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000395168_809304064.pth... [2024-06-15 16:36:56,729][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000389952_798621696.pth [2024-06-15 16:36:58,436][1653645] Updated weights for policy 0, policy_version 395232 (0.0025) [2024-06-15 16:37:00,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 809500672. Throughput: 0: 10820.3. Samples: 202432512. 
Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:00,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:37:04,387][1653645] Updated weights for policy 0, policy_version 395299 (0.0014) [2024-06-15 16:37:05,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 809631744. Throughput: 0: 10797.5. Samples: 202470912. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:37:07,529][1653645] Updated weights for policy 0, policy_version 395376 (0.0013) [2024-06-15 16:37:09,179][1653645] Updated weights for policy 0, policy_version 395442 (0.0066) [2024-06-15 16:37:10,958][1648982] Fps is (10 sec: 49153.4, 60 sec: 43144.5, 300 sec: 44653.4). Total num frames: 809992192. Throughput: 0: 10843.0. Samples: 202532352. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:10,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:37:11,060][1653645] Updated weights for policy 0, policy_version 395517 (0.0018) [2024-06-15 16:37:15,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.6, 300 sec: 44209.1). Total num frames: 810024960. Throughput: 0: 10831.6. Samples: 202601984. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:15,961][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:37:19,167][1653645] Updated weights for policy 0, policy_version 395586 (0.0012) [2024-06-15 16:37:20,134][1651596] Signal inference workers to stop experience collection... (20450 times) [2024-06-15 16:37:20,205][1653645] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-06-15 16:37:20,384][1651596] Signal inference workers to resume experience collection... (20450 times) [2024-06-15 16:37:20,385][1653645] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-06-15 16:37:20,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 42598.4, 300 sec: 43986.9). 
Total num frames: 810287104. Throughput: 0: 10854.4. Samples: 202633216. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:20,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:37:21,075][1653645] Updated weights for policy 0, policy_version 395664 (0.0014) [2024-06-15 16:37:22,477][1653645] Updated weights for policy 0, policy_version 395717 (0.0016) [2024-06-15 16:37:23,529][1653645] Updated weights for policy 0, policy_version 395772 (0.0034) [2024-06-15 16:37:25,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 810549248. Throughput: 0: 10774.8. Samples: 202696192. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:25,958][1648982] Avg episode reward: [(0, '37.110')] [2024-06-15 16:37:28,242][1653645] Updated weights for policy 0, policy_version 395824 (0.0016) [2024-06-15 16:37:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 810713088. Throughput: 0: 11207.1. Samples: 202775040. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:37:31,366][1653645] Updated weights for policy 0, policy_version 395872 (0.0014) [2024-06-15 16:37:33,134][1653645] Updated weights for policy 0, policy_version 395942 (0.0114) [2024-06-15 16:37:34,878][1653645] Updated weights for policy 0, policy_version 396016 (0.0014) [2024-06-15 16:37:35,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 811073536. Throughput: 0: 11070.6. Samples: 202798592. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:37:39,175][1653645] Updated weights for policy 0, policy_version 396064 (0.0014) [2024-06-15 16:37:40,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 44236.8, 300 sec: 44211.1). Total num frames: 811204608. Throughput: 0: 11070.6. 
Samples: 202871296. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:37:43,596][1653645] Updated weights for policy 0, policy_version 396129 (0.0013) [2024-06-15 16:37:45,462][1653645] Updated weights for policy 0, policy_version 396210 (0.0015) [2024-06-15 16:37:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 44320.1). Total num frames: 811466752. Throughput: 0: 11161.6. Samples: 202934784. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:45,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:37:47,365][1653645] Updated weights for policy 0, policy_version 396288 (0.0149) [2024-06-15 16:37:50,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 811597824. Throughput: 0: 10922.7. Samples: 202962432. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:37:53,011][1653645] Updated weights for policy 0, policy_version 396346 (0.0015) [2024-06-15 16:37:55,958][1648982] Fps is (10 sec: 29491.6, 60 sec: 42052.5, 300 sec: 43653.7). Total num frames: 811761664. Throughput: 0: 11036.4. Samples: 203028992. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:37:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:37:56,719][1653645] Updated weights for policy 0, policy_version 396400 (0.0013) [2024-06-15 16:37:58,263][1653645] Updated weights for policy 0, policy_version 396464 (0.0014) [2024-06-15 16:37:59,405][1651596] Signal inference workers to stop experience collection... (20500 times) [2024-06-15 16:37:59,467][1653645] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-06-15 16:37:59,563][1651596] Signal inference workers to resume experience collection... 
(20500 times) [2024-06-15 16:37:59,563][1653645] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-06-15 16:37:59,566][1653645] Updated weights for policy 0, policy_version 396528 (0.0073) [2024-06-15 16:38:00,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 812122112. Throughput: 0: 10888.6. Samples: 203091968. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:38:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:38:03,161][1653645] Updated weights for policy 0, policy_version 396564 (0.0012) [2024-06-15 16:38:05,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 812253184. Throughput: 0: 11082.0. Samples: 203131904. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:38:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:38:07,125][1653645] Updated weights for policy 0, policy_version 396624 (0.0023) [2024-06-15 16:38:08,796][1653645] Updated weights for policy 0, policy_version 396691 (0.0103) [2024-06-15 16:38:10,543][1653645] Updated weights for policy 0, policy_version 396769 (0.0014) [2024-06-15 16:38:10,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 812613632. Throughput: 0: 11104.7. Samples: 203195904. Policy #0 lag: (min: 14.0, avg: 102.8, max: 270.0) [2024-06-15 16:38:10,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:38:15,344][1653645] Updated weights for policy 0, policy_version 396832 (0.0021) [2024-06-15 16:38:15,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.3, 300 sec: 44431.2). Total num frames: 812744704. Throughput: 0: 10956.8. Samples: 203268096. 
Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:38:18,997][1653645] Updated weights for policy 0, policy_version 396912 (0.0098) [2024-06-15 16:38:20,920][1653645] Updated weights for policy 0, policy_version 396992 (0.0092) [2024-06-15 16:38:20,958][1648982] Fps is (10 sec: 42596.9, 60 sec: 45874.9, 300 sec: 44431.1). Total num frames: 813039616. Throughput: 0: 11195.7. Samples: 203302400. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:20,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:38:22,217][1653645] Updated weights for policy 0, policy_version 397056 (0.0013) [2024-06-15 16:38:25,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 813170688. Throughput: 0: 10945.4. Samples: 203363840. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:38:30,279][1653645] Updated weights for policy 0, policy_version 397125 (0.0015) [2024-06-15 16:38:30,960][1648982] Fps is (10 sec: 32769.0, 60 sec: 44236.7, 300 sec: 43764.7). Total num frames: 813367296. Throughput: 0: 11093.4. Samples: 203433984. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:30,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:38:32,216][1653645] Updated weights for policy 0, policy_version 397200 (0.0013) [2024-06-15 16:38:33,624][1653645] Updated weights for policy 0, policy_version 397276 (0.0076) [2024-06-15 16:38:34,398][1653645] Updated weights for policy 0, policy_version 397311 (0.0011) [2024-06-15 16:38:35,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 813694976. Throughput: 0: 10990.9. Samples: 203457024. 
Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:35,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:38:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 813826048. Throughput: 0: 11195.7. Samples: 203532800. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:40,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:38:42,843][1653645] Updated weights for policy 0, policy_version 397408 (0.0181) [2024-06-15 16:38:43,643][1651596] Signal inference workers to stop experience collection... (20550 times) [2024-06-15 16:38:43,692][1653645] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-06-15 16:38:43,863][1651596] Signal inference workers to resume experience collection... (20550 times) [2024-06-15 16:38:43,864][1653645] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-06-15 16:38:44,919][1653645] Updated weights for policy 0, policy_version 397504 (0.0011) [2024-06-15 16:38:45,958][1648982] Fps is (10 sec: 45873.3, 60 sec: 44782.7, 300 sec: 44320.1). Total num frames: 814153728. Throughput: 0: 10990.9. Samples: 203586560. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:38:46,401][1653645] Updated weights for policy 0, policy_version 397558 (0.0014) [2024-06-15 16:38:50,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 814219264. Throughput: 0: 10808.9. Samples: 203618304. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:38:52,220][1653645] Updated weights for policy 0, policy_version 397616 (0.0013) [2024-06-15 16:38:54,766][1653645] Updated weights for policy 0, policy_version 397669 (0.0020) [2024-06-15 16:38:55,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 45875.0, 300 sec: 44097.9). 
Total num frames: 814514176. Throughput: 0: 11036.4. Samples: 203692544. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:38:55,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:38:56,312][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000397728_814546944.pth... [2024-06-15 16:38:56,496][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000392592_804028416.pth [2024-06-15 16:38:57,363][1653645] Updated weights for policy 0, policy_version 397760 (0.0105) [2024-06-15 16:38:58,603][1653645] Updated weights for policy 0, policy_version 397813 (0.0013) [2024-06-15 16:39:00,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 814743552. Throughput: 0: 10604.1. Samples: 203745280. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:39:00,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:39:04,452][1653645] Updated weights for policy 0, policy_version 397872 (0.0015) [2024-06-15 16:39:05,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 814874624. Throughput: 0: 10809.0. Samples: 203788800. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:39:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:39:07,400][1653645] Updated weights for policy 0, policy_version 397940 (0.0013) [2024-06-15 16:39:08,592][1653645] Updated weights for policy 0, policy_version 397984 (0.0011) [2024-06-15 16:39:10,574][1653645] Updated weights for policy 0, policy_version 398049 (0.0017) [2024-06-15 16:39:10,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 815235072. Throughput: 0: 10683.8. Samples: 203844608. 
Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:39:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:39:15,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 43764.7). Total num frames: 815333376. Throughput: 0: 10774.7. Samples: 203918848. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:39:15,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:39:16,109][1653645] Updated weights for policy 0, policy_version 398113 (0.0019) [2024-06-15 16:39:19,142][1653645] Updated weights for policy 0, policy_version 398160 (0.0013) [2024-06-15 16:39:20,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 42052.5, 300 sec: 43653.6). Total num frames: 815562752. Throughput: 0: 10956.8. Samples: 203950080. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:39:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:39:21,130][1653645] Updated weights for policy 0, policy_version 398240 (0.0012) [2024-06-15 16:39:22,895][1653645] Updated weights for policy 0, policy_version 398307 (0.0013) [2024-06-15 16:39:25,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 815792128. Throughput: 0: 10649.6. Samples: 204012032. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:39:25,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:39:26,859][1653645] Updated weights for policy 0, policy_version 398338 (0.0012) [2024-06-15 16:39:27,626][1651596] Signal inference workers to stop experience collection... (20600 times) [2024-06-15 16:39:27,681][1653645] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-06-15 16:39:27,914][1651596] Signal inference workers to resume experience collection... (20600 times) [2024-06-15 16:39:27,915][1653645] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-06-15 16:39:30,966][1648982] Fps is (10 sec: 36013.8, 60 sec: 42592.3, 300 sec: 43541.3). 
Total num frames: 815923200. Throughput: 0: 11114.1. Samples: 204086784. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:39:30,967][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:39:31,143][1653645] Updated weights for policy 0, policy_version 398416 (0.0092) [2024-06-15 16:39:32,697][1653645] Updated weights for policy 0, policy_version 398496 (0.0026) [2024-06-15 16:39:34,826][1653645] Updated weights for policy 0, policy_version 398580 (0.0013) [2024-06-15 16:39:35,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44209.7). Total num frames: 816316416. Throughput: 0: 10968.2. Samples: 204111872. Policy #0 lag: (min: 15.0, avg: 115.5, max: 271.0) [2024-06-15 16:39:35,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 16:39:39,431][1653645] Updated weights for policy 0, policy_version 398628 (0.0014) [2024-06-15 16:39:40,958][1648982] Fps is (10 sec: 52473.2, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 816447488. Throughput: 0: 10797.5. Samples: 204178432. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:39:40,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:39:42,411][1653645] Updated weights for policy 0, policy_version 398673 (0.0022) [2024-06-15 16:39:44,123][1653645] Updated weights for policy 0, policy_version 398752 (0.0014) [2024-06-15 16:39:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43691.0, 300 sec: 44209.1). Total num frames: 816775168. Throughput: 0: 11059.2. Samples: 204242944. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:39:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:39:46,225][1653645] Updated weights for policy 0, policy_version 398832 (0.0012) [2024-06-15 16:39:50,423][1653645] Updated weights for policy 0, policy_version 398865 (0.0014) [2024-06-15 16:39:50,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 816906240. Throughput: 0: 10888.6. 
Samples: 204278784. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:39:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:39:53,920][1653645] Updated weights for policy 0, policy_version 398913 (0.0013) [2024-06-15 16:39:55,906][1653645] Updated weights for policy 0, policy_version 398992 (0.0014) [2024-06-15 16:39:55,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 817135616. Throughput: 0: 11229.8. Samples: 204349952. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:39:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:39:57,155][1653645] Updated weights for policy 0, policy_version 399040 (0.0019) [2024-06-15 16:39:58,704][1653645] Updated weights for policy 0, policy_version 399093 (0.0012) [2024-06-15 16:40:00,959][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 817364992. Throughput: 0: 10843.0. Samples: 204406784. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:00,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:40:02,897][1653645] Updated weights for policy 0, policy_version 399136 (0.0012) [2024-06-15 16:40:05,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 817496064. Throughput: 0: 10843.0. Samples: 204438016. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:05,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 16:40:06,168][1653645] Updated weights for policy 0, policy_version 399171 (0.0012) [2024-06-15 16:40:08,055][1653645] Updated weights for policy 0, policy_version 399253 (0.0013) [2024-06-15 16:40:09,619][1651596] Signal inference workers to stop experience collection... (20650 times) [2024-06-15 16:40:09,700][1653645] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-06-15 16:40:09,976][1651596] Signal inference workers to resume experience collection... 
(20650 times) [2024-06-15 16:40:09,977][1653645] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-06-15 16:40:10,114][1653645] Updated weights for policy 0, policy_version 399314 (0.0020) [2024-06-15 16:40:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 817889280. Throughput: 0: 10979.5. Samples: 204506112. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:40:13,821][1653645] Updated weights for policy 0, policy_version 399376 (0.0014) [2024-06-15 16:40:15,289][1653645] Updated weights for policy 0, policy_version 399422 (0.0013) [2024-06-15 16:40:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44783.1, 300 sec: 43764.7). Total num frames: 818020352. Throughput: 0: 10833.7. Samples: 204574208. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:40:18,380][1653645] Updated weights for policy 0, policy_version 399482 (0.0012) [2024-06-15 16:40:20,635][1653645] Updated weights for policy 0, policy_version 399549 (0.0012) [2024-06-15 16:40:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 818282496. Throughput: 0: 11150.2. Samples: 204613632. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:20,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:40:22,207][1653645] Updated weights for policy 0, policy_version 399610 (0.0110) [2024-06-15 16:40:25,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 818446336. Throughput: 0: 11070.6. Samples: 204676608. 
Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:40:29,017][1653645] Updated weights for policy 0, policy_version 399681 (0.0013) [2024-06-15 16:40:30,623][1653645] Updated weights for policy 0, policy_version 399745 (0.0013) [2024-06-15 16:40:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 46428.0, 300 sec: 43653.6). Total num frames: 818708480. Throughput: 0: 11173.0. Samples: 204745728. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:30,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:40:32,168][1653645] Updated weights for policy 0, policy_version 399808 (0.0132) [2024-06-15 16:40:34,470][1653645] Updated weights for policy 0, policy_version 399872 (0.0014) [2024-06-15 16:40:35,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 818937856. Throughput: 0: 11002.3. Samples: 204773888. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:40:39,860][1653645] Updated weights for policy 0, policy_version 399934 (0.0012) [2024-06-15 16:40:40,963][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 819068928. Throughput: 0: 10968.2. Samples: 204843520. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:40,964][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:40:42,261][1653645] Updated weights for policy 0, policy_version 399984 (0.0012) [2024-06-15 16:40:43,334][1653645] Updated weights for policy 0, policy_version 400020 (0.0011) [2024-06-15 16:40:45,324][1653645] Updated weights for policy 0, policy_version 400096 (0.0018) [2024-06-15 16:40:45,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44236.8, 300 sec: 44320.2). Total num frames: 819429376. Throughput: 0: 11013.7. Samples: 204902400. 
Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:45,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:40:50,850][1653645] Updated weights for policy 0, policy_version 400145 (0.0012) [2024-06-15 16:40:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 819494912. Throughput: 0: 11127.5. Samples: 204938752. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:40:52,995][1653645] Updated weights for policy 0, policy_version 400208 (0.0012) [2024-06-15 16:40:54,319][1653645] Updated weights for policy 0, policy_version 400250 (0.0032) [2024-06-15 16:40:55,348][1653645] Updated weights for policy 0, policy_version 400308 (0.0125) [2024-06-15 16:40:55,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 45328.9, 300 sec: 43986.9). Total num frames: 819855360. Throughput: 0: 11127.4. Samples: 205006848. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:40:55,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:40:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000400320_819855360.pth... [2024-06-15 16:40:56,036][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000395168_809304064.pth [2024-06-15 16:40:56,480][1651596] Signal inference workers to stop experience collection... (20700 times) [2024-06-15 16:40:56,509][1653645] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-06-15 16:40:56,847][1651596] Signal inference workers to resume experience collection... (20700 times) [2024-06-15 16:40:56,848][1653645] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-06-15 16:40:58,123][1653645] Updated weights for policy 0, policy_version 400380 (0.0029) [2024-06-15 16:41:00,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 819986432. 
Throughput: 0: 11025.0. Samples: 205070336. Policy #0 lag: (min: 15.0, avg: 117.0, max: 271.0) [2024-06-15 16:41:00,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:41:02,797][1653645] Updated weights for policy 0, policy_version 400441 (0.0015) [2024-06-15 16:41:05,693][1653645] Updated weights for policy 0, policy_version 400512 (0.0013) [2024-06-15 16:41:05,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 45875.3, 300 sec: 43542.6). Total num frames: 820248576. Throughput: 0: 11070.6. Samples: 205111808. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:05,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:41:07,077][1653645] Updated weights for policy 0, policy_version 400564 (0.0013) [2024-06-15 16:41:08,863][1653645] Updated weights for policy 0, policy_version 400608 (0.0012) [2024-06-15 16:41:10,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 820510720. Throughput: 0: 11093.3. Samples: 205175808. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:10,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:41:14,509][1653645] Updated weights for policy 0, policy_version 400673 (0.0014) [2024-06-15 16:41:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 820641792. Throughput: 0: 11195.7. Samples: 205249536. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:41:16,370][1653645] Updated weights for policy 0, policy_version 400721 (0.0012) [2024-06-15 16:41:18,437][1653645] Updated weights for policy 0, policy_version 400816 (0.0013) [2024-06-15 16:41:19,833][1653645] Updated weights for policy 0, policy_version 400850 (0.0025) [2024-06-15 16:41:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 821035008. Throughput: 0: 11309.5. Samples: 205282816. 
Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:41:24,877][1653645] Updated weights for policy 0, policy_version 400912 (0.0022) [2024-06-15 16:41:25,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 821133312. Throughput: 0: 11355.1. Samples: 205354496. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:41:25,964][1653645] Updated weights for policy 0, policy_version 400960 (0.0016) [2024-06-15 16:41:28,914][1653645] Updated weights for policy 0, policy_version 401024 (0.0126) [2024-06-15 16:41:30,327][1653645] Updated weights for policy 0, policy_version 401085 (0.0013) [2024-06-15 16:41:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 821428224. Throughput: 0: 11491.5. Samples: 205419520. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:41:35,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 821559296. Throughput: 0: 11343.6. Samples: 205449216. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:35,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 16:41:36,875][1653645] Updated weights for policy 0, policy_version 401184 (0.0014) [2024-06-15 16:41:39,891][1653645] Updated weights for policy 0, policy_version 401234 (0.0013) [2024-06-15 16:41:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 43764.7). Total num frames: 821788672. Throughput: 0: 11537.1. Samples: 205526016. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:41:41,545][1651596] Signal inference workers to stop experience collection... 
(20750 times) [2024-06-15 16:41:41,669][1653645] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-06-15 16:41:41,851][1651596] Signal inference workers to resume experience collection... (20750 times) [2024-06-15 16:41:41,852][1653645] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-06-15 16:41:42,593][1653645] Updated weights for policy 0, policy_version 401336 (0.0096) [2024-06-15 16:41:45,062][1653645] Updated weights for policy 0, policy_version 401399 (0.0019) [2024-06-15 16:41:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 822083584. Throughput: 0: 11275.4. Samples: 205577728. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:45,959][1648982] Avg episode reward: [(0, '37.070')] [2024-06-15 16:41:49,538][1653645] Updated weights for policy 0, policy_version 401467 (0.0014) [2024-06-15 16:41:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 822214656. Throughput: 0: 11207.1. Samples: 205616128. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:50,958][1648982] Avg episode reward: [(0, '36.820')] [2024-06-15 16:41:53,096][1653645] Updated weights for policy 0, policy_version 401521 (0.0028) [2024-06-15 16:41:54,507][1653645] Updated weights for policy 0, policy_version 401586 (0.0015) [2024-06-15 16:41:55,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44237.0, 300 sec: 44098.0). Total num frames: 822509568. Throughput: 0: 11127.5. Samples: 205676544. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:41:55,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:41:56,468][1653645] Updated weights for policy 0, policy_version 401634 (0.0020) [2024-06-15 16:42:00,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 822673408. Throughput: 0: 11013.7. Samples: 205745152. 
Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:42:00,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:42:01,307][1653645] Updated weights for policy 0, policy_version 401719 (0.0015) [2024-06-15 16:42:04,286][1653645] Updated weights for policy 0, policy_version 401760 (0.0012) [2024-06-15 16:42:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 822935552. Throughput: 0: 11138.8. Samples: 205784064. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:42:05,958][1648982] Avg episode reward: [(0, '36.580')] [2024-06-15 16:42:06,089][1653645] Updated weights for policy 0, policy_version 401826 (0.0013) [2024-06-15 16:42:08,032][1653645] Updated weights for policy 0, policy_version 401874 (0.0023) [2024-06-15 16:42:10,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 823132160. Throughput: 0: 10808.8. Samples: 205840896. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:42:10,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:42:11,780][1653645] Updated weights for policy 0, policy_version 401936 (0.0018) [2024-06-15 16:42:15,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 823263232. Throughput: 0: 11013.7. Samples: 205915136. Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:42:15,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 16:42:16,174][1653645] Updated weights for policy 0, policy_version 401986 (0.0015) [2024-06-15 16:42:18,804][1653645] Updated weights for policy 0, policy_version 402103 (0.0111) [2024-06-15 16:42:20,886][1653645] Updated weights for policy 0, policy_version 402168 (0.0012) [2024-06-15 16:42:20,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43144.4, 300 sec: 44320.1). Total num frames: 823623680. Throughput: 0: 10877.1. Samples: 205938688. 
Policy #0 lag: (min: 15.0, avg: 98.6, max: 271.0) [2024-06-15 16:42:20,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:42:24,820][1653645] Updated weights for policy 0, policy_version 402224 (0.0013) [2024-06-15 16:42:25,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 823787520. Throughput: 0: 10592.7. Samples: 206002688. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:42:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:42:29,503][1653645] Updated weights for policy 0, policy_version 402288 (0.0011) [2024-06-15 16:42:29,634][1651596] Signal inference workers to stop experience collection... (20800 times) [2024-06-15 16:42:29,699][1653645] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-06-15 16:42:29,825][1651596] Signal inference workers to resume experience collection... (20800 times) [2024-06-15 16:42:29,826][1653645] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-06-15 16:42:30,790][1653645] Updated weights for policy 0, policy_version 402337 (0.0012) [2024-06-15 16:42:30,958][1648982] Fps is (10 sec: 36045.8, 60 sec: 42598.5, 300 sec: 43764.8). Total num frames: 823984128. Throughput: 0: 11013.7. Samples: 206073344. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:42:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:42:35,723][1653645] Updated weights for policy 0, policy_version 402433 (0.0016) [2024-06-15 16:42:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 824180736. Throughput: 0: 10774.8. Samples: 206100992. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:42:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:42:40,658][1653645] Updated weights for policy 0, policy_version 402498 (0.0021) [2024-06-15 16:42:40,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 42598.5, 300 sec: 43653.7). 
Total num frames: 824344576. Throughput: 0: 10968.2. Samples: 206170112. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:42:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:42:42,232][1653645] Updated weights for policy 0, policy_version 402576 (0.0012) [2024-06-15 16:42:43,838][1653645] Updated weights for policy 0, policy_version 402630 (0.0014) [2024-06-15 16:42:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 824705024. Throughput: 0: 10843.1. Samples: 206233088. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:42:45,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:42:47,984][1653645] Updated weights for policy 0, policy_version 402704 (0.0098) [2024-06-15 16:42:50,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 824836096. Throughput: 0: 10706.5. Samples: 206265856. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:42:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:42:52,231][1653645] Updated weights for policy 0, policy_version 402760 (0.0011) [2024-06-15 16:42:53,655][1653645] Updated weights for policy 0, policy_version 402819 (0.0110) [2024-06-15 16:42:55,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 825131008. Throughput: 0: 10888.5. Samples: 206330880. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:42:55,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:42:56,490][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000402928_825196544.pth... 
[2024-06-15 16:42:56,534][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000397728_814546944.pth [2024-06-15 16:42:56,671][1653645] Updated weights for policy 0, policy_version 402933 (0.0093) [2024-06-15 16:43:00,373][1653645] Updated weights for policy 0, policy_version 402976 (0.0015) [2024-06-15 16:43:00,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 825327616. Throughput: 0: 10763.4. Samples: 206399488. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:43:00,983][1653645] Updated weights for policy 0, policy_version 403008 (0.0012) [2024-06-15 16:43:04,794][1653645] Updated weights for policy 0, policy_version 403059 (0.0014) [2024-06-15 16:43:05,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 825556992. Throughput: 0: 11116.1. Samples: 206438912. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:43:06,337][1653645] Updated weights for policy 0, policy_version 403131 (0.0154) [2024-06-15 16:43:08,415][1653645] Updated weights for policy 0, policy_version 403184 (0.0073) [2024-06-15 16:43:10,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 825753600. Throughput: 0: 11070.5. Samples: 206500864. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:10,959][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 16:43:11,142][1653645] Updated weights for policy 0, policy_version 403217 (0.0011) [2024-06-15 16:43:12,020][1653645] Updated weights for policy 0, policy_version 403261 (0.0011) [2024-06-15 16:43:15,969][1648982] Fps is (10 sec: 32732.0, 60 sec: 43682.4, 300 sec: 43540.9). Total num frames: 825884672. Throughput: 0: 11158.8. Samples: 206575616. 
Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:15,972][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:43:16,685][1651596] Signal inference workers to stop experience collection... (20850 times) [2024-06-15 16:43:16,726][1653645] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-06-15 16:43:16,958][1651596] Signal inference workers to resume experience collection... (20850 times) [2024-06-15 16:43:16,959][1653645] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-06-15 16:43:17,297][1653645] Updated weights for policy 0, policy_version 403328 (0.0014) [2024-06-15 16:43:18,839][1653645] Updated weights for policy 0, policy_version 403392 (0.0114) [2024-06-15 16:43:20,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 826277888. Throughput: 0: 11173.0. Samples: 206603776. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:43:22,662][1653645] Updated weights for policy 0, policy_version 403457 (0.0013) [2024-06-15 16:43:23,989][1653645] Updated weights for policy 0, policy_version 403520 (0.0112) [2024-06-15 16:43:25,958][1648982] Fps is (10 sec: 52486.7, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 826408960. Throughput: 0: 10990.9. Samples: 206664704. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:43:29,453][1653645] Updated weights for policy 0, policy_version 403584 (0.0014) [2024-06-15 16:43:30,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 826638336. Throughput: 0: 11093.3. Samples: 206732288. 
Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:43:31,328][1653645] Updated weights for policy 0, policy_version 403644 (0.0045) [2024-06-15 16:43:32,727][1653645] Updated weights for policy 0, policy_version 403703 (0.0013) [2024-06-15 16:43:35,911][1653645] Updated weights for policy 0, policy_version 403760 (0.0013) [2024-06-15 16:43:35,958][1648982] Fps is (10 sec: 49153.2, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 826900480. Throughput: 0: 11002.3. Samples: 206760960. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:35,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:43:40,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 44236.7, 300 sec: 43542.6). Total num frames: 826998784. Throughput: 0: 11161.6. Samples: 206833152. Policy #0 lag: (min: 7.0, avg: 123.7, max: 263.0) [2024-06-15 16:43:40,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:43:41,057][1653645] Updated weights for policy 0, policy_version 403810 (0.0165) [2024-06-15 16:43:42,666][1653645] Updated weights for policy 0, policy_version 403874 (0.0013) [2024-06-15 16:43:43,583][1653645] Updated weights for policy 0, policy_version 403905 (0.0013) [2024-06-15 16:43:45,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 827326464. Throughput: 0: 10843.0. Samples: 206887424. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:43:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:43:47,307][1653645] Updated weights for policy 0, policy_version 404005 (0.0029) [2024-06-15 16:43:50,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 827457536. Throughput: 0: 10717.9. Samples: 206921216. 
Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:43:50,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:43:53,273][1653645] Updated weights for policy 0, policy_version 404065 (0.0016) [2024-06-15 16:43:55,325][1653645] Updated weights for policy 0, policy_version 404149 (0.0096) [2024-06-15 16:43:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 827752448. Throughput: 0: 10899.9. Samples: 206991360. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:43:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:43:56,456][1651596] Signal inference workers to stop experience collection... (20900 times) [2024-06-15 16:43:56,494][1653645] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-06-15 16:43:56,659][1651596] Signal inference workers to resume experience collection... (20900 times) [2024-06-15 16:43:56,660][1653645] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-06-15 16:43:56,662][1653645] Updated weights for policy 0, policy_version 404208 (0.0013) [2024-06-15 16:43:58,739][1653645] Updated weights for policy 0, policy_version 404256 (0.0119) [2024-06-15 16:44:00,960][1648982] Fps is (10 sec: 52420.2, 60 sec: 44235.4, 300 sec: 44430.9). Total num frames: 827981824. Throughput: 0: 10629.0. Samples: 207053824. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:00,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:44:05,162][1653645] Updated weights for policy 0, policy_version 404336 (0.0113) [2024-06-15 16:44:05,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 42598.6, 300 sec: 43653.6). Total num frames: 828112896. Throughput: 0: 10854.4. Samples: 207092224. 
Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:44:06,635][1653645] Updated weights for policy 0, policy_version 404385 (0.0017) [2024-06-15 16:44:07,883][1653645] Updated weights for policy 0, policy_version 404448 (0.0019) [2024-06-15 16:44:10,242][1653645] Updated weights for policy 0, policy_version 404496 (0.0013) [2024-06-15 16:44:10,958][1648982] Fps is (10 sec: 45883.4, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 828440576. Throughput: 0: 10922.7. Samples: 207156224. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:44:15,954][1653645] Updated weights for policy 0, policy_version 404548 (0.0014) [2024-06-15 16:44:15,973][1648982] Fps is (10 sec: 39261.5, 60 sec: 43687.8, 300 sec: 43873.5). Total num frames: 828506112. Throughput: 0: 11032.7. Samples: 207228928. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:15,974][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:44:17,939][1653645] Updated weights for policy 0, policy_version 404640 (0.0016) [2024-06-15 16:44:19,320][1653645] Updated weights for policy 0, policy_version 404688 (0.0022) [2024-06-15 16:44:20,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 828899328. Throughput: 0: 10968.2. Samples: 207254528. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:44:22,267][1653645] Updated weights for policy 0, policy_version 404768 (0.0016) [2024-06-15 16:44:25,958][1648982] Fps is (10 sec: 52509.0, 60 sec: 43690.8, 300 sec: 44432.5). Total num frames: 829030400. Throughput: 0: 10865.8. Samples: 207322112. 
Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:44:28,297][1653645] Updated weights for policy 0, policy_version 404816 (0.0013) [2024-06-15 16:44:30,118][1653645] Updated weights for policy 0, policy_version 404880 (0.0012) [2024-06-15 16:44:30,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 829259776. Throughput: 0: 11161.6. Samples: 207389696. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:44:31,869][1653645] Updated weights for policy 0, policy_version 404960 (0.0014) [2024-06-15 16:44:34,169][1653645] Updated weights for policy 0, policy_version 405024 (0.0014) [2024-06-15 16:44:35,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 829554688. Throughput: 0: 10991.0. Samples: 207415808. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:44:40,626][1653645] Updated weights for policy 0, policy_version 405057 (0.0015) [2024-06-15 16:44:40,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 829587456. Throughput: 0: 10968.2. Samples: 207484928. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:44:41,705][1653645] Updated weights for policy 0, policy_version 405110 (0.0012) [2024-06-15 16:44:42,461][1653645] Updated weights for policy 0, policy_version 405137 (0.0014) [2024-06-15 16:44:43,285][1651596] Signal inference workers to stop experience collection... (20950 times) [2024-06-15 16:44:43,312][1653645] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-06-15 16:44:43,535][1651596] Signal inference workers to resume experience collection... 
(20950 times) [2024-06-15 16:44:43,535][1653645] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-06-15 16:44:44,151][1653645] Updated weights for policy 0, policy_version 405203 (0.0013) [2024-06-15 16:44:45,319][1653645] Updated weights for policy 0, policy_version 405246 (0.0013) [2024-06-15 16:44:45,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 829980672. Throughput: 0: 10900.4. Samples: 207544320. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:44:46,924][1653645] Updated weights for policy 0, policy_version 405305 (0.0014) [2024-06-15 16:44:50,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 830078976. Throughput: 0: 10740.6. Samples: 207575552. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:44:53,634][1653645] Updated weights for policy 0, policy_version 405344 (0.0013) [2024-06-15 16:44:55,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 830341120. Throughput: 0: 10979.6. Samples: 207650304. Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:44:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:44:56,404][1653645] Updated weights for policy 0, policy_version 405461 (0.0014) [2024-06-15 16:44:56,584][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000405472_830406656.pth... [2024-06-15 16:44:56,731][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000400320_819855360.pth [2024-06-15 16:44:58,596][1653645] Updated weights for policy 0, policy_version 405552 (0.0123) [2024-06-15 16:45:00,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43692.0, 300 sec: 44431.2). Total num frames: 830603264. Throughput: 0: 10562.1. Samples: 207704064. 
Policy #0 lag: (min: 58.0, avg: 193.0, max: 311.0) [2024-06-15 16:45:00,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:45:05,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 830668800. Throughput: 0: 10843.0. Samples: 207742464. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:45:06,273][1653645] Updated weights for policy 0, policy_version 405621 (0.0015) [2024-06-15 16:45:08,426][1653645] Updated weights for policy 0, policy_version 405712 (0.0108) [2024-06-15 16:45:10,272][1653645] Updated weights for policy 0, policy_version 405779 (0.0013) [2024-06-15 16:45:10,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 831094784. Throughput: 0: 10752.0. Samples: 207805952. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:45:15,958][1648982] Fps is (10 sec: 45873.9, 60 sec: 43701.6, 300 sec: 43542.5). Total num frames: 831127552. Throughput: 0: 10774.7. Samples: 207874560. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:45:17,508][1653645] Updated weights for policy 0, policy_version 405843 (0.0014) [2024-06-15 16:45:20,091][1653645] Updated weights for policy 0, policy_version 405941 (0.0089) [2024-06-15 16:45:20,957][1648982] Fps is (10 sec: 32768.6, 60 sec: 42052.4, 300 sec: 43986.9). Total num frames: 831422464. Throughput: 0: 10979.6. Samples: 207909888. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:20,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:45:21,771][1653645] Updated weights for policy 0, policy_version 406003 (0.0033) [2024-06-15 16:45:22,453][1651596] Signal inference workers to stop experience collection... 
(21000 times) [2024-06-15 16:45:22,498][1653645] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-06-15 16:45:22,739][1651596] Signal inference workers to resume experience collection... (21000 times) [2024-06-15 16:45:22,740][1653645] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-06-15 16:45:23,398][1653645] Updated weights for policy 0, policy_version 406078 (0.0087) [2024-06-15 16:45:25,960][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 831651840. Throughput: 0: 10558.6. Samples: 207960064. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:25,960][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 16:45:30,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 41506.1, 300 sec: 43431.5). Total num frames: 831750144. Throughput: 0: 10922.7. Samples: 208035840. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:45:31,280][1653645] Updated weights for policy 0, policy_version 406149 (0.0034) [2024-06-15 16:45:33,610][1653645] Updated weights for policy 0, policy_version 406225 (0.0014) [2024-06-15 16:45:35,331][1653645] Updated weights for policy 0, policy_version 406304 (0.0011) [2024-06-15 16:45:35,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 832143360. Throughput: 0: 10820.3. Samples: 208062464. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:45:40,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 43144.4, 300 sec: 43209.3). Total num frames: 832176128. Throughput: 0: 10660.9. Samples: 208130048. 
Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:45:42,064][1653645] Updated weights for policy 0, policy_version 406340 (0.0021) [2024-06-15 16:45:43,626][1653645] Updated weights for policy 0, policy_version 406402 (0.0013) [2024-06-15 16:45:45,419][1653645] Updated weights for policy 0, policy_version 406472 (0.0013) [2024-06-15 16:45:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42052.3, 300 sec: 44098.0). Total num frames: 832503808. Throughput: 0: 10854.4. Samples: 208192512. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:45:47,289][1653645] Updated weights for policy 0, policy_version 406556 (0.0138) [2024-06-15 16:45:50,958][1648982] Fps is (10 sec: 52430.7, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 832700416. Throughput: 0: 10581.3. Samples: 208218624. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:45:53,891][1653645] Updated weights for policy 0, policy_version 406596 (0.0013) [2024-06-15 16:45:55,934][1653645] Updated weights for policy 0, policy_version 406688 (0.0014) [2024-06-15 16:45:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 43764.8). Total num frames: 832897024. Throughput: 0: 10797.5. Samples: 208291840. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:45:55,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 16:45:57,680][1653645] Updated weights for policy 0, policy_version 406738 (0.0014) [2024-06-15 16:45:58,504][1653645] Updated weights for policy 0, policy_version 406776 (0.0010) [2024-06-15 16:46:00,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 833224704. Throughput: 0: 10513.1. Samples: 208347648. 
Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:46:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:46:05,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 42598.4, 300 sec: 43098.3). Total num frames: 833224704. Throughput: 0: 10604.1. Samples: 208387072. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:46:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:46:06,171][1653645] Updated weights for policy 0, policy_version 406849 (0.0013) [2024-06-15 16:46:08,411][1653645] Updated weights for policy 0, policy_version 406944 (0.0014) [2024-06-15 16:46:08,523][1651596] Signal inference workers to stop experience collection... (21050 times) [2024-06-15 16:46:08,594][1653645] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-06-15 16:46:08,741][1651596] Signal inference workers to resume experience collection... (21050 times) [2024-06-15 16:46:08,764][1653645] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-06-15 16:46:10,752][1653645] Updated weights for policy 0, policy_version 407034 (0.0012) [2024-06-15 16:46:10,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 42052.1, 300 sec: 43986.8). Total num frames: 833617920. Throughput: 0: 10774.7. Samples: 208444928. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:46:10,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:46:12,467][1653645] Updated weights for policy 0, policy_version 407092 (0.0032) [2024-06-15 16:46:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.9, 300 sec: 43098.3). Total num frames: 833748992. Throughput: 0: 10490.3. Samples: 208507904. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:46:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:46:19,381][1653645] Updated weights for policy 0, policy_version 407136 (0.0016) [2024-06-15 16:46:20,958][1648982] Fps is (10 sec: 32768.9, 60 sec: 42052.2, 300 sec: 43431.5). 
Total num frames: 833945600. Throughput: 0: 10820.3. Samples: 208549376. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:46:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:46:21,345][1653645] Updated weights for policy 0, policy_version 407219 (0.0014) [2024-06-15 16:46:22,842][1653645] Updated weights for policy 0, policy_version 407280 (0.0018) [2024-06-15 16:46:23,966][1653645] Updated weights for policy 0, policy_version 407325 (0.0017) [2024-06-15 16:46:24,742][1653645] Updated weights for policy 0, policy_version 407360 (0.0012) [2024-06-15 16:46:25,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 834273280. Throughput: 0: 10570.0. Samples: 208605696. Policy #0 lag: (min: 7.0, avg: 43.4, max: 231.0) [2024-06-15 16:46:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:46:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 43320.4). Total num frames: 834338816. Throughput: 0: 10843.0. Samples: 208680448. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:46:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:46:31,311][1653645] Updated weights for policy 0, policy_version 407418 (0.0012) [2024-06-15 16:46:33,684][1653645] Updated weights for policy 0, policy_version 407489 (0.0013) [2024-06-15 16:46:35,108][1653645] Updated weights for policy 0, policy_version 407548 (0.0011) [2024-06-15 16:46:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 834732032. Throughput: 0: 10854.4. Samples: 208707072. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:46:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:46:36,601][1653645] Updated weights for policy 0, policy_version 407616 (0.0089) [2024-06-15 16:46:40,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43690.8, 300 sec: 43098.2). Total num frames: 834797568. Throughput: 0: 10592.7. Samples: 208768512. 
Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:46:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:46:43,558][1653645] Updated weights for policy 0, policy_version 407680 (0.0046) [2024-06-15 16:46:45,088][1653645] Updated weights for policy 0, policy_version 407744 (0.0014) [2024-06-15 16:46:45,959][1648982] Fps is (10 sec: 39318.3, 60 sec: 43690.0, 300 sec: 43764.6). Total num frames: 835125248. Throughput: 0: 10945.2. Samples: 208840192. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:46:45,964][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:46:46,672][1653645] Updated weights for policy 0, policy_version 407808 (0.0012) [2024-06-15 16:46:48,406][1653645] Updated weights for policy 0, policy_version 407861 (0.0014) [2024-06-15 16:46:50,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 835321856. Throughput: 0: 10649.6. Samples: 208866304. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:46:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:46:54,017][1651596] Signal inference workers to stop experience collection... (21100 times) [2024-06-15 16:46:54,067][1653645] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-06-15 16:46:54,248][1651596] Signal inference workers to resume experience collection... (21100 times) [2024-06-15 16:46:54,249][1653645] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-06-15 16:46:54,410][1653645] Updated weights for policy 0, policy_version 407890 (0.0013) [2024-06-15 16:46:55,958][1648982] Fps is (10 sec: 36047.0, 60 sec: 43144.4, 300 sec: 43431.5). Total num frames: 835485696. Throughput: 0: 11207.1. Samples: 208949248. 
Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:46:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:46:56,205][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000407968_835518464.pth... [2024-06-15 16:46:56,206][1653645] Updated weights for policy 0, policy_version 407968 (0.0025) [2024-06-15 16:46:56,372][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000402928_825196544.pth [2024-06-15 16:46:58,137][1653645] Updated weights for policy 0, policy_version 408032 (0.0130) [2024-06-15 16:46:59,764][1653645] Updated weights for policy 0, policy_version 408098 (0.0012) [2024-06-15 16:47:00,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 835846144. Throughput: 0: 10922.7. Samples: 208999424. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:00,958][1648982] Avg episode reward: [(0, '37.060')] [2024-06-15 16:47:05,966][1648982] Fps is (10 sec: 36014.7, 60 sec: 43684.4, 300 sec: 43097.0). Total num frames: 835846144. Throughput: 0: 10852.3. Samples: 209037824. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:05,967][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:47:06,754][1653645] Updated weights for policy 0, policy_version 408161 (0.0015) [2024-06-15 16:47:08,198][1653645] Updated weights for policy 0, policy_version 408208 (0.0016) [2024-06-15 16:47:09,403][1653645] Updated weights for policy 0, policy_version 408256 (0.0014) [2024-06-15 16:47:10,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 43144.6, 300 sec: 43875.8). Total num frames: 836206592. Throughput: 0: 11127.4. Samples: 209106432. 
Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:10,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:47:11,215][1653645] Updated weights for policy 0, policy_version 408322 (0.0018) [2024-06-15 16:47:12,726][1653645] Updated weights for policy 0, policy_version 408384 (0.0012) [2024-06-15 16:47:15,958][1648982] Fps is (10 sec: 52474.0, 60 sec: 43690.6, 300 sec: 43209.4). Total num frames: 836370432. Throughput: 0: 10820.3. Samples: 209167360. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:47:19,475][1653645] Updated weights for policy 0, policy_version 408440 (0.0120) [2024-06-15 16:47:20,891][1653645] Updated weights for policy 0, policy_version 408480 (0.0012) [2024-06-15 16:47:20,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.4, 300 sec: 43320.4). Total num frames: 836567040. Throughput: 0: 11104.6. Samples: 209206784. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:20,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 16:47:23,013][1653645] Updated weights for policy 0, policy_version 408560 (0.0151) [2024-06-15 16:47:24,581][1653645] Updated weights for policy 0, policy_version 408629 (0.0011) [2024-06-15 16:47:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 836894720. Throughput: 0: 10865.8. Samples: 209257472. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:47:30,958][1648982] Fps is (10 sec: 36045.9, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 836927488. Throughput: 0: 11036.6. Samples: 209336832. 
Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:47:31,060][1653645] Updated weights for policy 0, policy_version 408672 (0.0013) [2024-06-15 16:47:32,527][1653645] Updated weights for policy 0, policy_version 408722 (0.0013) [2024-06-15 16:47:33,788][1651596] Signal inference workers to stop experience collection... (21150 times) [2024-06-15 16:47:33,840][1653645] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-06-15 16:47:34,027][1651596] Signal inference workers to resume experience collection... (21150 times) [2024-06-15 16:47:34,029][1653645] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-06-15 16:47:34,870][1653645] Updated weights for policy 0, policy_version 408816 (0.0014) [2024-06-15 16:47:35,959][1648982] Fps is (10 sec: 42594.9, 60 sec: 43144.0, 300 sec: 43986.8). Total num frames: 837320704. Throughput: 0: 11093.2. Samples: 209365504. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:35,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:47:36,581][1653645] Updated weights for policy 0, policy_version 408880 (0.0012) [2024-06-15 16:47:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.8, 300 sec: 43098.2). Total num frames: 837419008. Throughput: 0: 10570.0. Samples: 209424896. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:40,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:47:43,586][1653645] Updated weights for policy 0, policy_version 408944 (0.0011) [2024-06-15 16:47:45,456][1653645] Updated weights for policy 0, policy_version 408993 (0.0026) [2024-06-15 16:47:45,958][1648982] Fps is (10 sec: 32769.5, 60 sec: 42052.6, 300 sec: 43431.4). Total num frames: 837648384. Throughput: 0: 10990.8. Samples: 209494016. 
Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:45,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:47:47,634][1653645] Updated weights for policy 0, policy_version 409090 (0.0012) [2024-06-15 16:47:48,585][1653645] Updated weights for policy 0, policy_version 409146 (0.0095) [2024-06-15 16:47:50,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.8, 300 sec: 43431.5). Total num frames: 837943296. Throughput: 0: 10663.0. Samples: 209517568. Policy #0 lag: (min: 10.0, avg: 71.9, max: 266.0) [2024-06-15 16:47:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:47:55,637][1653645] Updated weights for policy 0, policy_version 409190 (0.0012) [2024-06-15 16:47:55,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 43098.2). Total num frames: 838041600. Throughput: 0: 10922.7. Samples: 209597952. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:47:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:47:56,452][1653645] Updated weights for policy 0, policy_version 409220 (0.0012) [2024-06-15 16:47:58,564][1653645] Updated weights for policy 0, policy_version 409312 (0.0116) [2024-06-15 16:48:00,518][1653645] Updated weights for policy 0, policy_version 409401 (0.0014) [2024-06-15 16:48:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43764.8). Total num frames: 838467584. Throughput: 0: 10638.2. Samples: 209646080. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:00,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 16:48:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43696.8, 300 sec: 43098.3). Total num frames: 838467584. Throughput: 0: 10695.1. Samples: 209688064. 
Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:05,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:48:07,597][1653645] Updated weights for policy 0, policy_version 409466 (0.0014) [2024-06-15 16:48:09,528][1653645] Updated weights for policy 0, policy_version 409520 (0.0132) [2024-06-15 16:48:10,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43690.7, 300 sec: 43877.5). Total num frames: 838828032. Throughput: 0: 11138.8. Samples: 209758720. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:48:11,429][1653645] Updated weights for policy 0, policy_version 409601 (0.0013) [2024-06-15 16:48:12,963][1653645] Updated weights for policy 0, policy_version 409661 (0.0012) [2024-06-15 16:48:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 838991872. Throughput: 0: 10729.2. Samples: 209819648. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:15,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:48:18,652][1651596] Signal inference workers to stop experience collection... (21200 times) [2024-06-15 16:48:18,718][1653645] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-06-15 16:48:18,934][1651596] Signal inference workers to resume experience collection... (21200 times) [2024-06-15 16:48:18,936][1653645] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-06-15 16:48:20,452][1653645] Updated weights for policy 0, policy_version 409732 (0.0015) [2024-06-15 16:48:20,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 839188480. Throughput: 0: 11002.4. Samples: 209860608. 
Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:20,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:48:22,689][1653645] Updated weights for policy 0, policy_version 409825 (0.0025) [2024-06-15 16:48:24,341][1653645] Updated weights for policy 0, policy_version 409893 (0.0015) [2024-06-15 16:48:25,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 839516160. Throughput: 0: 10717.9. Samples: 209907200. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:48:30,958][1648982] Fps is (10 sec: 32768.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 839516160. Throughput: 0: 11059.3. Samples: 209991680. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:30,961][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 16:48:31,415][1653645] Updated weights for policy 0, policy_version 409936 (0.0013) [2024-06-15 16:48:33,263][1653645] Updated weights for policy 0, policy_version 410002 (0.0031) [2024-06-15 16:48:35,281][1653645] Updated weights for policy 0, policy_version 410096 (0.0015) [2024-06-15 16:48:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43145.1, 300 sec: 43764.7). Total num frames: 839909376. Throughput: 0: 11116.1. Samples: 210017792. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:35,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 16:48:37,287][1653645] Updated weights for policy 0, policy_version 410166 (0.0014) [2024-06-15 16:48:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 840040448. Throughput: 0: 10638.3. Samples: 210076672. 
Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:48:44,739][1653645] Updated weights for policy 0, policy_version 410224 (0.0023) [2024-06-15 16:48:45,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 42598.7, 300 sec: 43209.4). Total num frames: 840204288. Throughput: 0: 11081.9. Samples: 210144768. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:45,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:48:47,104][1653645] Updated weights for policy 0, policy_version 410320 (0.0121) [2024-06-15 16:48:48,927][1653645] Updated weights for policy 0, policy_version 410388 (0.0094) [2024-06-15 16:48:50,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 43431.5). Total num frames: 840564736. Throughput: 0: 10604.1. Samples: 210165248. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:50,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:48:55,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 42052.3, 300 sec: 42654.2). Total num frames: 840564736. Throughput: 0: 10638.2. Samples: 210237440. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:48:55,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:48:55,988][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000410432_840564736.pth... [2024-06-15 16:48:56,039][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000405472_830406656.pth [2024-06-15 16:48:57,166][1653645] Updated weights for policy 0, policy_version 410464 (0.0014) [2024-06-15 16:48:58,723][1653645] Updated weights for policy 0, policy_version 410513 (0.0032) [2024-06-15 16:48:59,382][1651596] Signal inference workers to stop experience collection... 
(21250 times) [2024-06-15 16:48:59,433][1653645] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-06-15 16:48:59,691][1651596] Signal inference workers to resume experience collection... (21250 times) [2024-06-15 16:48:59,692][1653645] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-06-15 16:49:00,324][1653645] Updated weights for policy 0, policy_version 410592 (0.0011) [2024-06-15 16:49:00,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 40960.0, 300 sec: 43431.5). Total num frames: 840925184. Throughput: 0: 10535.9. Samples: 210293760. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:49:00,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 16:49:02,517][1653645] Updated weights for policy 0, policy_version 410685 (0.0153) [2024-06-15 16:49:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.8, 300 sec: 42876.1). Total num frames: 841089024. Throughput: 0: 10194.6. Samples: 210319360. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:49:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:49:10,249][1653645] Updated weights for policy 0, policy_version 410736 (0.0016) [2024-06-15 16:49:10,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 39867.7, 300 sec: 43100.5). Total num frames: 841220096. Throughput: 0: 10843.0. Samples: 210395136. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:49:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:49:11,698][1653645] Updated weights for policy 0, policy_version 410787 (0.0013) [2024-06-15 16:49:13,560][1653645] Updated weights for policy 0, policy_version 410872 (0.0073) [2024-06-15 16:49:14,637][1653645] Updated weights for policy 0, policy_version 410928 (0.0013) [2024-06-15 16:49:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 841613312. Throughput: 0: 10217.2. Samples: 210451456. 
Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:49:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:49:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 40414.1, 300 sec: 42653.9). Total num frames: 841613312. Throughput: 0: 10444.8. Samples: 210487808. Policy #0 lag: (min: 15.0, avg: 78.7, max: 271.0) [2024-06-15 16:49:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:49:21,279][1653645] Updated weights for policy 0, policy_version 410964 (0.0014) [2024-06-15 16:49:22,882][1653645] Updated weights for policy 0, policy_version 411029 (0.0013) [2024-06-15 16:49:24,292][1653645] Updated weights for policy 0, policy_version 411095 (0.0012) [2024-06-15 16:49:25,770][1653645] Updated weights for policy 0, policy_version 411168 (0.0012) [2024-06-15 16:49:25,989][1648982] Fps is (10 sec: 45731.6, 60 sec: 42576.2, 300 sec: 43426.9). Total num frames: 842072064. Throughput: 0: 10619.4. Samples: 210554880. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:49:25,993][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 16:49:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 842137600. Throughput: 0: 10638.2. Samples: 210623488. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:49:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:49:33,725][1653645] Updated weights for policy 0, policy_version 411252 (0.0013) [2024-06-15 16:49:35,147][1653645] Updated weights for policy 0, policy_version 411328 (0.0109) [2024-06-15 16:49:35,958][1648982] Fps is (10 sec: 39445.4, 60 sec: 42598.4, 300 sec: 43653.7). Total num frames: 842465280. Throughput: 0: 10956.8. Samples: 210658304. 
Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:49:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:49:36,406][1653645] Updated weights for policy 0, policy_version 411389 (0.0013) [2024-06-15 16:49:37,345][1651596] Signal inference workers to stop experience collection... (21300 times) [2024-06-15 16:49:37,383][1653645] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-06-15 16:49:37,575][1651596] Signal inference workers to resume experience collection... (21300 times) [2024-06-15 16:49:37,576][1653645] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-06-15 16:49:38,338][1653645] Updated weights for policy 0, policy_version 411456 (0.0015) [2024-06-15 16:49:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 842661888. Throughput: 0: 10581.4. Samples: 210713600. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:49:40,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 16:49:45,302][1653645] Updated weights for policy 0, policy_version 411517 (0.0122) [2024-06-15 16:49:45,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43144.4, 300 sec: 43098.2). Total num frames: 842792960. Throughput: 0: 11013.7. Samples: 210789376. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:49:45,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:49:47,146][1653645] Updated weights for policy 0, policy_version 411570 (0.0127) [2024-06-15 16:49:48,749][1653645] Updated weights for policy 0, policy_version 411632 (0.0012) [2024-06-15 16:49:50,383][1653645] Updated weights for policy 0, policy_version 411709 (0.0013) [2024-06-15 16:49:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 843186176. Throughput: 0: 10979.6. Samples: 210813440. 
Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:49:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:49:55,960][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 843186176. Throughput: 0: 10934.0. Samples: 210887168. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:49:55,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:49:57,512][1653645] Updated weights for policy 0, policy_version 411774 (0.0012) [2024-06-15 16:49:58,825][1653645] Updated weights for policy 0, policy_version 411824 (0.0012) [2024-06-15 16:49:59,815][1653645] Updated weights for policy 0, policy_version 411859 (0.0014) [2024-06-15 16:50:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 843579392. Throughput: 0: 11059.2. Samples: 210949120. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:50:01,034][1653645] Updated weights for policy 0, policy_version 411921 (0.0011) [2024-06-15 16:50:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 843710464. Throughput: 0: 10899.9. Samples: 210978304. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:50:08,288][1653645] Updated weights for policy 0, policy_version 411986 (0.0014) [2024-06-15 16:50:10,209][1653645] Updated weights for policy 0, policy_version 412052 (0.0013) [2024-06-15 16:50:10,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 45329.0, 300 sec: 43431.5). Total num frames: 843939840. Throughput: 0: 11055.5. Samples: 211052032. 
Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:10,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 16:50:11,504][1653645] Updated weights for policy 0, policy_version 412101 (0.0011) [2024-06-15 16:50:12,942][1653645] Updated weights for policy 0, policy_version 412160 (0.0044) [2024-06-15 16:50:15,957][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.8, 300 sec: 43431.5). Total num frames: 844234752. Throughput: 0: 10808.9. Samples: 211109888. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:50:19,499][1653645] Updated weights for policy 0, policy_version 412225 (0.0043) [2024-06-15 16:50:20,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45875.0, 300 sec: 43098.2). Total num frames: 844365824. Throughput: 0: 11047.8. Samples: 211155456. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:20,961][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:50:22,029][1653645] Updated weights for policy 0, policy_version 412320 (0.0013) [2024-06-15 16:50:22,493][1651596] Signal inference workers to stop experience collection... (21350 times) [2024-06-15 16:50:22,570][1653645] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-06-15 16:50:22,680][1651596] Signal inference workers to resume experience collection... (21350 times) [2024-06-15 16:50:22,681][1653645] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-06-15 16:50:23,499][1653645] Updated weights for policy 0, policy_version 412384 (0.0012) [2024-06-15 16:50:25,047][1653645] Updated weights for policy 0, policy_version 412440 (0.0015) [2024-06-15 16:50:25,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44806.4, 300 sec: 44098.0). Total num frames: 844759040. Throughput: 0: 11082.0. Samples: 211212288. 
Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:50:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.5, 300 sec: 42765.0). Total num frames: 844759040. Throughput: 0: 11184.3. Samples: 211292672. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:30,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:50:31,367][1653645] Updated weights for policy 0, policy_version 412496 (0.0016) [2024-06-15 16:50:32,872][1653645] Updated weights for policy 0, policy_version 412565 (0.0017) [2024-06-15 16:50:34,727][1653645] Updated weights for policy 0, policy_version 412640 (0.0013) [2024-06-15 16:50:35,941][1653645] Updated weights for policy 0, policy_version 412678 (0.0030) [2024-06-15 16:50:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 845152256. Throughput: 0: 11389.2. Samples: 211325952. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:50:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.5, 300 sec: 43320.4). Total num frames: 845283328. Throughput: 0: 11184.3. Samples: 211390464. Policy #0 lag: (min: 15.0, avg: 73.5, max: 271.0) [2024-06-15 16:50:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:50:42,316][1653645] Updated weights for policy 0, policy_version 412741 (0.0014) [2024-06-15 16:50:43,441][1653645] Updated weights for policy 0, policy_version 412800 (0.0014) [2024-06-15 16:50:45,234][1653645] Updated weights for policy 0, policy_version 412854 (0.0156) [2024-06-15 16:50:45,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 46421.4, 300 sec: 43653.6). Total num frames: 845578240. Throughput: 0: 11377.8. Samples: 211461120. 
Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:50:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:50:46,810][1653645] Updated weights for policy 0, policy_version 412927 (0.0013) [2024-06-15 16:50:49,154][1653645] Updated weights for policy 0, policy_version 412992 (0.0039) [2024-06-15 16:50:50,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 845807616. Throughput: 0: 11355.0. Samples: 211489280. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:50:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 16:50:55,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 45329.0, 300 sec: 42987.2). Total num frames: 845905920. Throughput: 0: 11411.9. Samples: 211565568. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:50:55,959][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 16:50:56,208][1653645] Updated weights for policy 0, policy_version 413057 (0.0128) [2024-06-15 16:50:56,389][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000413072_845971456.pth... [2024-06-15 16:50:56,562][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000407968_835518464.pth [2024-06-15 16:50:56,570][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000413072_845971456.pth [2024-06-15 16:50:58,562][1653645] Updated weights for policy 0, policy_version 413152 (0.0014) [2024-06-15 16:51:00,250][1653645] Updated weights for policy 0, policy_version 413204 (0.0018) [2024-06-15 16:51:00,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 846299136. Throughput: 0: 11332.2. Samples: 211619840. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:51:05,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 43690.7, 300 sec: 43098.3). 
Total num frames: 846331904. Throughput: 0: 11218.6. Samples: 211660288. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:51:06,567][1651596] Signal inference workers to stop experience collection... (21400 times) [2024-06-15 16:51:06,643][1653645] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-06-15 16:51:06,775][1651596] Signal inference workers to resume experience collection... (21400 times) [2024-06-15 16:51:06,776][1653645] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-06-15 16:51:06,778][1653645] Updated weights for policy 0, policy_version 413280 (0.0104) [2024-06-15 16:51:08,160][1653645] Updated weights for policy 0, policy_version 413330 (0.0013) [2024-06-15 16:51:09,793][1653645] Updated weights for policy 0, policy_version 413381 (0.0032) [2024-06-15 16:51:10,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 45875.1, 300 sec: 43875.8). Total num frames: 846692352. Throughput: 0: 11354.9. Samples: 211723264. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:10,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:51:11,825][1653645] Updated weights for policy 0, policy_version 413441 (0.0012) [2024-06-15 16:51:15,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.5, 300 sec: 43764.7). Total num frames: 846856192. Throughput: 0: 10968.2. Samples: 211786240. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:51:19,526][1653645] Updated weights for policy 0, policy_version 413526 (0.0241) [2024-06-15 16:51:20,958][1648982] Fps is (10 sec: 32769.3, 60 sec: 44237.0, 300 sec: 43209.4). Total num frames: 847020032. Throughput: 0: 11047.8. Samples: 211823104. 
Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:20,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 16:51:21,498][1653645] Updated weights for policy 0, policy_version 413616 (0.0013) [2024-06-15 16:51:22,828][1653645] Updated weights for policy 0, policy_version 413664 (0.0013) [2024-06-15 16:51:24,635][1653645] Updated weights for policy 0, policy_version 413713 (0.0013) [2024-06-15 16:51:25,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 847380480. Throughput: 0: 10922.7. Samples: 211881984. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:51:30,644][1653645] Updated weights for policy 0, policy_version 413761 (0.0014) [2024-06-15 16:51:30,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.8, 300 sec: 42876.1). Total num frames: 847380480. Throughput: 0: 10911.3. Samples: 211952128. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:30,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:51:32,342][1653645] Updated weights for policy 0, policy_version 413826 (0.0014) [2024-06-15 16:51:33,633][1653645] Updated weights for policy 0, policy_version 413888 (0.0013) [2024-06-15 16:51:35,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 847740928. Throughput: 0: 10797.5. Samples: 211975168. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:35,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 16:51:36,016][1653645] Updated weights for policy 0, policy_version 413951 (0.0063) [2024-06-15 16:51:40,960][1648982] Fps is (10 sec: 52417.7, 60 sec: 43689.2, 300 sec: 43320.2). Total num frames: 847904768. Throughput: 0: 10535.4. Samples: 212039680. 
Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:40,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:51:42,757][1653645] Updated weights for policy 0, policy_version 414018 (0.0012) [2024-06-15 16:51:44,254][1653645] Updated weights for policy 0, policy_version 414081 (0.0138) [2024-06-15 16:51:45,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 848166912. Throughput: 0: 10899.9. Samples: 212110336. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:51:46,592][1653645] Updated weights for policy 0, policy_version 414160 (0.0019) [2024-06-15 16:51:48,905][1653645] Updated weights for policy 0, policy_version 414214 (0.0026) [2024-06-15 16:51:49,715][1651596] Signal inference workers to stop experience collection... (21450 times) [2024-06-15 16:51:49,765][1653645] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-06-15 16:51:49,918][1651596] Signal inference workers to resume experience collection... (21450 times) [2024-06-15 16:51:49,919][1653645] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-06-15 16:51:50,101][1653645] Updated weights for policy 0, policy_version 414267 (0.0014) [2024-06-15 16:51:50,958][1648982] Fps is (10 sec: 52440.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 848429056. Throughput: 0: 10729.2. Samples: 212143104. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:51:55,861][1653645] Updated weights for policy 0, policy_version 414325 (0.0012) [2024-06-15 16:51:55,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43690.8, 300 sec: 42987.2). Total num frames: 848527360. Throughput: 0: 10900.0. Samples: 212213760. 
Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:51:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:51:57,873][1653645] Updated weights for policy 0, policy_version 414401 (0.0012) [2024-06-15 16:51:59,321][1653645] Updated weights for policy 0, policy_version 414461 (0.0011) [2024-06-15 16:52:00,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 44099.2). Total num frames: 848855040. Throughput: 0: 10820.2. Samples: 212273152. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:52:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:52:01,438][1653645] Updated weights for policy 0, policy_version 414512 (0.0013) [2024-06-15 16:52:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 848953344. Throughput: 0: 10740.6. Samples: 212306432. Policy #0 lag: (min: 0.0, avg: 56.7, max: 256.0) [2024-06-15 16:52:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:52:07,040][1653645] Updated weights for policy 0, policy_version 414563 (0.0013) [2024-06-15 16:52:08,688][1653645] Updated weights for policy 0, policy_version 414640 (0.0012) [2024-06-15 16:52:10,084][1653645] Updated weights for policy 0, policy_version 414679 (0.0012) [2024-06-15 16:52:10,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 849346560. Throughput: 0: 11036.4. Samples: 212378624. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:10,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:52:12,201][1653645] Updated weights for policy 0, policy_version 414729 (0.0013) [2024-06-15 16:52:13,489][1653645] Updated weights for policy 0, policy_version 414780 (0.0013) [2024-06-15 16:52:15,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 43690.4, 300 sec: 43764.7). Total num frames: 849477632. Throughput: 0: 10922.6. Samples: 212443648. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:15,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:52:20,081][1653645] Updated weights for policy 0, policy_version 414864 (0.0014) [2024-06-15 16:52:20,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 44236.7, 300 sec: 43320.4). Total num frames: 849674240. Throughput: 0: 11343.6. Samples: 212485632. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:52:21,882][1653645] Updated weights for policy 0, policy_version 414932 (0.0130) [2024-06-15 16:52:24,622][1653645] Updated weights for policy 0, policy_version 414980 (0.0013) [2024-06-15 16:52:25,958][1648982] Fps is (10 sec: 49153.0, 60 sec: 43144.4, 300 sec: 44209.0). Total num frames: 849969152. Throughput: 0: 11196.2. Samples: 212543488. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:25,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:52:26,067][1653645] Updated weights for policy 0, policy_version 415034 (0.0013) [2024-06-15 16:52:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 43209.4). Total num frames: 850067456. Throughput: 0: 11161.6. Samples: 212612608. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:52:31,222][1653645] Updated weights for policy 0, policy_version 415089 (0.0012) [2024-06-15 16:52:33,556][1651596] Signal inference workers to stop experience collection... (21500 times) [2024-06-15 16:52:33,785][1653645] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-06-15 16:52:33,798][1653645] Updated weights for policy 0, policy_version 415194 (0.0021) [2024-06-15 16:52:33,854][1651596] Signal inference workers to resume experience collection... 
(21500 times) [2024-06-15 16:52:33,855][1653645] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-06-15 16:52:35,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 850395136. Throughput: 0: 10934.1. Samples: 212635136. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:52:37,755][1653645] Updated weights for policy 0, policy_version 415248 (0.0012) [2024-06-15 16:52:40,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 43692.2, 300 sec: 43653.7). Total num frames: 850526208. Throughput: 0: 10888.5. Samples: 212703744. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:40,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:52:41,945][1653645] Updated weights for policy 0, policy_version 415319 (0.0097) [2024-06-15 16:52:43,713][1653645] Updated weights for policy 0, policy_version 415393 (0.0013) [2024-06-15 16:52:45,657][1653645] Updated weights for policy 0, policy_version 415456 (0.0058) [2024-06-15 16:52:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 850853888. Throughput: 0: 11104.8. Samples: 212772864. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:52:49,043][1653645] Updated weights for policy 0, policy_version 415504 (0.0017) [2024-06-15 16:52:50,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 851050496. Throughput: 0: 11082.0. Samples: 212805120. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:52:53,473][1653645] Updated weights for policy 0, policy_version 415568 (0.0012) [2024-06-15 16:52:54,878][1653645] Updated weights for policy 0, policy_version 415632 (0.0013) [2024-06-15 16:52:55,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45875.2, 300 sec: 43431.5). Total num frames: 851279872. Throughput: 0: 11161.6. Samples: 212880896. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:52:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:52:56,182][1653645] Updated weights for policy 0, policy_version 415677 (0.0032) [2024-06-15 16:52:56,214][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000415680_851312640.pth... [2024-06-15 16:52:56,263][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000410432_840564736.pth [2024-06-15 16:52:58,048][1653645] Updated weights for policy 0, policy_version 415731 (0.0015) [2024-06-15 16:53:00,074][1653645] Updated weights for policy 0, policy_version 415776 (0.0096) [2024-06-15 16:53:00,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45329.2, 300 sec: 44431.2). Total num frames: 851574784. Throughput: 0: 11059.3. Samples: 212941312. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:53:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:53:05,237][1653645] Updated weights for policy 0, policy_version 415824 (0.0012) [2024-06-15 16:53:05,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 43542.6). Total num frames: 851673088. Throughput: 0: 11059.2. Samples: 212983296. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:53:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:53:07,200][1653645] Updated weights for policy 0, policy_version 415904 (0.0012) [2024-06-15 16:53:09,862][1653645] Updated weights for policy 0, policy_version 415968 (0.0013) [2024-06-15 16:53:10,520][1653645] Updated weights for policy 0, policy_version 416000 (0.0012) [2024-06-15 16:53:10,963][1648982] Fps is (10 sec: 39300.1, 60 sec: 43686.7, 300 sec: 43986.1). Total num frames: 851968000. Throughput: 0: 11103.4. Samples: 213043200. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:53:10,966][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:53:12,053][1653645] Updated weights for policy 0, policy_version 416055 (0.0013) [2024-06-15 16:53:15,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43691.0, 300 sec: 43764.8). Total num frames: 852099072. Throughput: 0: 11320.9. Samples: 213122048. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:53:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:53:17,663][1653645] Updated weights for policy 0, policy_version 416115 (0.0131) [2024-06-15 16:53:18,806][1651596] Signal inference workers to stop experience collection... (21550 times) [2024-06-15 16:53:18,854][1653645] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-06-15 16:53:19,078][1651596] Signal inference workers to resume experience collection... (21550 times) [2024-06-15 16:53:19,080][1653645] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-06-15 16:53:19,257][1653645] Updated weights for policy 0, policy_version 416183 (0.0012) [2024-06-15 16:53:20,958][1648982] Fps is (10 sec: 42621.5, 60 sec: 45329.0, 300 sec: 43653.6). Total num frames: 852393984. Throughput: 0: 11457.4. Samples: 213150720. 
Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:53:20,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:53:21,143][1653645] Updated weights for policy 0, policy_version 416224 (0.0014) [2024-06-15 16:53:22,124][1653645] Updated weights for policy 0, policy_version 416257 (0.0013) [2024-06-15 16:53:23,314][1653645] Updated weights for policy 0, policy_version 416311 (0.0016) [2024-06-15 16:53:25,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 852623360. Throughput: 0: 11377.8. Samples: 213215744. Policy #0 lag: (min: 1.0, avg: 65.6, max: 257.0) [2024-06-15 16:53:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:53:28,818][1653645] Updated weights for policy 0, policy_version 416368 (0.0021) [2024-06-15 16:53:30,214][1653645] Updated weights for policy 0, policy_version 416419 (0.0010) [2024-06-15 16:53:30,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 46967.4, 300 sec: 43986.9). Total num frames: 852885504. Throughput: 0: 11400.5. Samples: 213285888. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:53:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:53:32,504][1653645] Updated weights for policy 0, policy_version 416471 (0.0013) [2024-06-15 16:53:34,397][1653645] Updated weights for policy 0, policy_version 416547 (0.0013) [2024-06-15 16:53:34,928][1653645] Updated weights for policy 0, policy_version 416573 (0.0012) [2024-06-15 16:53:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 853147648. Throughput: 0: 11377.8. Samples: 213317120. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:53:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:53:40,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 45329.2, 300 sec: 44209.0). Total num frames: 853245952. Throughput: 0: 11411.9. Samples: 213394432. 
Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:53:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:53:41,564][1653645] Updated weights for policy 0, policy_version 416656 (0.0013) [2024-06-15 16:53:44,584][1653645] Updated weights for policy 0, policy_version 416720 (0.0012) [2024-06-15 16:53:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44782.8, 300 sec: 43986.9). Total num frames: 853540864. Throughput: 0: 11309.5. Samples: 213450240. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:53:45,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 16:53:46,937][1653645] Updated weights for policy 0, policy_version 416816 (0.0012) [2024-06-15 16:53:50,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 853671936. Throughput: 0: 11082.0. Samples: 213481984. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:53:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 16:53:53,249][1653645] Updated weights for policy 0, policy_version 416896 (0.0014) [2024-06-15 16:53:54,461][1653645] Updated weights for policy 0, policy_version 416960 (0.0014) [2024-06-15 16:53:55,963][1648982] Fps is (10 sec: 39300.0, 60 sec: 44232.7, 300 sec: 44097.1). Total num frames: 853934080. Throughput: 0: 11138.8. Samples: 213544448. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:53:55,966][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 16:53:58,403][1653645] Updated weights for policy 0, policy_version 417027 (0.0013) [2024-06-15 16:53:59,795][1653645] Updated weights for policy 0, policy_version 417084 (0.0011) [2024-06-15 16:54:00,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 854196224. Throughput: 0: 10865.8. Samples: 213611008. 
Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:54:03,747][1651596] Signal inference workers to stop experience collection... (21600 times) [2024-06-15 16:54:03,803][1653645] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-06-15 16:54:04,031][1651596] Signal inference workers to resume experience collection... (21600 times) [2024-06-15 16:54:04,051][1653645] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-06-15 16:54:04,795][1653645] Updated weights for policy 0, policy_version 417136 (0.0023) [2024-06-15 16:54:05,958][1648982] Fps is (10 sec: 45901.0, 60 sec: 45329.1, 300 sec: 44653.3). Total num frames: 854392832. Throughput: 0: 11082.0. Samples: 213649408. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:05,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 16:54:09,243][1653645] Updated weights for policy 0, policy_version 417232 (0.0141) [2024-06-15 16:54:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43694.7, 300 sec: 43986.9). Total num frames: 854589440. Throughput: 0: 11059.2. Samples: 213713408. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:54:12,020][1653645] Updated weights for policy 0, policy_version 417328 (0.0116) [2024-06-15 16:54:15,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 854720512. Throughput: 0: 10843.0. Samples: 213773824. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:54:16,576][1653645] Updated weights for policy 0, policy_version 417376 (0.0012) [2024-06-15 16:54:18,664][1653645] Updated weights for policy 0, policy_version 417456 (0.0012) [2024-06-15 16:54:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 43769.4). 
Total num frames: 854982656. Throughput: 0: 10763.4. Samples: 213801472. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:54:22,225][1653645] Updated weights for policy 0, policy_version 417504 (0.0014) [2024-06-15 16:54:23,815][1653645] Updated weights for policy 0, policy_version 417570 (0.0014) [2024-06-15 16:54:25,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 855244800. Throughput: 0: 10478.9. Samples: 213865984. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:54:28,446][1653645] Updated weights for policy 0, policy_version 417617 (0.0014) [2024-06-15 16:54:30,254][1653645] Updated weights for policy 0, policy_version 417705 (0.0013) [2024-06-15 16:54:30,880][1653645] Updated weights for policy 0, policy_version 417728 (0.0011) [2024-06-15 16:54:30,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 855506944. Throughput: 0: 10717.9. Samples: 213932544. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:30,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:54:34,589][1653645] Updated weights for policy 0, policy_version 417777 (0.0044) [2024-06-15 16:54:35,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 855703552. Throughput: 0: 10899.9. Samples: 213972480. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:35,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:54:36,303][1653645] Updated weights for policy 0, policy_version 417855 (0.0013) [2024-06-15 16:54:40,747][1653645] Updated weights for policy 0, policy_version 417904 (0.0012) [2024-06-15 16:54:40,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 855867392. Throughput: 0: 10946.8. 
Samples: 214036992. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:54:42,929][1653645] Updated weights for policy 0, policy_version 417977 (0.0011) [2024-06-15 16:54:45,935][1651596] Signal inference workers to stop experience collection... (21650 times) [2024-06-15 16:54:45,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42052.3, 300 sec: 43653.6). Total num frames: 856064000. Throughput: 0: 10956.8. Samples: 214104064. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:54:45,965][1653645] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-06-15 16:54:46,147][1651596] Signal inference workers to resume experience collection... (21650 times) [2024-06-15 16:54:46,159][1653645] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-06-15 16:54:46,833][1653645] Updated weights for policy 0, policy_version 418048 (0.0013) [2024-06-15 16:54:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 856293376. Throughput: 0: 10604.1. Samples: 214126592. Policy #0 lag: (min: 15.0, avg: 83.3, max: 271.0) [2024-06-15 16:54:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:54:51,358][1653645] Updated weights for policy 0, policy_version 418114 (0.0026) [2024-06-15 16:54:52,520][1653645] Updated weights for policy 0, policy_version 418167 (0.0087) [2024-06-15 16:54:53,644][1653645] Updated weights for policy 0, policy_version 418208 (0.0011) [2024-06-15 16:54:55,972][1648982] Fps is (10 sec: 49084.4, 60 sec: 43684.7, 300 sec: 43984.8). Total num frames: 856555520. Throughput: 0: 10714.6. Samples: 214195712. 
Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:54:55,972][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:54:55,977][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000418240_856555520.pth... [2024-06-15 16:54:56,044][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000413072_845971456.pth [2024-06-15 16:54:57,900][1653645] Updated weights for policy 0, policy_version 418261 (0.0014) [2024-06-15 16:55:00,494][1653645] Updated weights for policy 0, policy_version 418359 (0.0130) [2024-06-15 16:55:00,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 856817664. Throughput: 0: 10649.6. Samples: 214253056. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:00,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 16:55:04,945][1653645] Updated weights for policy 0, policy_version 418416 (0.0014) [2024-06-15 16:55:05,958][1648982] Fps is (10 sec: 39375.1, 60 sec: 42598.2, 300 sec: 44097.9). Total num frames: 856948736. Throughput: 0: 10899.9. Samples: 214291968. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:05,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 16:55:07,135][1653645] Updated weights for policy 0, policy_version 418493 (0.0015) [2024-06-15 16:55:10,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 42052.2, 300 sec: 43653.6). Total num frames: 857112576. Throughput: 0: 10865.8. Samples: 214354944. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:10,958][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 16:55:11,605][1653645] Updated weights for policy 0, policy_version 418549 (0.0022) [2024-06-15 16:55:15,388][1653645] Updated weights for policy 0, policy_version 418627 (0.0012) [2024-06-15 16:55:15,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 857407488. Throughput: 0: 10945.4. 
Samples: 214425088. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:15,959][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 16:55:17,627][1653645] Updated weights for policy 0, policy_version 418691 (0.0014) [2024-06-15 16:55:19,167][1653645] Updated weights for policy 0, policy_version 418752 (0.0033) [2024-06-15 16:55:20,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 857604096. Throughput: 0: 10831.6. Samples: 214459904. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:55:23,960][1653645] Updated weights for policy 0, policy_version 418847 (0.0140) [2024-06-15 16:55:25,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 857866240. Throughput: 0: 10672.3. Samples: 214517248. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:55:27,310][1653645] Updated weights for policy 0, policy_version 418896 (0.0014) [2024-06-15 16:55:29,707][1651596] Signal inference workers to stop experience collection... (21700 times) [2024-06-15 16:55:29,738][1653645] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-06-15 16:55:29,975][1651596] Signal inference workers to resume experience collection... (21700 times) [2024-06-15 16:55:29,976][1653645] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-06-15 16:55:29,979][1653645] Updated weights for policy 0, policy_version 418976 (0.0014) [2024-06-15 16:55:30,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 858128384. Throughput: 0: 10717.8. Samples: 214586368. 
Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:55:33,849][1653645] Updated weights for policy 0, policy_version 419011 (0.0015) [2024-06-15 16:55:35,638][1653645] Updated weights for policy 0, policy_version 419078 (0.0015) [2024-06-15 16:55:35,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 858292224. Throughput: 0: 11002.3. Samples: 214621696. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:35,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 16:55:36,888][1653645] Updated weights for policy 0, policy_version 419132 (0.0011) [2024-06-15 16:55:40,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.5, 300 sec: 43764.7). Total num frames: 858488832. Throughput: 0: 10948.7. Samples: 214688256. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:40,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 16:55:41,065][1653645] Updated weights for policy 0, policy_version 419196 (0.0012) [2024-06-15 16:55:42,702][1653645] Updated weights for policy 0, policy_version 419248 (0.0100) [2024-06-15 16:55:45,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.9, 300 sec: 43764.7). Total num frames: 858718208. Throughput: 0: 11104.7. Samples: 214752768. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:55:46,001][1653645] Updated weights for policy 0, policy_version 419301 (0.0013) [2024-06-15 16:55:47,540][1653645] Updated weights for policy 0, policy_version 419344 (0.0014) [2024-06-15 16:55:50,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 858914816. Throughput: 0: 10888.6. Samples: 214781952. 
Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 16:55:51,758][1653645] Updated weights for policy 0, policy_version 419409 (0.0015) [2024-06-15 16:55:52,748][1653645] Updated weights for policy 0, policy_version 419453 (0.0088) [2024-06-15 16:55:54,298][1653645] Updated weights for policy 0, policy_version 419505 (0.0014) [2024-06-15 16:55:55,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43700.7, 300 sec: 43653.6). Total num frames: 859176960. Throughput: 0: 11082.0. Samples: 214853632. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:55:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 16:55:56,995][1653645] Updated weights for policy 0, policy_version 419554 (0.0019) [2024-06-15 16:55:59,669][1653645] Updated weights for policy 0, policy_version 419620 (0.0029) [2024-06-15 16:56:00,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 859439104. Throughput: 0: 11013.7. Samples: 214920704. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:56:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:56:03,361][1653645] Updated weights for policy 0, policy_version 419667 (0.0016) [2024-06-15 16:56:05,088][1653645] Updated weights for policy 0, policy_version 419753 (0.0014) [2024-06-15 16:56:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.4, 300 sec: 44098.0). Total num frames: 859701248. Throughput: 0: 11207.1. Samples: 214964224. 
Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:56:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:56:07,054][1653645] Updated weights for policy 0, policy_version 419777 (0.0069) [2024-06-15 16:56:08,313][1653645] Updated weights for policy 0, policy_version 419832 (0.0013) [2024-06-15 16:56:10,018][1653645] Updated weights for policy 0, policy_version 419872 (0.0015) [2024-06-15 16:56:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 47513.6, 300 sec: 44431.2). Total num frames: 859963392. Throughput: 0: 11514.3. Samples: 215035392. Policy #0 lag: (min: 14.0, avg: 110.4, max: 270.0) [2024-06-15 16:56:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:56:13,859][1653645] Updated weights for policy 0, policy_version 419921 (0.0017) [2024-06-15 16:56:15,362][1653645] Updated weights for policy 0, policy_version 419970 (0.0014) [2024-06-15 16:56:15,770][1651596] Signal inference workers to stop experience collection... (21750 times) [2024-06-15 16:56:15,813][1653645] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-06-15 16:56:15,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45329.2, 300 sec: 44431.2). Total num frames: 860127232. Throughput: 0: 11537.2. Samples: 215105536. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:56:16,053][1651596] Signal inference workers to resume experience collection... (21750 times) [2024-06-15 16:56:16,054][1653645] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-06-15 16:56:16,723][1653645] Updated weights for policy 0, policy_version 420026 (0.0023) [2024-06-15 16:56:19,744][1653645] Updated weights for policy 0, policy_version 420090 (0.0125) [2024-06-15 16:56:20,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 860356608. Throughput: 0: 11548.4. Samples: 215141376. 
Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:56:22,686][1653645] Updated weights for policy 0, policy_version 420151 (0.0014) [2024-06-15 16:56:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 860553216. Throughput: 0: 11457.5. Samples: 215203840. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:56:26,302][1653645] Updated weights for policy 0, policy_version 420215 (0.0023) [2024-06-15 16:56:27,351][1653645] Updated weights for policy 0, policy_version 420244 (0.0018) [2024-06-15 16:56:30,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 860782592. Throughput: 0: 11537.0. Samples: 215271936. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:56:31,353][1653645] Updated weights for policy 0, policy_version 420322 (0.0015) [2024-06-15 16:56:34,598][1653645] Updated weights for policy 0, policy_version 420391 (0.0014) [2024-06-15 16:56:35,960][1648982] Fps is (10 sec: 45863.4, 60 sec: 45327.1, 300 sec: 44431.1). Total num frames: 861011968. Throughput: 0: 11581.9. Samples: 215303168. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:35,961][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:56:37,231][1653645] Updated weights for policy 0, policy_version 420421 (0.0014) [2024-06-15 16:56:38,393][1653645] Updated weights for policy 0, policy_version 420475 (0.0013) [2024-06-15 16:56:39,876][1653645] Updated weights for policy 0, policy_version 420528 (0.0021) [2024-06-15 16:56:40,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 46421.5, 300 sec: 44431.2). Total num frames: 861274112. Throughput: 0: 11434.7. Samples: 215368192. 
Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:56:42,774][1653645] Updated weights for policy 0, policy_version 420562 (0.0013) [2024-06-15 16:56:45,958][1648982] Fps is (10 sec: 45887.1, 60 sec: 45875.2, 300 sec: 44209.0). Total num frames: 861470720. Throughput: 0: 11468.8. Samples: 215436800. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:56:46,006][1653645] Updated weights for policy 0, policy_version 420656 (0.0013) [2024-06-15 16:56:49,706][1653645] Updated weights for policy 0, policy_version 420707 (0.0013) [2024-06-15 16:56:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 46421.3, 300 sec: 44653.3). Total num frames: 861700096. Throughput: 0: 11207.1. Samples: 215468544. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:56:51,655][1653645] Updated weights for policy 0, policy_version 420784 (0.0012) [2024-06-15 16:56:55,516][1653645] Updated weights for policy 0, policy_version 420860 (0.0013) [2024-06-15 16:56:55,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 45875.0, 300 sec: 44320.1). Total num frames: 861929472. Throughput: 0: 11150.2. Samples: 215537152. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:56:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:56:55,967][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000420864_861929472.pth... [2024-06-15 16:56:56,025][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000415680_851312640.pth [2024-06-15 16:56:59,048][1653645] Updated weights for policy 0, policy_version 420928 (0.0012) [2024-06-15 16:57:00,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 862060544. Throughput: 0: 10968.2. 
Samples: 215599104. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:57:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:57:02,611][1653645] Updated weights for policy 0, policy_version 420997 (0.0014) [2024-06-15 16:57:03,339][1651596] Signal inference workers to stop experience collection... (21800 times) [2024-06-15 16:57:03,390][1653645] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-06-15 16:57:03,607][1651596] Signal inference workers to resume experience collection... (21800 times) [2024-06-15 16:57:03,608][1653645] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-06-15 16:57:03,892][1653645] Updated weights for policy 0, policy_version 421056 (0.0013) [2024-06-15 16:57:05,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 862322688. Throughput: 0: 10854.4. Samples: 215629824. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:57:05,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:57:07,741][1653645] Updated weights for policy 0, policy_version 421111 (0.0013) [2024-06-15 16:57:10,499][1653645] Updated weights for policy 0, policy_version 421168 (0.0012) [2024-06-15 16:57:10,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43144.5, 300 sec: 44320.2). Total num frames: 862552064. Throughput: 0: 11059.2. Samples: 215701504. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:57:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:57:12,835][1653645] Updated weights for policy 0, policy_version 421200 (0.0011) [2024-06-15 16:57:15,176][1653645] Updated weights for policy 0, policy_version 421296 (0.0014) [2024-06-15 16:57:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 862846976. Throughput: 0: 10763.4. Samples: 215756288. 
Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:57:15,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:57:19,471][1653645] Updated weights for policy 0, policy_version 421328 (0.0013) [2024-06-15 16:57:20,501][1653645] Updated weights for policy 0, policy_version 421376 (0.0144) [2024-06-15 16:57:20,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 862978048. Throughput: 0: 10980.1. Samples: 215797248. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:57:20,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 16:57:24,786][1653645] Updated weights for policy 0, policy_version 421441 (0.0026) [2024-06-15 16:57:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 863207424. Throughput: 0: 11002.3. Samples: 215863296. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:57:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:57:26,237][1653645] Updated weights for policy 0, policy_version 421505 (0.0014) [2024-06-15 16:57:27,585][1653645] Updated weights for policy 0, policy_version 421564 (0.0013) [2024-06-15 16:57:30,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 863371264. Throughput: 0: 11002.3. Samples: 215931904. Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:57:30,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 16:57:32,028][1653645] Updated weights for policy 0, policy_version 421627 (0.0094) [2024-06-15 16:57:34,404][1653645] Updated weights for policy 0, policy_version 421668 (0.0014) [2024-06-15 16:57:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43692.6, 300 sec: 44431.2). Total num frames: 863633408. Throughput: 0: 11036.4. Samples: 215965184. 
Policy #0 lag: (min: 15.0, avg: 110.1, max: 271.0) [2024-06-15 16:57:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:57:36,905][1653645] Updated weights for policy 0, policy_version 421728 (0.0147) [2024-06-15 16:57:37,744][1653645] Updated weights for policy 0, policy_version 421760 (0.0012) [2024-06-15 16:57:40,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 863895552. Throughput: 0: 10729.3. Samples: 216019968. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:57:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 16:57:42,926][1653645] Updated weights for policy 0, policy_version 421840 (0.0025) [2024-06-15 16:57:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 43986.9). Total num frames: 864026624. Throughput: 0: 11081.9. Samples: 216097792. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:57:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:57:46,303][1653645] Updated weights for policy 0, policy_version 421904 (0.0090) [2024-06-15 16:57:48,567][1653645] Updated weights for policy 0, policy_version 421989 (0.0013) [2024-06-15 16:57:49,912][1651596] Signal inference workers to stop experience collection... (21850 times) [2024-06-15 16:57:49,977][1653645] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-06-15 16:57:50,137][1651596] Signal inference workers to resume experience collection... (21850 times) [2024-06-15 16:57:50,137][1653645] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-06-15 16:57:50,315][1653645] Updated weights for policy 0, policy_version 422049 (0.0012) [2024-06-15 16:57:50,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 864387072. Throughput: 0: 10968.1. Samples: 216123392. 
Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:57:50,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 16:57:55,705][1653645] Updated weights for policy 0, policy_version 422128 (0.0015) [2024-06-15 16:57:55,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43144.7, 300 sec: 43875.8). Total num frames: 864518144. Throughput: 0: 11116.1. Samples: 216201728. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:57:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 16:57:57,636][1653645] Updated weights for policy 0, policy_version 422169 (0.0011) [2024-06-15 16:57:59,552][1653645] Updated weights for policy 0, policy_version 422224 (0.0013) [2024-06-15 16:58:00,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 864813056. Throughput: 0: 11173.0. Samples: 216259072. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:58:01,356][1653645] Updated weights for policy 0, policy_version 422291 (0.0034) [2024-06-15 16:58:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 43987.7). Total num frames: 864944128. Throughput: 0: 10922.7. Samples: 216288768. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 16:58:06,742][1653645] Updated weights for policy 0, policy_version 422352 (0.0022) [2024-06-15 16:58:07,714][1653645] Updated weights for policy 0, policy_version 422396 (0.0010) [2024-06-15 16:58:10,398][1653645] Updated weights for policy 0, policy_version 422456 (0.0032) [2024-06-15 16:58:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 865206272. Throughput: 0: 11275.4. Samples: 216370688. 
Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 16:58:12,167][1653645] Updated weights for policy 0, policy_version 422517 (0.0014) [2024-06-15 16:58:13,383][1653645] Updated weights for policy 0, policy_version 422549 (0.0018) [2024-06-15 16:58:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 865468416. Throughput: 0: 11047.8. Samples: 216429056. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:58:18,153][1653645] Updated weights for policy 0, policy_version 422608 (0.0014) [2024-06-15 16:58:20,679][1653645] Updated weights for policy 0, policy_version 422658 (0.0013) [2024-06-15 16:58:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 865599488. Throughput: 0: 11298.1. Samples: 216473600. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:20,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 16:58:23,627][1653645] Updated weights for policy 0, policy_version 422768 (0.0125) [2024-06-15 16:58:25,559][1653645] Updated weights for policy 0, policy_version 422840 (0.0015) [2024-06-15 16:58:25,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 865992704. Throughput: 0: 11229.9. Samples: 216525312. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:58:30,172][1653645] Updated weights for policy 0, policy_version 422880 (0.0014) [2024-06-15 16:58:30,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 45875.3, 300 sec: 43986.9). Total num frames: 866123776. Throughput: 0: 11355.1. Samples: 216608768. 
Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:58:32,079][1653645] Updated weights for policy 0, policy_version 422928 (0.0012) [2024-06-15 16:58:34,583][1653645] Updated weights for policy 0, policy_version 423013 (0.0020) [2024-06-15 16:58:35,837][1651596] Signal inference workers to stop experience collection... (21900 times) [2024-06-15 16:58:35,912][1653645] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-06-15 16:58:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45875.1, 300 sec: 44542.2). Total num frames: 866385920. Throughput: 0: 11468.8. Samples: 216639488. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 16:58:36,143][1651596] Signal inference workers to resume experience collection... (21900 times) [2024-06-15 16:58:36,145][1653645] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-06-15 16:58:37,204][1653645] Updated weights for policy 0, policy_version 423104 (0.0014) [2024-06-15 16:58:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 866516992. Throughput: 0: 11173.0. Samples: 216704512. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 16:58:42,857][1653645] Updated weights for policy 0, policy_version 423163 (0.0014) [2024-06-15 16:58:44,789][1653645] Updated weights for policy 0, policy_version 423216 (0.0043) [2024-06-15 16:58:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 866779136. Throughput: 0: 11298.1. Samples: 216767488. 
Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:45,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:58:47,286][1653645] Updated weights for policy 0, policy_version 423292 (0.0014) [2024-06-15 16:58:48,788][1653645] Updated weights for policy 0, policy_version 423330 (0.0011) [2024-06-15 16:58:50,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 44237.0, 300 sec: 44432.0). Total num frames: 867041280. Throughput: 0: 11241.3. Samples: 216794624. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:58:54,068][1653645] Updated weights for policy 0, policy_version 423392 (0.0013) [2024-06-15 16:58:55,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 867172352. Throughput: 0: 11059.2. Samples: 216868352. Policy #0 lag: (min: 10.0, avg: 122.6, max: 266.0) [2024-06-15 16:58:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 16:58:56,256][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000423440_867205120.pth... [2024-06-15 16:58:56,415][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000418240_856555520.pth [2024-06-15 16:58:57,204][1653645] Updated weights for policy 0, policy_version 423479 (0.0013) [2024-06-15 16:59:00,848][1653645] Updated weights for policy 0, policy_version 423568 (0.0226) [2024-06-15 16:59:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 867467264. Throughput: 0: 11059.2. Samples: 216926720. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 16:59:05,853][1653645] Updated weights for policy 0, policy_version 423632 (0.0103) [2024-06-15 16:59:05,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 867598336. Throughput: 0: 10820.3. 
Samples: 216960512. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 16:59:08,511][1653645] Updated weights for policy 0, policy_version 423715 (0.0157) [2024-06-15 16:59:09,151][1653645] Updated weights for policy 0, policy_version 423744 (0.0012) [2024-06-15 16:59:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 867893248. Throughput: 0: 11252.6. Samples: 217031680. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:59:11,769][1653645] Updated weights for policy 0, policy_version 423809 (0.0013) [2024-06-15 16:59:15,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 868089856. Throughput: 0: 10877.1. Samples: 217098240. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:59:18,713][1653645] Updated weights for policy 0, policy_version 423904 (0.0012) [2024-06-15 16:59:20,487][1653645] Updated weights for policy 0, policy_version 423953 (0.0013) [2024-06-15 16:59:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44783.0, 300 sec: 44209.1). Total num frames: 868286464. Throughput: 0: 11070.6. Samples: 217137664. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:20,960][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 16:59:21,521][1653645] Updated weights for policy 0, policy_version 424000 (0.0013) [2024-06-15 16:59:22,366][1651596] Signal inference workers to stop experience collection... (21950 times) [2024-06-15 16:59:22,439][1653645] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-06-15 16:59:22,616][1651596] Signal inference workers to resume experience collection... 
(21950 times) [2024-06-15 16:59:22,617][1653645] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-06-15 16:59:23,288][1653645] Updated weights for policy 0, policy_version 424068 (0.0016) [2024-06-15 16:59:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 868614144. Throughput: 0: 10695.1. Samples: 217185792. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 16:59:30,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42052.2, 300 sec: 43875.8). Total num frames: 868646912. Throughput: 0: 11127.5. Samples: 217268224. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:30,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 16:59:31,086][1653645] Updated weights for policy 0, policy_version 424147 (0.0022) [2024-06-15 16:59:32,996][1653645] Updated weights for policy 0, policy_version 424241 (0.0330) [2024-06-15 16:59:34,947][1653645] Updated weights for policy 0, policy_version 424316 (0.0013) [2024-06-15 16:59:35,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 869072896. Throughput: 0: 11081.9. Samples: 217293312. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 16:59:36,585][1653645] Updated weights for policy 0, policy_version 424376 (0.0017) [2024-06-15 16:59:40,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 869138432. Throughput: 0: 11025.1. Samples: 217364480. 
Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 16:59:42,566][1653645] Updated weights for policy 0, policy_version 424402 (0.0025) [2024-06-15 16:59:43,338][1653645] Updated weights for policy 0, policy_version 424444 (0.0013) [2024-06-15 16:59:44,895][1653645] Updated weights for policy 0, policy_version 424482 (0.0012) [2024-06-15 16:59:45,958][1648982] Fps is (10 sec: 36045.9, 60 sec: 44237.0, 300 sec: 44542.3). Total num frames: 869433344. Throughput: 0: 11161.6. Samples: 217428992. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 16:59:46,631][1653645] Updated weights for policy 0, policy_version 424560 (0.0015) [2024-06-15 16:59:48,448][1653645] Updated weights for policy 0, policy_version 424624 (0.0021) [2024-06-15 16:59:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44433.3). Total num frames: 869662720. Throughput: 0: 10899.9. Samples: 217451008. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:59:55,023][1653645] Updated weights for policy 0, policy_version 424676 (0.0012) [2024-06-15 16:59:55,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 869793792. Throughput: 0: 11172.9. Samples: 217534464. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 16:59:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 16:59:56,668][1653645] Updated weights for policy 0, policy_version 424736 (0.0013) [2024-06-15 16:59:58,871][1653645] Updated weights for policy 0, policy_version 424816 (0.0024) [2024-06-15 17:00:00,888][1653645] Updated weights for policy 0, policy_version 424896 (0.0013) [2024-06-15 17:00:00,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 870187008. 
Throughput: 0: 10820.3. Samples: 217585152. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 17:00:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:00:05,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 870187008. Throughput: 0: 10820.3. Samples: 217624576. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 17:00:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:00:07,022][1651596] Signal inference workers to stop experience collection... (22000 times) [2024-06-15 17:00:07,092][1653645] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-06-15 17:00:07,257][1651596] Signal inference workers to resume experience collection... (22000 times) [2024-06-15 17:00:07,258][1653645] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-06-15 17:00:08,144][1653645] Updated weights for policy 0, policy_version 424961 (0.0078) [2024-06-15 17:00:09,832][1653645] Updated weights for policy 0, policy_version 425025 (0.0019) [2024-06-15 17:00:10,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 870547456. Throughput: 0: 11286.7. Samples: 217693696. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 17:00:10,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:00:11,479][1653645] Updated weights for policy 0, policy_version 425107 (0.0113) [2024-06-15 17:00:12,508][1653645] Updated weights for policy 0, policy_version 425152 (0.0013) [2024-06-15 17:00:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 870711296. Throughput: 0: 10843.0. Samples: 217756160. 
Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 17:00:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:00:20,087][1653645] Updated weights for policy 0, policy_version 425219 (0.0013) [2024-06-15 17:00:20,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 870907904. Throughput: 0: 11070.6. Samples: 217791488. Policy #0 lag: (min: 15.0, avg: 115.0, max: 207.0) [2024-06-15 17:00:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:00:21,828][1653645] Updated weights for policy 0, policy_version 425296 (0.0012) [2024-06-15 17:00:22,882][1653645] Updated weights for policy 0, policy_version 425347 (0.0011) [2024-06-15 17:00:23,905][1653645] Updated weights for policy 0, policy_version 425397 (0.0014) [2024-06-15 17:00:25,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 871235584. Throughput: 0: 10843.0. Samples: 217852416. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:00:25,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 17:00:30,959][1648982] Fps is (10 sec: 36040.3, 60 sec: 43689.8, 300 sec: 43986.7). Total num frames: 871268352. Throughput: 0: 11047.5. Samples: 217926144. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:00:30,960][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 17:00:31,167][1653645] Updated weights for policy 0, policy_version 425440 (0.0012) [2024-06-15 17:00:32,970][1653645] Updated weights for policy 0, policy_version 425507 (0.0012) [2024-06-15 17:00:35,733][1653645] Updated weights for policy 0, policy_version 425618 (0.0014) [2024-06-15 17:00:35,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 44653.4). Total num frames: 871661568. Throughput: 0: 11036.4. Samples: 217947648. 
Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:00:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:00:36,881][1653645] Updated weights for policy 0, policy_version 425663 (0.0014) [2024-06-15 17:00:40,958][1648982] Fps is (10 sec: 49157.7, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 871759872. Throughput: 0: 10479.0. Samples: 218006016. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:00:40,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 17:00:44,989][1653645] Updated weights for policy 0, policy_version 425760 (0.0118) [2024-06-15 17:00:45,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 43144.4, 300 sec: 44431.2). Total num frames: 872022016. Throughput: 0: 10865.7. Samples: 218074112. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:00:45,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:00:46,412][1653645] Updated weights for policy 0, policy_version 425824 (0.0013) [2024-06-15 17:00:46,515][1651596] Signal inference workers to stop experience collection... (22050 times) [2024-06-15 17:00:46,583][1653645] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-06-15 17:00:46,874][1651596] Signal inference workers to resume experience collection... (22050 times) [2024-06-15 17:00:46,874][1653645] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-06-15 17:00:48,138][1653645] Updated weights for policy 0, policy_version 425877 (0.0010) [2024-06-15 17:00:50,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 872284160. Throughput: 0: 10672.4. Samples: 218104832. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:00:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:00:54,556][1653645] Updated weights for policy 0, policy_version 425923 (0.0015) [2024-06-15 17:00:55,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 43986.9). 
Total num frames: 872415232. Throughput: 0: 10808.9. Samples: 218180096. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:00:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:00:56,004][1653645] Updated weights for policy 0, policy_version 425989 (0.0047) [2024-06-15 17:00:56,462][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000426016_872480768.pth... [2024-06-15 17:00:56,609][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000420864_861929472.pth [2024-06-15 17:00:57,157][1653645] Updated weights for policy 0, policy_version 426048 (0.0034) [2024-06-15 17:00:59,400][1653645] Updated weights for policy 0, policy_version 426111 (0.0044) [2024-06-15 17:01:00,862][1653645] Updated weights for policy 0, policy_version 426169 (0.0013) [2024-06-15 17:01:00,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 872775680. Throughput: 0: 10672.4. Samples: 218236416. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:01:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 872808448. Throughput: 0: 10592.7. Samples: 218268160. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:05,960][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:01:07,753][1653645] Updated weights for policy 0, policy_version 426225 (0.0030) [2024-06-15 17:01:09,314][1653645] Updated weights for policy 0, policy_version 426275 (0.0013) [2024-06-15 17:01:10,899][1653645] Updated weights for policy 0, policy_version 426336 (0.0103) [2024-06-15 17:01:10,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 873136128. Throughput: 0: 10786.1. Samples: 218337792. 
Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:01:12,828][1653645] Updated weights for policy 0, policy_version 426416 (0.0020) [2024-06-15 17:01:15,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 873332736. Throughput: 0: 10467.9. Samples: 218397184. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:01:20,174][1653645] Updated weights for policy 0, policy_version 426496 (0.0051) [2024-06-15 17:01:20,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 873463808. Throughput: 0: 10922.7. Samples: 218439168. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:01:22,439][1653645] Updated weights for policy 0, policy_version 426566 (0.0013) [2024-06-15 17:01:23,684][1653645] Updated weights for policy 0, policy_version 426616 (0.0040) [2024-06-15 17:01:25,154][1653645] Updated weights for policy 0, policy_version 426672 (0.0014) [2024-06-15 17:01:25,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 873857024. Throughput: 0: 10820.3. Samples: 218492928. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:01:30,958][1648982] Fps is (10 sec: 39320.0, 60 sec: 43145.1, 300 sec: 43542.9). Total num frames: 873857024. Throughput: 0: 10922.6. Samples: 218565632. 
Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:01:31,653][1653645] Updated weights for policy 0, policy_version 426720 (0.0013) [2024-06-15 17:01:32,526][1653645] Updated weights for policy 0, policy_version 426752 (0.0082) [2024-06-15 17:01:33,548][1651596] Signal inference workers to stop experience collection... (22100 times) [2024-06-15 17:01:33,588][1653645] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-06-15 17:01:33,813][1651596] Signal inference workers to resume experience collection... (22100 times) [2024-06-15 17:01:33,814][1653645] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-06-15 17:01:34,041][1653645] Updated weights for policy 0, policy_version 426806 (0.0012) [2024-06-15 17:01:35,236][1653645] Updated weights for policy 0, policy_version 426848 (0.0012) [2024-06-15 17:01:35,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 42598.3, 300 sec: 43875.8). Total num frames: 874217472. Throughput: 0: 10831.6. Samples: 218592256. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:01:36,561][1653645] Updated weights for policy 0, policy_version 426896 (0.0014) [2024-06-15 17:01:40,959][1648982] Fps is (10 sec: 52424.8, 60 sec: 43689.8, 300 sec: 43764.5). Total num frames: 874381312. Throughput: 0: 10592.4. Samples: 218656768. Policy #0 lag: (min: 62.0, avg: 145.5, max: 302.0) [2024-06-15 17:01:40,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:01:43,564][1653645] Updated weights for policy 0, policy_version 426961 (0.0014) [2024-06-15 17:01:44,915][1653645] Updated weights for policy 0, policy_version 427009 (0.0011) [2024-06-15 17:01:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 43764.7). Total num frames: 874610688. Throughput: 0: 10990.9. Samples: 218731008. 
Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:01:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:01:47,538][1653645] Updated weights for policy 0, policy_version 427121 (0.0257) [2024-06-15 17:01:48,530][1653645] Updated weights for policy 0, policy_version 427156 (0.0013) [2024-06-15 17:01:50,958][1648982] Fps is (10 sec: 52435.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 874905600. Throughput: 0: 10797.5. Samples: 218754048. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:01:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:01:55,645][1653645] Updated weights for policy 0, policy_version 427221 (0.0012) [2024-06-15 17:01:55,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 42598.2, 300 sec: 43764.7). Total num frames: 874971136. Throughput: 0: 10968.1. Samples: 218831360. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:01:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:01:57,447][1653645] Updated weights for policy 0, policy_version 427296 (0.0138) [2024-06-15 17:02:00,078][1653645] Updated weights for policy 0, policy_version 427408 (0.0042) [2024-06-15 17:02:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 875397120. Throughput: 0: 10774.7. Samples: 218882048. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:02:01,188][1653645] Updated weights for policy 0, policy_version 427456 (0.0014) [2024-06-15 17:02:05,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.4, 300 sec: 43653.6). Total num frames: 875429888. Throughput: 0: 10706.4. Samples: 218920960. 
Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:02:08,264][1653645] Updated weights for policy 0, policy_version 427518 (0.0014) [2024-06-15 17:02:09,763][1653645] Updated weights for policy 0, policy_version 427575 (0.0107) [2024-06-15 17:02:10,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 43144.6, 300 sec: 43653.6). Total num frames: 875724800. Throughput: 0: 11150.2. Samples: 218994688. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:02:11,142][1653645] Updated weights for policy 0, policy_version 427618 (0.0057) [2024-06-15 17:02:12,546][1653645] Updated weights for policy 0, policy_version 427682 (0.0015) [2024-06-15 17:02:12,555][1651596] Signal inference workers to stop experience collection... (22150 times) [2024-06-15 17:02:12,614][1653645] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-06-15 17:02:12,682][1651596] Signal inference workers to resume experience collection... (22150 times) [2024-06-15 17:02:12,683][1653645] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-06-15 17:02:15,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 875954176. Throughput: 0: 10979.7. Samples: 219059712. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:02:18,774][1653645] Updated weights for policy 0, policy_version 427732 (0.0014) [2024-06-15 17:02:20,104][1653645] Updated weights for policy 0, policy_version 427776 (0.0015) [2024-06-15 17:02:20,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 876085248. Throughput: 0: 11286.8. Samples: 219100160. 
Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:02:22,888][1653645] Updated weights for policy 0, policy_version 427856 (0.0013) [2024-06-15 17:02:24,954][1653645] Updated weights for policy 0, policy_version 427957 (0.0014) [2024-06-15 17:02:25,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 876478464. Throughput: 0: 10968.4. Samples: 219150336. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:02:30,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44237.1, 300 sec: 43653.6). Total num frames: 876511232. Throughput: 0: 11093.4. Samples: 219230208. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:02:31,111][1653645] Updated weights for policy 0, policy_version 427988 (0.0012) [2024-06-15 17:02:31,960][1653645] Updated weights for policy 0, policy_version 428032 (0.0017) [2024-06-15 17:02:33,429][1653645] Updated weights for policy 0, policy_version 428093 (0.0118) [2024-06-15 17:02:35,468][1653645] Updated weights for policy 0, policy_version 428160 (0.0013) [2024-06-15 17:02:35,958][1648982] Fps is (10 sec: 42599.9, 60 sec: 44783.0, 300 sec: 44098.0). Total num frames: 876904448. Throughput: 0: 11184.3. Samples: 219257344. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:35,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:02:36,482][1653645] Updated weights for policy 0, policy_version 428208 (0.0013) [2024-06-15 17:02:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43691.5, 300 sec: 43986.9). Total num frames: 877002752. Throughput: 0: 11252.7. Samples: 219337728. 
Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:40,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:02:42,336][1653645] Updated weights for policy 0, policy_version 428272 (0.0013) [2024-06-15 17:02:43,451][1653645] Updated weights for policy 0, policy_version 428320 (0.0012) [2024-06-15 17:02:45,740][1653645] Updated weights for policy 0, policy_version 428400 (0.0015) [2024-06-15 17:02:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.4, 300 sec: 43986.9). Total num frames: 877363200. Throughput: 0: 11605.3. Samples: 219404288. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:45,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:02:46,861][1653645] Updated weights for policy 0, policy_version 428448 (0.0012) [2024-06-15 17:02:50,957][1648982] Fps is (10 sec: 52430.3, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 877527040. Throughput: 0: 11457.6. Samples: 219436544. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:02:52,554][1653645] Updated weights for policy 0, policy_version 428499 (0.0034) [2024-06-15 17:02:53,999][1653645] Updated weights for policy 0, policy_version 428563 (0.0013) [2024-06-15 17:02:54,890][1651596] Signal inference workers to stop experience collection... (22200 times) [2024-06-15 17:02:54,942][1653645] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-06-15 17:02:55,173][1651596] Signal inference workers to resume experience collection... (22200 times) [2024-06-15 17:02:55,175][1653645] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-06-15 17:02:55,532][1653645] Updated weights for policy 0, policy_version 428640 (0.0125) [2024-06-15 17:02:55,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 48605.9, 300 sec: 44320.1). Total num frames: 877887488. Throughput: 0: 11696.3. Samples: 219521024. 
Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:02:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:02:56,232][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000428672_877920256.pth... [2024-06-15 17:02:56,427][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000423440_867205120.pth [2024-06-15 17:02:57,213][1653645] Updated weights for policy 0, policy_version 428706 (0.0012) [2024-06-15 17:03:00,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 878051328. Throughput: 0: 11764.6. Samples: 219589120. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:03:00,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:03:02,961][1653645] Updated weights for policy 0, policy_version 428752 (0.0012) [2024-06-15 17:03:04,532][1653645] Updated weights for policy 0, policy_version 428816 (0.0114) [2024-06-15 17:03:05,875][1653645] Updated weights for policy 0, policy_version 428866 (0.0014) [2024-06-15 17:03:05,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 48060.0, 300 sec: 44431.2). Total num frames: 878313472. Throughput: 0: 11798.8. Samples: 219631104. Policy #0 lag: (min: 11.0, avg: 79.8, max: 267.0) [2024-06-15 17:03:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:03:08,134][1653645] Updated weights for policy 0, policy_version 428960 (0.0013) [2024-06-15 17:03:10,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 47513.4, 300 sec: 44431.1). Total num frames: 878575616. Throughput: 0: 11992.2. Samples: 219689984. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:03:14,519][1653645] Updated weights for policy 0, policy_version 429009 (0.0013) [2024-06-15 17:03:15,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 44542.3). Total num frames: 878739456. Throughput: 0: 12049.1. 
Samples: 219772416. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 17:03:15,966][1653645] Updated weights for policy 0, policy_version 429073 (0.0013) [2024-06-15 17:03:17,837][1653645] Updated weights for policy 0, policy_version 429152 (0.0160) [2024-06-15 17:03:20,138][1653645] Updated weights for policy 0, policy_version 429240 (0.0139) [2024-06-15 17:03:20,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 50244.1, 300 sec: 44431.2). Total num frames: 879099904. Throughput: 0: 12014.8. Samples: 219798016. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:20,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:03:25,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 879099904. Throughput: 0: 11798.8. Samples: 219868672. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:25,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 17:03:26,761][1653645] Updated weights for policy 0, policy_version 429284 (0.0076) [2024-06-15 17:03:28,548][1653645] Updated weights for policy 0, policy_version 429347 (0.0011) [2024-06-15 17:03:29,934][1653645] Updated weights for policy 0, policy_version 429412 (0.0012) [2024-06-15 17:03:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 49697.9, 300 sec: 44431.2). Total num frames: 879493120. Throughput: 0: 11707.7. Samples: 219931136. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:30,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:03:31,317][1653645] Updated weights for policy 0, policy_version 429465 (0.0013) [2024-06-15 17:03:32,170][1653645] Updated weights for policy 0, policy_version 429503 (0.0013) [2024-06-15 17:03:35,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 879624192. Throughput: 0: 11719.1. Samples: 219963904. 
Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:03:37,889][1651596] Signal inference workers to stop experience collection... (22250 times) [2024-06-15 17:03:37,951][1653645] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-06-15 17:03:38,166][1651596] Signal inference workers to resume experience collection... (22250 times) [2024-06-15 17:03:38,169][1653645] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-06-15 17:03:38,830][1653645] Updated weights for policy 0, policy_version 429557 (0.0030) [2024-06-15 17:03:40,344][1653645] Updated weights for policy 0, policy_version 429617 (0.0012) [2024-06-15 17:03:40,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 48059.7, 300 sec: 44431.2). Total num frames: 879886336. Throughput: 0: 11491.6. Samples: 220038144. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:03:41,231][1653645] Updated weights for policy 0, policy_version 429651 (0.0011) [2024-06-15 17:03:42,811][1653645] Updated weights for policy 0, policy_version 429713 (0.0013) [2024-06-15 17:03:43,811][1653645] Updated weights for policy 0, policy_version 429757 (0.0030) [2024-06-15 17:03:45,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 880148480. Throughput: 0: 11275.4. Samples: 220096512. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:03:50,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 45328.9, 300 sec: 44320.1). Total num frames: 880246784. Throughput: 0: 11286.8. Samples: 220139008. 
Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:03:51,010][1653645] Updated weights for policy 0, policy_version 429824 (0.0017) [2024-06-15 17:03:53,313][1653645] Updated weights for policy 0, policy_version 429904 (0.0016) [2024-06-15 17:03:54,807][1653645] Updated weights for policy 0, policy_version 429954 (0.0012) [2024-06-15 17:03:55,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 45329.3, 300 sec: 44542.3). Total num frames: 880607232. Throughput: 0: 11229.9. Samples: 220195328. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:03:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:03:56,368][1653645] Updated weights for policy 0, policy_version 430014 (0.0036) [2024-06-15 17:04:00,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 880672768. Throughput: 0: 10934.1. Samples: 220264448. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:04:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:04:03,744][1653645] Updated weights for policy 0, policy_version 430096 (0.0015) [2024-06-15 17:04:05,497][1653645] Updated weights for policy 0, policy_version 430160 (0.0014) [2024-06-15 17:04:05,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 881000448. Throughput: 0: 11093.4. Samples: 220297216. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:04:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:04:07,866][1653645] Updated weights for policy 0, policy_version 430240 (0.0193) [2024-06-15 17:04:10,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 881197056. Throughput: 0: 10638.2. Samples: 220347392. 
Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:04:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:04:15,061][1653645] Updated weights for policy 0, policy_version 430304 (0.0077) [2024-06-15 17:04:15,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 43144.4, 300 sec: 44209.0). Total num frames: 881328128. Throughput: 0: 10899.9. Samples: 220421632. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:04:15,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:04:16,241][1653645] Updated weights for policy 0, policy_version 430352 (0.0013) [2024-06-15 17:04:17,370][1651596] Signal inference workers to stop experience collection... (22300 times) [2024-06-15 17:04:17,437][1653645] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-06-15 17:04:17,642][1651596] Signal inference workers to resume experience collection... (22300 times) [2024-06-15 17:04:17,652][1653645] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-06-15 17:04:18,263][1653645] Updated weights for policy 0, policy_version 430420 (0.0054) [2024-06-15 17:04:19,820][1653645] Updated weights for policy 0, policy_version 430496 (0.0013) [2024-06-15 17:04:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 881721344. Throughput: 0: 10763.4. Samples: 220448256. Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:04:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:04:25,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 881721344. Throughput: 0: 10740.6. Samples: 220521472. 
Policy #0 lag: (min: 15.0, avg: 143.3, max: 276.0) [2024-06-15 17:04:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:04:26,307][1653645] Updated weights for policy 0, policy_version 430544 (0.0013) [2024-06-15 17:04:27,467][1653645] Updated weights for policy 0, policy_version 430589 (0.0017) [2024-06-15 17:04:29,941][1653645] Updated weights for policy 0, policy_version 430672 (0.0012) [2024-06-15 17:04:30,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 882081792. Throughput: 0: 10683.7. Samples: 220577280. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:04:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:04:32,308][1653645] Updated weights for policy 0, policy_version 430780 (0.0016) [2024-06-15 17:04:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 882245632. Throughput: 0: 10342.4. Samples: 220604416. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:04:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:04:39,598][1653645] Updated weights for policy 0, policy_version 430832 (0.0013) [2024-06-15 17:04:40,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 42052.2, 300 sec: 43986.8). Total num frames: 882409472. Throughput: 0: 10854.4. Samples: 220683776. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:04:40,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:04:41,064][1653645] Updated weights for policy 0, policy_version 430880 (0.0013) [2024-06-15 17:04:42,653][1653645] Updated weights for policy 0, policy_version 430947 (0.0012) [2024-06-15 17:04:44,279][1653645] Updated weights for policy 0, policy_version 431013 (0.0025) [2024-06-15 17:04:45,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 882769920. Throughput: 0: 10478.9. Samples: 220736000. 
Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:04:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:04:50,716][1653645] Updated weights for policy 0, policy_version 431056 (0.0013) [2024-06-15 17:04:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42598.2, 300 sec: 44097.9). Total num frames: 882802688. Throughput: 0: 10638.2. Samples: 220775936. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:04:50,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:04:53,182][1653645] Updated weights for policy 0, policy_version 431141 (0.0118) [2024-06-15 17:04:55,027][1653645] Updated weights for policy 0, policy_version 431216 (0.0046) [2024-06-15 17:04:55,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 44097.9). Total num frames: 883195904. Throughput: 0: 10990.9. Samples: 220841984. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:04:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:04:56,268][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000431264_883228672.pth... [2024-06-15 17:04:56,369][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000426016_872480768.pth [2024-06-15 17:05:00,958][1648982] Fps is (10 sec: 49153.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 883294208. Throughput: 0: 10717.9. Samples: 220903936. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:00,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:05:02,736][1651596] Signal inference workers to stop experience collection... (22350 times) [2024-06-15 17:05:02,773][1653645] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-06-15 17:05:02,783][1653645] Updated weights for policy 0, policy_version 431299 (0.0013) [2024-06-15 17:05:03,005][1651596] Signal inference workers to resume experience collection... 
(22350 times) [2024-06-15 17:05:03,006][1653645] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-06-15 17:05:04,031][1653645] Updated weights for policy 0, policy_version 431360 (0.0012) [2024-06-15 17:05:05,964][1648982] Fps is (10 sec: 29491.7, 60 sec: 41506.2, 300 sec: 43875.8). Total num frames: 883490816. Throughput: 0: 10877.2. Samples: 220937728. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:05,964][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:05:06,627][1653645] Updated weights for policy 0, policy_version 431424 (0.0010) [2024-06-15 17:05:08,284][1653645] Updated weights for policy 0, policy_version 431496 (0.0011) [2024-06-15 17:05:10,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 883818496. Throughput: 0: 10592.7. Samples: 220998144. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 17:05:15,072][1653645] Updated weights for policy 0, policy_version 431568 (0.0189) [2024-06-15 17:05:15,958][1648982] Fps is (10 sec: 42596.7, 60 sec: 43144.4, 300 sec: 44097.9). Total num frames: 883916800. Throughput: 0: 11059.2. Samples: 221074944. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:15,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:05:17,160][1653645] Updated weights for policy 0, policy_version 431632 (0.0254) [2024-06-15 17:05:19,491][1653645] Updated weights for policy 0, policy_version 431732 (0.0232) [2024-06-15 17:05:20,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 884310016. Throughput: 0: 10968.2. Samples: 221097984. 
Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:05:21,128][1653645] Updated weights for policy 0, policy_version 431800 (0.0112) [2024-06-15 17:05:25,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 44320.3). Total num frames: 884342784. Throughput: 0: 10626.8. Samples: 221161984. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:25,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:05:27,446][1653645] Updated weights for policy 0, policy_version 431840 (0.0013) [2024-06-15 17:05:29,006][1653645] Updated weights for policy 0, policy_version 431877 (0.0012) [2024-06-15 17:05:30,002][1653645] Updated weights for policy 0, policy_version 431926 (0.0013) [2024-06-15 17:05:30,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43144.7, 300 sec: 44097.9). Total num frames: 884670464. Throughput: 0: 11104.7. Samples: 221235712. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:05:31,671][1653645] Updated weights for policy 0, policy_version 432000 (0.0012) [2024-06-15 17:05:35,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 884867072. Throughput: 0: 10763.4. Samples: 221260288. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:05:39,073][1653645] Updated weights for policy 0, policy_version 432067 (0.0014) [2024-06-15 17:05:40,183][1653645] Updated weights for policy 0, policy_version 432122 (0.0010) [2024-06-15 17:05:40,967][1648982] Fps is (10 sec: 36012.7, 60 sec: 43684.3, 300 sec: 44096.6). Total num frames: 885030912. Throughput: 0: 10931.9. Samples: 221334016. 
Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:40,968][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:05:42,002][1653645] Updated weights for policy 0, policy_version 432192 (0.0012) [2024-06-15 17:05:42,459][1651596] Signal inference workers to stop experience collection... (22400 times) [2024-06-15 17:05:42,523][1653645] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-06-15 17:05:42,682][1651596] Signal inference workers to resume experience collection... (22400 times) [2024-06-15 17:05:42,683][1653645] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-06-15 17:05:43,298][1653645] Updated weights for policy 0, policy_version 432256 (0.0012) [2024-06-15 17:05:44,736][1653645] Updated weights for policy 0, policy_version 432310 (0.0011) [2024-06-15 17:05:45,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 885391360. Throughput: 0: 10911.3. Samples: 221394944. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:05:50,958][1648982] Fps is (10 sec: 36077.0, 60 sec: 43144.7, 300 sec: 43986.9). Total num frames: 885391360. Throughput: 0: 10990.9. Samples: 221432320. Policy #0 lag: (min: 31.0, avg: 128.6, max: 288.0) [2024-06-15 17:05:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:05:52,324][1653645] Updated weights for policy 0, policy_version 432354 (0.0013) [2024-06-15 17:05:53,400][1653645] Updated weights for policy 0, policy_version 432416 (0.0013) [2024-06-15 17:05:55,143][1653645] Updated weights for policy 0, policy_version 432482 (0.0024) [2024-06-15 17:05:55,958][1648982] Fps is (10 sec: 39319.1, 60 sec: 43144.1, 300 sec: 44097.8). Total num frames: 885784576. Throughput: 0: 11161.4. Samples: 221500416. 
Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:05:55,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 17:05:56,965][1653645] Updated weights for policy 0, policy_version 432575 (0.0012) [2024-06-15 17:06:00,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 885915648. Throughput: 0: 10934.1. Samples: 221566976. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:06:04,027][1653645] Updated weights for policy 0, policy_version 432638 (0.0170) [2024-06-15 17:06:05,503][1653645] Updated weights for policy 0, policy_version 432691 (0.0013) [2024-06-15 17:06:05,958][1648982] Fps is (10 sec: 39324.3, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 886177792. Throughput: 0: 11218.5. Samples: 221602816. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:06:06,678][1653645] Updated weights for policy 0, policy_version 432737 (0.0011) [2024-06-15 17:06:08,050][1653645] Updated weights for policy 0, policy_version 432803 (0.0014) [2024-06-15 17:06:10,959][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 886439936. Throughput: 0: 11150.3. Samples: 221663744. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:10,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:06:14,419][1653645] Updated weights for policy 0, policy_version 432835 (0.0013) [2024-06-15 17:06:15,470][1653645] Updated weights for policy 0, policy_version 432886 (0.0143) [2024-06-15 17:06:15,959][1648982] Fps is (10 sec: 39317.7, 60 sec: 44236.2, 300 sec: 44431.0). Total num frames: 886571008. Throughput: 0: 11081.7. Samples: 221734400. 
Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:15,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:06:16,402][1653645] Updated weights for policy 0, policy_version 432915 (0.0011) [2024-06-15 17:06:18,202][1653645] Updated weights for policy 0, policy_version 432995 (0.0105) [2024-06-15 17:06:19,519][1651596] Signal inference workers to stop experience collection... (22450 times) [2024-06-15 17:06:19,564][1653645] Updated weights for policy 0, policy_version 433059 (0.0107) [2024-06-15 17:06:19,589][1653645] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-06-15 17:06:19,803][1651596] Signal inference workers to resume experience collection... (22450 times) [2024-06-15 17:06:19,803][1653645] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-06-15 17:06:20,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 886964224. Throughput: 0: 11172.9. Samples: 221763072. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:06:25,976][1648982] Fps is (10 sec: 39253.0, 60 sec: 43677.3, 300 sec: 44428.5). Total num frames: 886964224. Throughput: 0: 11068.2. Samples: 221832192. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:25,977][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:06:26,094][1653645] Updated weights for policy 0, policy_version 433090 (0.0012) [2024-06-15 17:06:27,601][1653645] Updated weights for policy 0, policy_version 433147 (0.0012) [2024-06-15 17:06:29,137][1653645] Updated weights for policy 0, policy_version 433204 (0.0106) [2024-06-15 17:06:30,468][1653645] Updated weights for policy 0, policy_version 433264 (0.0011) [2024-06-15 17:06:30,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 887357440. Throughput: 0: 11161.6. Samples: 221897216. 
Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:06:32,278][1653645] Updated weights for policy 0, policy_version 433339 (0.0076) [2024-06-15 17:06:35,958][1648982] Fps is (10 sec: 52526.0, 60 sec: 43690.6, 300 sec: 44431.4). Total num frames: 887488512. Throughput: 0: 10934.1. Samples: 221924352. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:06:39,555][1653645] Updated weights for policy 0, policy_version 433392 (0.0016) [2024-06-15 17:06:40,723][1653645] Updated weights for policy 0, policy_version 433444 (0.0012) [2024-06-15 17:06:40,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44789.6, 300 sec: 44431.2). Total num frames: 887717888. Throughput: 0: 11082.1. Samples: 221999104. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:06:42,748][1653645] Updated weights for policy 0, policy_version 433520 (0.0114) [2024-06-15 17:06:44,421][1653645] Updated weights for policy 0, policy_version 433594 (0.0011) [2024-06-15 17:06:45,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 888012800. Throughput: 0: 10843.0. Samples: 222054912. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:45,959][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 17:06:50,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 44236.9, 300 sec: 44320.2). Total num frames: 888045568. Throughput: 0: 10877.2. Samples: 222092288. 
Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:06:51,563][1653645] Updated weights for policy 0, policy_version 433659 (0.0074) [2024-06-15 17:06:53,679][1653645] Updated weights for policy 0, policy_version 433733 (0.0014) [2024-06-15 17:06:55,647][1653645] Updated weights for policy 0, policy_version 433811 (0.0014) [2024-06-15 17:06:55,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.3, 300 sec: 44320.1). Total num frames: 888471552. Throughput: 0: 10899.9. Samples: 222154240. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:06:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:06:56,255][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000433840_888504320.pth... [2024-06-15 17:06:56,311][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000428672_877920256.pth [2024-06-15 17:06:56,580][1653645] Updated weights for policy 0, policy_version 433853 (0.0023) [2024-06-15 17:07:00,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 888537088. Throughput: 0: 10820.5. Samples: 222221312. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:07:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:07:04,004][1653645] Updated weights for policy 0, policy_version 433918 (0.0013) [2024-06-15 17:07:05,958][1648982] Fps is (10 sec: 29491.6, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 888766464. Throughput: 0: 10956.8. Samples: 222256128. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:07:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:07:05,996][1651596] Signal inference workers to stop experience collection... 
(22500 times) [2024-06-15 17:07:06,024][1653645] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-06-15 17:07:06,267][1651596] Signal inference workers to resume experience collection... (22500 times) [2024-06-15 17:07:06,268][1653645] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-06-15 17:07:06,610][1653645] Updated weights for policy 0, policy_version 434000 (0.0012) [2024-06-15 17:07:07,906][1653645] Updated weights for policy 0, policy_version 434050 (0.0020) [2024-06-15 17:07:10,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 889061376. Throughput: 0: 10540.2. Samples: 222306304. Policy #0 lag: (min: 13.0, avg: 71.1, max: 269.0) [2024-06-15 17:07:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:07:14,988][1653645] Updated weights for policy 0, policy_version 434118 (0.0013) [2024-06-15 17:07:15,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42599.1, 300 sec: 44209.0). Total num frames: 889126912. Throughput: 0: 10888.5. Samples: 222387200. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:07:16,244][1653645] Updated weights for policy 0, policy_version 434172 (0.0012) [2024-06-15 17:07:18,158][1653645] Updated weights for policy 0, policy_version 434226 (0.0014) [2024-06-15 17:07:19,858][1653645] Updated weights for policy 0, policy_version 434304 (0.0013) [2024-06-15 17:07:20,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 44209.1). Total num frames: 889520128. Throughput: 0: 10843.0. Samples: 222412288. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:07:21,236][1653645] Updated weights for policy 0, policy_version 434366 (0.0011) [2024-06-15 17:07:25,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43704.1, 300 sec: 44320.1). Total num frames: 889585664. 
Throughput: 0: 10581.3. Samples: 222475264. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:07:29,244][1653645] Updated weights for policy 0, policy_version 434432 (0.0011) [2024-06-15 17:07:30,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 41506.1, 300 sec: 43875.8). Total num frames: 889847808. Throughput: 0: 10752.0. Samples: 222538752. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:07:31,112][1653645] Updated weights for policy 0, policy_version 434512 (0.0016) [2024-06-15 17:07:32,431][1653645] Updated weights for policy 0, policy_version 434576 (0.0116) [2024-06-15 17:07:35,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 890109952. Throughput: 0: 10444.8. Samples: 222562304. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:35,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 17:07:40,102][1653645] Updated weights for policy 0, policy_version 434627 (0.0013) [2024-06-15 17:07:40,958][1648982] Fps is (10 sec: 32766.9, 60 sec: 40959.8, 300 sec: 43431.4). Total num frames: 890175488. Throughput: 0: 10774.7. Samples: 222639104. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:40,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:07:41,874][1653645] Updated weights for policy 0, policy_version 434708 (0.0012) [2024-06-15 17:07:43,410][1653645] Updated weights for policy 0, policy_version 434784 (0.0023) [2024-06-15 17:07:44,907][1651596] Signal inference workers to stop experience collection... (22550 times) [2024-06-15 17:07:44,953][1653645] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-06-15 17:07:45,125][1651596] Signal inference workers to resume experience collection... 
(22550 times) [2024-06-15 17:07:45,127][1653645] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-06-15 17:07:45,278][1653645] Updated weights for policy 0, policy_version 434850 (0.0012) [2024-06-15 17:07:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 890634240. Throughput: 0: 10478.9. Samples: 222692864. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:07:50,958][1648982] Fps is (10 sec: 45877.0, 60 sec: 43144.5, 300 sec: 43209.4). Total num frames: 890634240. Throughput: 0: 10513.1. Samples: 222729216. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:07:52,856][1653645] Updated weights for policy 0, policy_version 434904 (0.0012) [2024-06-15 17:07:54,292][1653645] Updated weights for policy 0, policy_version 434963 (0.0143) [2024-06-15 17:07:55,525][1653645] Updated weights for policy 0, policy_version 435013 (0.0011) [2024-06-15 17:07:55,958][1648982] Fps is (10 sec: 29490.4, 60 sec: 40959.9, 300 sec: 43653.6). Total num frames: 890929152. Throughput: 0: 10831.6. Samples: 222793728. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:07:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:07:57,881][1653645] Updated weights for policy 0, policy_version 435121 (0.0094) [2024-06-15 17:08:00,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 891158528. Throughput: 0: 10433.4. Samples: 222856704. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:00,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 17:08:05,485][1653645] Updated weights for policy 0, policy_version 435154 (0.0012) [2024-06-15 17:08:05,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 40959.8, 300 sec: 42876.1). Total num frames: 891224064. 
Throughput: 0: 10660.9. Samples: 222892032. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:05,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:08:07,652][1653645] Updated weights for policy 0, policy_version 435250 (0.0078) [2024-06-15 17:08:09,681][1653645] Updated weights for policy 0, policy_version 435329 (0.0166) [2024-06-15 17:08:10,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43144.5, 300 sec: 43764.7). Total num frames: 891650048. Throughput: 0: 10501.7. Samples: 222947840. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:08:15,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 891682816. Throughput: 0: 10490.3. Samples: 223010816. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:08:18,048][1653645] Updated weights for policy 0, policy_version 435395 (0.0067) [2024-06-15 17:08:19,976][1653645] Updated weights for policy 0, policy_version 435474 (0.0111) [2024-06-15 17:08:20,958][1648982] Fps is (10 sec: 26214.2, 60 sec: 39867.8, 300 sec: 43431.5). Total num frames: 891912192. Throughput: 0: 10843.0. Samples: 223050240. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:08:21,525][1653645] Updated weights for policy 0, policy_version 435536 (0.0036) [2024-06-15 17:08:24,109][1653645] Updated weights for policy 0, policy_version 435632 (0.0015) [2024-06-15 17:08:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43098.3). Total num frames: 892207104. Throughput: 0: 9978.4. Samples: 223088128. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:08:30,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 39321.6, 300 sec: 42653.9). 
Total num frames: 892207104. Throughput: 0: 10570.0. Samples: 223168512. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:08:32,105][1651596] Signal inference workers to stop experience collection... (22600 times) [2024-06-15 17:08:32,179][1653645] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-06-15 17:08:32,182][1653645] Updated weights for policy 0, policy_version 435698 (0.0015) [2024-06-15 17:08:32,490][1651596] Signal inference workers to resume experience collection... (22600 times) [2024-06-15 17:08:32,491][1653645] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-06-15 17:08:34,210][1653645] Updated weights for policy 0, policy_version 435776 (0.0015) [2024-06-15 17:08:35,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 43098.3). Total num frames: 892600320. Throughput: 0: 10331.0. Samples: 223194112. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:08:36,233][1653645] Updated weights for policy 0, policy_version 435856 (0.0074) [2024-06-15 17:08:40,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 892731392. Throughput: 0: 10262.8. Samples: 223255552. Policy #0 lag: (min: 13.0, avg: 71.8, max: 269.0) [2024-06-15 17:08:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:08:44,488][1653645] Updated weights for policy 0, policy_version 435952 (0.0015) [2024-06-15 17:08:45,778][1653645] Updated weights for policy 0, policy_version 436000 (0.0012) [2024-06-15 17:08:45,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 38229.4, 300 sec: 42987.2). Total num frames: 892928000. Throughput: 0: 10308.3. Samples: 223320576. 
Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:08:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:08:48,024][1653645] Updated weights for policy 0, policy_version 436096 (0.0013) [2024-06-15 17:08:49,048][1653645] Updated weights for policy 0, policy_version 436152 (0.0039) [2024-06-15 17:08:50,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 893255680. Throughput: 0: 10012.5. Samples: 223342592. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:08:50,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:08:55,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 39867.9, 300 sec: 42876.1). Total num frames: 893321216. Throughput: 0: 10547.2. Samples: 223422464. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:08:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:08:56,306][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000436208_893353984.pth... [2024-06-15 17:08:56,323][1653645] Updated weights for policy 0, policy_version 436208 (0.0013) [2024-06-15 17:08:56,475][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000431264_883228672.pth [2024-06-15 17:08:59,020][1653645] Updated weights for policy 0, policy_version 436320 (0.0104) [2024-06-15 17:09:00,530][1653645] Updated weights for policy 0, policy_version 436388 (0.0013) [2024-06-15 17:09:00,959][1648982] Fps is (10 sec: 49145.0, 60 sec: 43143.4, 300 sec: 43209.1). Total num frames: 893747200. Throughput: 0: 10216.9. Samples: 223470592. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:00,960][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 17:09:05,970][1648982] Fps is (10 sec: 45875.3, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 893779968. Throughput: 0: 10160.4. Samples: 223507456. 
Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:05,971][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:09:08,325][1653645] Updated weights for policy 0, policy_version 436473 (0.0012) [2024-06-15 17:09:10,084][1653645] Updated weights for policy 0, policy_version 436514 (0.0012) [2024-06-15 17:09:10,958][1648982] Fps is (10 sec: 29495.7, 60 sec: 39867.7, 300 sec: 43098.3). Total num frames: 894042112. Throughput: 0: 10991.0. Samples: 223582720. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:09:11,140][1651596] Signal inference workers to stop experience collection... (22650 times) [2024-06-15 17:09:11,183][1653645] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-06-15 17:09:11,360][1651596] Signal inference workers to resume experience collection... (22650 times) [2024-06-15 17:09:11,362][1653645] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-06-15 17:09:11,602][1653645] Updated weights for policy 0, policy_version 436579 (0.0015) [2024-06-15 17:09:13,063][1653645] Updated weights for policy 0, policy_version 436643 (0.0094) [2024-06-15 17:09:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 894304256. Throughput: 0: 10581.3. Samples: 223644672. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:09:18,163][1653645] Updated weights for policy 0, policy_version 436676 (0.0017) [2024-06-15 17:09:19,193][1653645] Updated weights for policy 0, policy_version 436735 (0.0016) [2024-06-15 17:09:20,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 43144.3, 300 sec: 43320.4). Total num frames: 894500864. Throughput: 0: 10865.7. Samples: 223683072. 
Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:09:21,187][1653645] Updated weights for policy 0, policy_version 436784 (0.0023) [2024-06-15 17:09:22,283][1653645] Updated weights for policy 0, policy_version 436832 (0.0015) [2024-06-15 17:09:24,099][1653645] Updated weights for policy 0, policy_version 436901 (0.0012) [2024-06-15 17:09:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.8, 300 sec: 43209.4). Total num frames: 894828544. Throughput: 0: 10865.8. Samples: 223744512. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:09:29,483][1653645] Updated weights for policy 0, policy_version 436948 (0.0087) [2024-06-15 17:09:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45875.0, 300 sec: 43098.2). Total num frames: 894959616. Throughput: 0: 11104.6. Samples: 223820288. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:30,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 17:09:31,770][1653645] Updated weights for policy 0, policy_version 437008 (0.0013) [2024-06-15 17:09:33,604][1653645] Updated weights for policy 0, policy_version 437093 (0.0022) [2024-06-15 17:09:35,341][1653645] Updated weights for policy 0, policy_version 437136 (0.0014) [2024-06-15 17:09:35,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.0, 300 sec: 43653.7). Total num frames: 895287296. Throughput: 0: 11343.7. Samples: 223853056. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:09:40,804][1653645] Updated weights for policy 0, policy_version 437187 (0.0013) [2024-06-15 17:09:40,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 43690.8, 300 sec: 42653.9). Total num frames: 895352832. Throughput: 0: 11116.1. Samples: 223922688. 
Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:09:43,483][1653645] Updated weights for policy 0, policy_version 437250 (0.0017) [2024-06-15 17:09:44,624][1653645] Updated weights for policy 0, policy_version 437305 (0.0012) [2024-06-15 17:09:45,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 45875.1, 300 sec: 43653.7). Total num frames: 895680512. Throughput: 0: 11366.8. Samples: 223982080. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:09:46,392][1653645] Updated weights for policy 0, policy_version 437371 (0.0077) [2024-06-15 17:09:48,532][1653645] Updated weights for policy 0, policy_version 437430 (0.0014) [2024-06-15 17:09:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 895877120. Throughput: 0: 11195.7. Samples: 224011264. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:09:53,863][1653645] Updated weights for policy 0, policy_version 437504 (0.0013) [2024-06-15 17:09:55,958][1648982] Fps is (10 sec: 32767.1, 60 sec: 44782.7, 300 sec: 43098.2). Total num frames: 896008192. Throughput: 0: 11081.9. Samples: 224081408. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:09:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:09:56,235][1651596] Signal inference workers to stop experience collection... (22700 times) [2024-06-15 17:09:56,273][1653645] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-06-15 17:09:56,467][1651596] Signal inference workers to resume experience collection... 
(22700 times) [2024-06-15 17:09:56,468][1653645] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-06-15 17:09:57,175][1653645] Updated weights for policy 0, policy_version 437563 (0.0014) [2024-06-15 17:09:58,449][1653645] Updated weights for policy 0, policy_version 437603 (0.0012) [2024-06-15 17:09:59,747][1653645] Updated weights for policy 0, policy_version 437651 (0.0012) [2024-06-15 17:10:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44237.9, 300 sec: 43764.7). Total num frames: 896401408. Throughput: 0: 11036.5. Samples: 224141312. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:10:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:10:05,832][1653645] Updated weights for policy 0, policy_version 437749 (0.0013) [2024-06-15 17:10:05,958][1648982] Fps is (10 sec: 49153.7, 60 sec: 45329.1, 300 sec: 42987.2). Total num frames: 896499712. Throughput: 0: 11013.8. Samples: 224178688. Policy #0 lag: (min: 15.0, avg: 74.0, max: 271.0) [2024-06-15 17:10:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:10:09,004][1653645] Updated weights for policy 0, policy_version 437808 (0.0013) [2024-06-15 17:10:10,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 45328.9, 300 sec: 43542.6). Total num frames: 896761856. Throughput: 0: 11138.8. Samples: 224245760. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:10,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:10:10,983][1653645] Updated weights for policy 0, policy_version 437886 (0.0013) [2024-06-15 17:10:12,705][1653645] Updated weights for policy 0, policy_version 437937 (0.0012) [2024-06-15 17:10:15,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 896925696. Throughput: 0: 10854.5. Samples: 224308736. 
Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:15,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:10:17,816][1653645] Updated weights for policy 0, policy_version 437989 (0.0013) [2024-06-15 17:10:20,710][1653645] Updated weights for policy 0, policy_version 438048 (0.0014) [2024-06-15 17:10:20,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.8, 300 sec: 43320.4). Total num frames: 897122304. Throughput: 0: 10854.4. Samples: 224341504. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:10:22,069][1653645] Updated weights for policy 0, policy_version 438096 (0.0013) [2024-06-15 17:10:24,270][1653645] Updated weights for policy 0, policy_version 438176 (0.0029) [2024-06-15 17:10:25,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 897449984. Throughput: 0: 10683.7. Samples: 224403456. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:10:29,060][1653645] Updated weights for policy 0, policy_version 438240 (0.0013) [2024-06-15 17:10:30,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.9, 300 sec: 43098.2). Total num frames: 897581056. Throughput: 0: 10945.4. Samples: 224474624. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:10:32,394][1653645] Updated weights for policy 0, policy_version 438304 (0.0013) [2024-06-15 17:10:34,161][1653645] Updated weights for policy 0, policy_version 438344 (0.0021) [2024-06-15 17:10:35,650][1653645] Updated weights for policy 0, policy_version 438398 (0.0120) [2024-06-15 17:10:35,962][1648982] Fps is (10 sec: 39304.4, 60 sec: 42595.2, 300 sec: 43432.1). Total num frames: 897843200. Throughput: 0: 10989.8. Samples: 224505856. 
Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:35,963][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 17:10:37,036][1653645] Updated weights for policy 0, policy_version 438464 (0.0012) [2024-06-15 17:10:40,598][1651596] Signal inference workers to stop experience collection... (22750 times) [2024-06-15 17:10:40,671][1653645] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-06-15 17:10:40,826][1651596] Signal inference workers to resume experience collection... (22750 times) [2024-06-15 17:10:40,826][1653645] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-06-15 17:10:40,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45329.1, 300 sec: 42987.2). Total num frames: 898072576. Throughput: 0: 10968.3. Samples: 224574976. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:10:40,991][1653645] Updated weights for policy 0, policy_version 438522 (0.0021) [2024-06-15 17:10:44,284][1653645] Updated weights for policy 0, policy_version 438588 (0.0013) [2024-06-15 17:10:45,958][1648982] Fps is (10 sec: 42616.4, 60 sec: 43144.4, 300 sec: 43653.6). Total num frames: 898269184. Throughput: 0: 11161.5. Samples: 224643584. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:45,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:10:46,394][1653645] Updated weights for policy 0, policy_version 438628 (0.0060) [2024-06-15 17:10:48,120][1653645] Updated weights for policy 0, policy_version 438696 (0.0015) [2024-06-15 17:10:50,969][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.6, 300 sec: 43098.3). Total num frames: 898498560. Throughput: 0: 10979.5. Samples: 224672768. 
Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:50,984][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:10:52,081][1653645] Updated weights for policy 0, policy_version 438752 (0.0014) [2024-06-15 17:10:55,430][1653645] Updated weights for policy 0, policy_version 438800 (0.0012) [2024-06-15 17:10:55,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44783.1, 300 sec: 43320.4). Total num frames: 898695168. Throughput: 0: 11127.5. Samples: 224746496. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:10:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:10:56,305][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000438832_898727936.pth... [2024-06-15 17:10:56,347][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000433840_888504320.pth [2024-06-15 17:10:56,352][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000438832_898727936.pth [2024-06-15 17:10:57,120][1653645] Updated weights for policy 0, policy_version 438850 (0.0014) [2024-06-15 17:10:58,441][1653645] Updated weights for policy 0, policy_version 438912 (0.0012) [2024-06-15 17:11:00,169][1653645] Updated weights for policy 0, policy_version 438970 (0.0077) [2024-06-15 17:11:00,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.4, 300 sec: 43542.5). Total num frames: 899022848. Throughput: 0: 11059.1. Samples: 224806400. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:11:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:11:03,992][1653645] Updated weights for policy 0, policy_version 439024 (0.0013) [2024-06-15 17:11:05,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 44236.8, 300 sec: 43098.3). Total num frames: 899153920. Throughput: 0: 11229.9. Samples: 224846848. 
Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:11:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:11:08,051][1653645] Updated weights for policy 0, policy_version 439088 (0.0014) [2024-06-15 17:11:09,421][1653645] Updated weights for policy 0, policy_version 439152 (0.0016) [2024-06-15 17:11:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44782.8, 300 sec: 43653.7). Total num frames: 899448832. Throughput: 0: 11241.2. Samples: 224909312. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:11:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:11:11,202][1653645] Updated weights for policy 0, policy_version 439202 (0.0012) [2024-06-15 17:11:14,855][1653645] Updated weights for policy 0, policy_version 439251 (0.0013) [2024-06-15 17:11:15,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.2, 300 sec: 43098.3). Total num frames: 899678208. Throughput: 0: 11264.0. Samples: 224981504. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:11:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:11:19,765][1653645] Updated weights for policy 0, policy_version 439328 (0.0013) [2024-06-15 17:11:20,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 45329.1, 300 sec: 43656.4). Total num frames: 899842048. Throughput: 0: 11390.3. Samples: 225018368. Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:11:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:11:22,047][1653645] Updated weights for policy 0, policy_version 439424 (0.0013) [2024-06-15 17:11:23,268][1653645] Updated weights for policy 0, policy_version 439475 (0.0011) [2024-06-15 17:11:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 900071424. Throughput: 0: 11218.5. Samples: 225079808. 
Policy #0 lag: (min: 15.0, avg: 102.7, max: 271.0) [2024-06-15 17:11:25,960][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:11:26,548][1651596] Signal inference workers to stop experience collection... (22800 times) [2024-06-15 17:11:26,634][1653645] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-06-15 17:11:26,639][1653645] Updated weights for policy 0, policy_version 439507 (0.0029) [2024-06-15 17:11:26,837][1651596] Signal inference workers to resume experience collection... (22800 times) [2024-06-15 17:11:26,838][1653645] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-06-15 17:11:27,585][1653645] Updated weights for policy 0, policy_version 439552 (0.0015) [2024-06-15 17:11:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 43209.3). Total num frames: 900235264. Throughput: 0: 11389.2. Samples: 225156096. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:11:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:11:32,316][1653645] Updated weights for policy 0, policy_version 439616 (0.0013) [2024-06-15 17:11:35,158][1653645] Updated weights for policy 0, policy_version 439713 (0.0144) [2024-06-15 17:11:35,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45878.7, 300 sec: 43653.7). Total num frames: 900595712. Throughput: 0: 11252.7. Samples: 225179136. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:11:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:11:38,795][1653645] Updated weights for policy 0, policy_version 439745 (0.0011) [2024-06-15 17:11:39,815][1653645] Updated weights for policy 0, policy_version 439807 (0.0108) [2024-06-15 17:11:40,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 44236.6, 300 sec: 43098.2). Total num frames: 900726784. Throughput: 0: 11036.4. Samples: 225243136. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:11:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:11:44,607][1653645] Updated weights for policy 0, policy_version 439872 (0.0154) [2024-06-15 17:11:45,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 900956160. Throughput: 0: 11195.8. Samples: 225310208. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:11:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:11:46,508][1653645] Updated weights for policy 0, policy_version 439940 (0.0037) [2024-06-15 17:11:47,927][1653645] Updated weights for policy 0, policy_version 439998 (0.0028) [2024-06-15 17:11:50,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 43690.8, 300 sec: 42876.1). Total num frames: 901120000. Throughput: 0: 10877.2. Samples: 225336320. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:11:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:11:52,227][1653645] Updated weights for policy 0, policy_version 440054 (0.0018) [2024-06-15 17:11:55,555][1653645] Updated weights for policy 0, policy_version 440080 (0.0046) [2024-06-15 17:11:55,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43144.5, 300 sec: 43209.3). Total num frames: 901283840. Throughput: 0: 11059.2. Samples: 225406976. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:11:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:11:57,308][1653645] Updated weights for policy 0, policy_version 440144 (0.0018) [2024-06-15 17:11:59,096][1653645] Updated weights for policy 0, policy_version 440208 (0.0012) [2024-06-15 17:12:00,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.9, 300 sec: 43653.6). Total num frames: 901644288. Throughput: 0: 10535.8. Samples: 225455616. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:12:04,968][1653645] Updated weights for policy 0, policy_version 440293 (0.0014) [2024-06-15 17:12:05,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 901775360. Throughput: 0: 10581.3. Samples: 225494528. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:12:08,353][1653645] Updated weights for policy 0, policy_version 440352 (0.0014) [2024-06-15 17:12:10,397][1651596] Signal inference workers to stop experience collection... (22850 times) [2024-06-15 17:12:10,454][1653645] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-06-15 17:12:10,719][1651596] Signal inference workers to resume experience collection... (22850 times) [2024-06-15 17:12:10,720][1653645] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-06-15 17:12:10,723][1653645] Updated weights for policy 0, policy_version 440432 (0.0100) [2024-06-15 17:12:10,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42598.6, 300 sec: 43653.6). Total num frames: 902004736. Throughput: 0: 10706.5. Samples: 225561600. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:12:11,994][1653645] Updated weights for policy 0, policy_version 440484 (0.0014) [2024-06-15 17:12:15,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 42876.1). Total num frames: 902168576. Throughput: 0: 10342.4. Samples: 225621504. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:12:17,120][1653645] Updated weights for policy 0, policy_version 440544 (0.0020) [2024-06-15 17:12:20,195][1653645] Updated weights for policy 0, policy_version 440580 (0.0022) [2024-06-15 17:12:20,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42052.3, 300 sec: 43320.4). Total num frames: 902365184. Throughput: 0: 10638.2. Samples: 225657856. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:12:21,813][1653645] Updated weights for policy 0, policy_version 440643 (0.0013) [2024-06-15 17:12:23,606][1653645] Updated weights for policy 0, policy_version 440723 (0.0014) [2024-06-15 17:12:24,389][1653645] Updated weights for policy 0, policy_version 440768 (0.0027) [2024-06-15 17:12:25,967][1648982] Fps is (10 sec: 52378.2, 60 sec: 43683.7, 300 sec: 43541.1). Total num frames: 902692864. Throughput: 0: 10567.8. Samples: 225718784. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:25,968][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:12:29,796][1653645] Updated weights for policy 0, policy_version 440826 (0.0014) [2024-06-15 17:12:30,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 902823936. Throughput: 0: 10604.1. Samples: 225787392. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:30,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:12:34,199][1653645] Updated weights for policy 0, policy_version 440928 (0.0012) [2024-06-15 17:12:35,751][1653645] Updated weights for policy 0, policy_version 440997 (0.0013) [2024-06-15 17:12:35,958][1648982] Fps is (10 sec: 49199.4, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 903184384. Throughput: 0: 10786.1. Samples: 225821696. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:12:40,516][1653645] Updated weights for policy 0, policy_version 441025 (0.0011) [2024-06-15 17:12:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 903249920. Throughput: 0: 10581.4. Samples: 225883136. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:12:41,923][1653645] Updated weights for policy 0, policy_version 441086 (0.0011) [2024-06-15 17:12:45,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 42052.4, 300 sec: 43542.6). Total num frames: 903479296. Throughput: 0: 11082.0. Samples: 225954304. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:12:46,029][1653645] Updated weights for policy 0, policy_version 441156 (0.0011) [2024-06-15 17:12:47,823][1653645] Updated weights for policy 0, policy_version 441232 (0.0138) [2024-06-15 17:12:50,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 903741440. Throughput: 0: 10672.4. Samples: 225974784. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 17:12:50,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:12:53,832][1653645] Updated weights for policy 0, policy_version 441313 (0.0024) [2024-06-15 17:12:55,965][1648982] Fps is (10 sec: 39293.0, 60 sec: 43139.4, 300 sec: 43097.2). Total num frames: 903872512. Throughput: 0: 10727.5. Samples: 226044416. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:12:55,966][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:12:55,983][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000441344_903872512.pth... 
[2024-06-15 17:12:56,060][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000436208_893353984.pth [2024-06-15 17:12:56,843][1651596] Signal inference workers to stop experience collection... (22900 times) [2024-06-15 17:12:56,901][1653645] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-06-15 17:12:57,252][1651596] Signal inference workers to resume experience collection... (22900 times) [2024-06-15 17:12:57,253][1653645] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-06-15 17:12:58,031][1653645] Updated weights for policy 0, policy_version 441392 (0.0151) [2024-06-15 17:12:59,739][1653645] Updated weights for policy 0, policy_version 441478 (0.0041) [2024-06-15 17:13:00,923][1653645] Updated weights for policy 0, policy_version 441534 (0.0013) [2024-06-15 17:13:00,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 904265728. Throughput: 0: 10729.3. Samples: 226104320. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:00,962][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:13:05,958][1648982] Fps is (10 sec: 45908.9, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 904331264. Throughput: 0: 10763.4. Samples: 226142208. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:13:06,343][1653645] Updated weights for policy 0, policy_version 441599 (0.0107) [2024-06-15 17:13:09,345][1653645] Updated weights for policy 0, policy_version 441656 (0.0013) [2024-06-15 17:13:10,583][1653645] Updated weights for policy 0, policy_version 441696 (0.0015) [2024-06-15 17:13:10,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 904626176. Throughput: 0: 10925.0. Samples: 226210304. 
Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:13:13,048][1653645] Updated weights for policy 0, policy_version 441789 (0.0014) [2024-06-15 17:13:15,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 904790016. Throughput: 0: 10752.0. Samples: 226271232. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:13:17,969][1653645] Updated weights for policy 0, policy_version 441824 (0.0017) [2024-06-15 17:13:20,898][1653645] Updated weights for policy 0, policy_version 441904 (0.0126) [2024-06-15 17:13:20,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.7, 300 sec: 43431.5). Total num frames: 905019392. Throughput: 0: 10854.4. Samples: 226310144. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:13:22,636][1653645] Updated weights for policy 0, policy_version 441959 (0.0013) [2024-06-15 17:13:24,388][1653645] Updated weights for policy 0, policy_version 442032 (0.0013) [2024-06-15 17:13:25,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43697.5, 300 sec: 44431.2). Total num frames: 905314304. Throughput: 0: 10786.1. Samples: 226368512. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:25,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:13:28,352][1653645] Updated weights for policy 0, policy_version 442066 (0.0011) [2024-06-15 17:13:30,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 905445376. Throughput: 0: 11002.3. Samples: 226449408. 
Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:30,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:13:32,484][1653645] Updated weights for policy 0, policy_version 442160 (0.0014) [2024-06-15 17:13:33,771][1653645] Updated weights for policy 0, policy_version 442208 (0.0014) [2024-06-15 17:13:35,657][1651596] Signal inference workers to stop experience collection... (22950 times) [2024-06-15 17:13:35,746][1653645] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-06-15 17:13:35,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43144.3, 300 sec: 44209.0). Total num frames: 905773056. Throughput: 0: 11207.0. Samples: 226479104. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:35,961][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:13:35,997][1651596] Signal inference workers to resume experience collection... (22950 times) [2024-06-15 17:13:35,998][1653645] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-06-15 17:13:36,000][1653645] Updated weights for policy 0, policy_version 442288 (0.0138) [2024-06-15 17:13:40,723][1653645] Updated weights for policy 0, policy_version 442338 (0.0011) [2024-06-15 17:13:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 44783.0, 300 sec: 44097.9). Total num frames: 905936896. Throughput: 0: 11186.2. Samples: 226547712. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:13:45,090][1653645] Updated weights for policy 0, policy_version 442436 (0.0018) [2024-06-15 17:13:45,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 44782.8, 300 sec: 43764.7). Total num frames: 906166272. Throughput: 0: 11241.2. Samples: 226610176. 
Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:45,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:13:46,395][1653645] Updated weights for policy 0, policy_version 442483 (0.0014) [2024-06-15 17:13:48,136][1653645] Updated weights for policy 0, policy_version 442554 (0.0013) [2024-06-15 17:13:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 906362880. Throughput: 0: 10968.2. Samples: 226635776. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:13:52,421][1653645] Updated weights for policy 0, policy_version 442619 (0.0014) [2024-06-15 17:13:55,723][1653645] Updated weights for policy 0, policy_version 442658 (0.0025) [2024-06-15 17:13:55,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 44788.1, 300 sec: 43431.7). Total num frames: 906559488. Throughput: 0: 11400.4. Samples: 226723328. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:13:55,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:13:58,078][1653645] Updated weights for policy 0, policy_version 442738 (0.0013) [2024-06-15 17:13:59,788][1653645] Updated weights for policy 0, policy_version 442811 (0.0016) [2024-06-15 17:14:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 906887168. Throughput: 0: 11138.9. Samples: 226772480. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:14:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:14:03,890][1653645] Updated weights for policy 0, policy_version 442851 (0.0013) [2024-06-15 17:14:05,958][1648982] Fps is (10 sec: 45877.1, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 907018240. Throughput: 0: 11275.4. Samples: 226817536. 
Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:14:05,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:14:07,653][1653645] Updated weights for policy 0, policy_version 442912 (0.0041) [2024-06-15 17:14:09,821][1653645] Updated weights for policy 0, policy_version 442992 (0.0121) [2024-06-15 17:14:10,957][1648982] Fps is (10 sec: 42598.7, 60 sec: 44783.0, 300 sec: 44098.0). Total num frames: 907313152. Throughput: 0: 11241.3. Samples: 226874368. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:14:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:14:11,924][1653645] Updated weights for policy 0, policy_version 443069 (0.0013) [2024-06-15 17:14:15,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.4, 300 sec: 43764.7). Total num frames: 907411456. Throughput: 0: 11013.6. Samples: 226945024. Policy #0 lag: (min: 47.0, avg: 152.7, max: 303.0) [2024-06-15 17:14:15,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:14:17,745][1653645] Updated weights for policy 0, policy_version 443134 (0.0195) [2024-06-15 17:14:20,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 907640832. Throughput: 0: 10991.0. Samples: 226973696. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:14:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:14:21,427][1651596] Signal inference workers to stop experience collection... (23000 times) [2024-06-15 17:14:21,458][1653645] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-06-15 17:14:21,668][1651596] Signal inference workers to resume experience collection... 
(23000 times) [2024-06-15 17:14:21,669][1653645] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-06-15 17:14:21,674][1653645] Updated weights for policy 0, policy_version 443216 (0.0029) [2024-06-15 17:14:23,404][1653645] Updated weights for policy 0, policy_version 443280 (0.0016) [2024-06-15 17:14:25,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 907935744. Throughput: 0: 10581.3. Samples: 227023872. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:14:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:14:28,803][1653645] Updated weights for policy 0, policy_version 443331 (0.0093) [2024-06-15 17:14:30,368][1653645] Updated weights for policy 0, policy_version 443390 (0.0012) [2024-06-15 17:14:30,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 43690.5, 300 sec: 43320.4). Total num frames: 908066816. Throughput: 0: 10865.8. Samples: 227099136. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:14:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:14:32,971][1653645] Updated weights for policy 0, policy_version 443443 (0.0053) [2024-06-15 17:14:34,616][1653645] Updated weights for policy 0, policy_version 443520 (0.0120) [2024-06-15 17:14:35,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43691.0, 300 sec: 44209.1). Total num frames: 908394496. Throughput: 0: 11013.7. Samples: 227131392. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:14:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:14:36,272][1653645] Updated weights for policy 0, policy_version 443581 (0.0011) [2024-06-15 17:14:40,959][1648982] Fps is (10 sec: 42594.4, 60 sec: 42597.5, 300 sec: 43431.3). Total num frames: 908492800. Throughput: 0: 10626.7. Samples: 227201536. 
Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:14:40,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:14:41,874][1653645] Updated weights for policy 0, policy_version 443648 (0.0014) [2024-06-15 17:14:45,554][1653645] Updated weights for policy 0, policy_version 443717 (0.0015) [2024-06-15 17:14:45,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43144.6, 300 sec: 43653.6). Total num frames: 908754944. Throughput: 0: 10820.2. Samples: 227259392. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:14:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:14:46,618][1653645] Updated weights for policy 0, policy_version 443773 (0.0012) [2024-06-15 17:14:48,015][1653645] Updated weights for policy 0, policy_version 443840 (0.0012) [2024-06-15 17:14:50,958][1648982] Fps is (10 sec: 49157.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 908984320. Throughput: 0: 10535.8. Samples: 227291648. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:14:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:14:53,411][1653645] Updated weights for policy 0, policy_version 443904 (0.0014) [2024-06-15 17:14:55,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43144.7, 300 sec: 43209.3). Total num frames: 909148160. Throughput: 0: 10934.0. Samples: 227366400. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:14:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:14:56,381][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000443952_909213696.pth... 
[2024-06-15 17:14:56,500][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000438832_898727936.pth [2024-06-15 17:14:57,629][1653645] Updated weights for policy 0, policy_version 444000 (0.0018) [2024-06-15 17:14:59,213][1653645] Updated weights for policy 0, policy_version 444064 (0.0014) [2024-06-15 17:14:59,986][1653645] Updated weights for policy 0, policy_version 444094 (0.0023) [2024-06-15 17:15:00,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 909508608. Throughput: 0: 10763.4. Samples: 227429376. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:15:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:15:04,466][1653645] Updated weights for policy 0, policy_version 444131 (0.0041) [2024-06-15 17:15:05,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.6, 300 sec: 43653.7). Total num frames: 909639680. Throughput: 0: 11093.3. Samples: 227472896. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:15:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:15:07,344][1651596] Signal inference workers to stop experience collection... (23050 times) [2024-06-15 17:15:07,384][1653645] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-06-15 17:15:07,601][1651596] Signal inference workers to resume experience collection... (23050 times) [2024-06-15 17:15:07,602][1653645] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-06-15 17:15:08,050][1653645] Updated weights for policy 0, policy_version 444208 (0.0014) [2024-06-15 17:15:09,551][1653645] Updated weights for policy 0, policy_version 444276 (0.0013) [2024-06-15 17:15:10,437][1653645] Updated weights for policy 0, policy_version 444307 (0.0013) [2024-06-15 17:15:10,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 909967360. Throughput: 0: 11229.8. Samples: 227529216. 
Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:15:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:15:11,342][1653645] Updated weights for policy 0, policy_version 444345 (0.0086) [2024-06-15 17:15:15,783][1653645] Updated weights for policy 0, policy_version 444386 (0.0012) [2024-06-15 17:15:15,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 44782.9, 300 sec: 43986.8). Total num frames: 910098432. Throughput: 0: 11343.6. Samples: 227609600. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:15:15,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:15:18,596][1653645] Updated weights for policy 0, policy_version 444419 (0.0015) [2024-06-15 17:15:19,769][1653645] Updated weights for policy 0, policy_version 444484 (0.0014) [2024-06-15 17:15:20,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 46421.4, 300 sec: 43986.9). Total num frames: 910426112. Throughput: 0: 11343.6. Samples: 227641856. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:15:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:15:22,263][1653645] Updated weights for policy 0, policy_version 444580 (0.0239) [2024-06-15 17:15:25,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 910557184. Throughput: 0: 11389.4. Samples: 227714048. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:15:25,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 17:15:26,506][1653645] Updated weights for policy 0, policy_version 444624 (0.0014) [2024-06-15 17:15:29,528][1653645] Updated weights for policy 0, policy_version 444683 (0.0014) [2024-06-15 17:15:30,718][1653645] Updated weights for policy 0, policy_version 444738 (0.0012) [2024-06-15 17:15:30,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 46421.4, 300 sec: 44098.6). Total num frames: 910852096. Throughput: 0: 11616.7. Samples: 227782144. 
Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:15:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:15:31,478][1653645] Updated weights for policy 0, policy_version 444784 (0.0013) [2024-06-15 17:15:32,962][1653645] Updated weights for policy 0, policy_version 444833 (0.0013) [2024-06-15 17:15:35,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 911081472. Throughput: 0: 11639.5. Samples: 227815424. Policy #0 lag: (min: 45.0, avg: 132.6, max: 301.0) [2024-06-15 17:15:35,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:15:37,313][1653645] Updated weights for policy 0, policy_version 444896 (0.0013) [2024-06-15 17:15:40,520][1653645] Updated weights for policy 0, policy_version 444945 (0.0015) [2024-06-15 17:15:40,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 46968.4, 300 sec: 44209.1). Total num frames: 911310848. Throughput: 0: 11832.9. Samples: 227898880. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:15:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:15:41,965][1653645] Updated weights for policy 0, policy_version 445030 (0.0013) [2024-06-15 17:15:43,154][1653645] Updated weights for policy 0, policy_version 445089 (0.0015) [2024-06-15 17:15:45,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 47513.5, 300 sec: 44431.2). Total num frames: 911605760. Throughput: 0: 11980.8. Samples: 227968512. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:15:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:15:47,339][1651596] Signal inference workers to stop experience collection... (23100 times) [2024-06-15 17:15:47,390][1653645] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-06-15 17:15:47,666][1651596] Signal inference workers to resume experience collection... 
(23100 times) [2024-06-15 17:15:47,667][1653645] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-06-15 17:15:47,836][1653645] Updated weights for policy 0, policy_version 445138 (0.0013) [2024-06-15 17:15:48,691][1653645] Updated weights for policy 0, policy_version 445181 (0.0015) [2024-06-15 17:15:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 44209.1). Total num frames: 911736832. Throughput: 0: 11889.8. Samples: 228007936. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:15:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:15:51,856][1653645] Updated weights for policy 0, policy_version 445232 (0.0011) [2024-06-15 17:15:53,270][1653645] Updated weights for policy 0, policy_version 445297 (0.0014) [2024-06-15 17:15:54,681][1653645] Updated weights for policy 0, policy_version 445374 (0.0013) [2024-06-15 17:15:55,959][1648982] Fps is (10 sec: 52428.9, 60 sec: 49698.2, 300 sec: 44431.2). Total num frames: 912130048. Throughput: 0: 12071.8. Samples: 228072448. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:15:55,960][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 17:15:59,596][1653645] Updated weights for policy 0, policy_version 445433 (0.0012) [2024-06-15 17:16:00,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 912261120. Throughput: 0: 12037.8. Samples: 228151296. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:16:03,161][1653645] Updated weights for policy 0, policy_version 445520 (0.0016) [2024-06-15 17:16:04,992][1653645] Updated weights for policy 0, policy_version 445584 (0.0136) [2024-06-15 17:16:05,828][1653645] Updated weights for policy 0, policy_version 445625 (0.0014) [2024-06-15 17:16:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 50244.3, 300 sec: 44764.5). Total num frames: 912654336. 
Throughput: 0: 12014.9. Samples: 228182528. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:05,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:16:10,778][1653645] Updated weights for policy 0, policy_version 445691 (0.0014) [2024-06-15 17:16:10,963][1648982] Fps is (10 sec: 52403.9, 60 sec: 46963.8, 300 sec: 44430.5). Total num frames: 912785408. Throughput: 0: 12195.8. Samples: 228262912. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:10,963][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:16:13,814][1653645] Updated weights for policy 0, policy_version 445748 (0.0014) [2024-06-15 17:16:15,028][1653645] Updated weights for policy 0, policy_version 445808 (0.0015) [2024-06-15 17:16:15,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 49152.2, 300 sec: 44764.4). Total num frames: 913047552. Throughput: 0: 12015.0. Samples: 228322816. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:16:16,919][1653645] Updated weights for policy 0, policy_version 445881 (0.0011) [2024-06-15 17:16:20,958][1648982] Fps is (10 sec: 39339.4, 60 sec: 45875.0, 300 sec: 44431.1). Total num frames: 913178624. Throughput: 0: 12219.7. Samples: 228365312. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:16:21,402][1653645] Updated weights for policy 0, policy_version 445920 (0.0010) [2024-06-15 17:16:23,531][1653645] Updated weights for policy 0, policy_version 445956 (0.0030) [2024-06-15 17:16:24,815][1653645] Updated weights for policy 0, policy_version 446018 (0.0114) [2024-06-15 17:16:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 50244.5, 300 sec: 45208.7). Total num frames: 913571840. Throughput: 0: 11923.9. Samples: 228435456. 
Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:16:26,823][1651596] Signal inference workers to stop experience collection... (23150 times) [2024-06-15 17:16:26,884][1653645] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-06-15 17:16:27,102][1651596] Signal inference workers to resume experience collection... (23150 times) [2024-06-15 17:16:27,114][1653645] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-06-15 17:16:27,429][1653645] Updated weights for policy 0, policy_version 446112 (0.0013) [2024-06-15 17:16:30,975][1648982] Fps is (10 sec: 52340.0, 60 sec: 47500.1, 300 sec: 44428.6). Total num frames: 913702912. Throughput: 0: 12055.9. Samples: 228511232. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:30,976][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:16:31,232][1653645] Updated weights for policy 0, policy_version 446160 (0.0014) [2024-06-15 17:16:32,618][1653645] Updated weights for policy 0, policy_version 446208 (0.0013) [2024-06-15 17:16:35,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 47513.6, 300 sec: 44764.5). Total num frames: 913932288. Throughput: 0: 11901.2. Samples: 228543488. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:16:36,312][1653645] Updated weights for policy 0, policy_version 446288 (0.0017) [2024-06-15 17:16:38,604][1653645] Updated weights for policy 0, policy_version 446352 (0.0013) [2024-06-15 17:16:40,958][1648982] Fps is (10 sec: 52520.0, 60 sec: 48606.0, 300 sec: 44986.6). Total num frames: 914227200. Throughput: 0: 11741.9. Samples: 228600832. 
Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:16:42,688][1653645] Updated weights for policy 0, policy_version 446416 (0.0088) [2024-06-15 17:16:43,747][1653645] Updated weights for policy 0, policy_version 446459 (0.0013) [2024-06-15 17:16:45,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 914358272. Throughput: 0: 11776.0. Samples: 228681216. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:16:46,992][1653645] Updated weights for policy 0, policy_version 446512 (0.0013) [2024-06-15 17:16:48,536][1653645] Updated weights for policy 0, policy_version 446592 (0.0014) [2024-06-15 17:16:50,612][1653645] Updated weights for policy 0, policy_version 446653 (0.0012) [2024-06-15 17:16:50,958][1648982] Fps is (10 sec: 52426.7, 60 sec: 50244.1, 300 sec: 45653.0). Total num frames: 914751488. Throughput: 0: 11696.3. Samples: 228708864. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:50,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:16:55,371][1653645] Updated weights for policy 0, policy_version 446720 (0.0015) [2024-06-15 17:16:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 914882560. Throughput: 0: 11583.8. Samples: 228784128. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:16:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:16:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000446720_914882560.pth... 
[2024-06-15 17:16:56,062][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000441344_903872512.pth [2024-06-15 17:16:59,454][1653645] Updated weights for policy 0, policy_version 446818 (0.0098) [2024-06-15 17:17:00,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 48605.8, 300 sec: 45430.9). Total num frames: 915177472. Throughput: 0: 11650.8. Samples: 228847104. Policy #0 lag: (min: 31.0, avg: 117.1, max: 287.0) [2024-06-15 17:17:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:17:01,305][1653645] Updated weights for policy 0, policy_version 446880 (0.0012) [2024-06-15 17:17:05,932][1653645] Updated weights for policy 0, policy_version 446933 (0.0040) [2024-06-15 17:17:05,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 45097.7). Total num frames: 915308544. Throughput: 0: 11491.6. Samples: 228882432. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:17:09,333][1653645] Updated weights for policy 0, policy_version 446980 (0.0012) [2024-06-15 17:17:10,527][1653645] Updated weights for policy 0, policy_version 447030 (0.0013) [2024-06-15 17:17:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 46425.0, 300 sec: 45430.9). Total num frames: 915570688. Throughput: 0: 11537.1. Samples: 228954624. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:10,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:17:11,065][1651596] Signal inference workers to stop experience collection... (23200 times) [2024-06-15 17:17:11,138][1653645] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-06-15 17:17:11,303][1651596] Signal inference workers to resume experience collection... 
(23200 times) [2024-06-15 17:17:11,304][1653645] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-06-15 17:17:11,683][1653645] Updated weights for policy 0, policy_version 447088 (0.0192) [2024-06-15 17:17:13,350][1653645] Updated weights for policy 0, policy_version 447159 (0.0014) [2024-06-15 17:17:15,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45875.3, 300 sec: 45542.0). Total num frames: 915800064. Throughput: 0: 11370.8. Samples: 229022720. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:17:18,325][1653645] Updated weights for policy 0, policy_version 447231 (0.0089) [2024-06-15 17:17:20,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 46421.6, 300 sec: 44988.1). Total num frames: 915963904. Throughput: 0: 11377.8. Samples: 229055488. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:17:22,403][1653645] Updated weights for policy 0, policy_version 447328 (0.0105) [2024-06-15 17:17:24,614][1653645] Updated weights for policy 0, policy_version 447382 (0.0014) [2024-06-15 17:17:25,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 916324352. Throughput: 0: 11480.1. Samples: 229117440. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:17:29,490][1653645] Updated weights for policy 0, policy_version 447456 (0.0053) [2024-06-15 17:17:30,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45888.4, 300 sec: 44986.6). Total num frames: 916455424. Throughput: 0: 11218.5. Samples: 229186048. 
Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:17:33,533][1653645] Updated weights for policy 0, policy_version 447520 (0.0015) [2024-06-15 17:17:35,364][1653645] Updated weights for policy 0, policy_version 447587 (0.0013) [2024-06-15 17:17:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 46421.3, 300 sec: 45653.1). Total num frames: 916717568. Throughput: 0: 11366.5. Samples: 229220352. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:17:37,655][1653645] Updated weights for policy 0, policy_version 447664 (0.0032) [2024-06-15 17:17:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.7, 300 sec: 45430.9). Total num frames: 916881408. Throughput: 0: 11093.4. Samples: 229283328. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:17:41,144][1653645] Updated weights for policy 0, policy_version 447699 (0.0012) [2024-06-15 17:17:41,831][1653645] Updated weights for policy 0, policy_version 447737 (0.0095) [2024-06-15 17:17:45,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 917078016. Throughput: 0: 11229.9. Samples: 229352448. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:45,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:17:46,585][1653645] Updated weights for policy 0, policy_version 447824 (0.0013) [2024-06-15 17:17:47,697][1653645] Updated weights for policy 0, policy_version 447870 (0.0037) [2024-06-15 17:17:49,953][1653645] Updated weights for policy 0, policy_version 447930 (0.0011) [2024-06-15 17:17:50,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43690.8, 300 sec: 45765.2). Total num frames: 917372928. Throughput: 0: 11036.4. Samples: 229379072. 
Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:17:53,245][1653645] Updated weights for policy 0, policy_version 447991 (0.0012) [2024-06-15 17:17:55,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 917504000. Throughput: 0: 10968.2. Samples: 229448192. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:17:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:17:58,221][1653645] Updated weights for policy 0, policy_version 448048 (0.0014) [2024-06-15 17:17:58,402][1651596] Signal inference workers to stop experience collection... (23250 times) [2024-06-15 17:17:58,448][1653645] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-06-15 17:17:58,596][1651596] Signal inference workers to resume experience collection... (23250 times) [2024-06-15 17:17:58,597][1653645] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-06-15 17:17:59,804][1653645] Updated weights for policy 0, policy_version 448102 (0.0013) [2024-06-15 17:18:00,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43144.4, 300 sec: 45541.9). Total num frames: 917766144. Throughput: 0: 10831.6. Samples: 229510144. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:18:00,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:18:01,456][1653645] Updated weights for policy 0, policy_version 448146 (0.0012) [2024-06-15 17:18:02,390][1653645] Updated weights for policy 0, policy_version 448189 (0.0037) [2024-06-15 17:18:05,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 918028288. Throughput: 0: 10911.3. Samples: 229546496. 
Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:18:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:18:09,045][1653645] Updated weights for policy 0, policy_version 448272 (0.0128) [2024-06-15 17:18:10,668][1653645] Updated weights for policy 0, policy_version 448336 (0.0017) [2024-06-15 17:18:10,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 918192128. Throughput: 0: 11116.1. Samples: 229617664. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:18:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:18:12,504][1653645] Updated weights for policy 0, policy_version 448386 (0.0023) [2024-06-15 17:18:15,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44236.5, 300 sec: 45541.9). Total num frames: 918454272. Throughput: 0: 10968.1. Samples: 229679616. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:18:15,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:18:15,995][1653645] Updated weights for policy 0, policy_version 448471 (0.0013) [2024-06-15 17:18:20,143][1653645] Updated weights for policy 0, policy_version 448515 (0.0011) [2024-06-15 17:18:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 45097.7). Total num frames: 918618112. Throughput: 0: 11013.7. Samples: 229715968. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:18:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:18:21,656][1653645] Updated weights for policy 0, policy_version 448576 (0.0012) [2024-06-15 17:18:23,067][1653645] Updated weights for policy 0, policy_version 448638 (0.0013) [2024-06-15 17:18:25,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.5, 300 sec: 45764.1). Total num frames: 918945792. Throughput: 0: 11116.0. Samples: 229783552. 
Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:18:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:18:27,736][1653645] Updated weights for policy 0, policy_version 448713 (0.0014) [2024-06-15 17:18:28,577][1653645] Updated weights for policy 0, policy_version 448757 (0.0012) [2024-06-15 17:18:30,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 919076864. Throughput: 0: 11116.1. Samples: 229852672. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 17:18:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:18:32,577][1653645] Updated weights for policy 0, policy_version 448823 (0.0016) [2024-06-15 17:18:34,413][1653645] Updated weights for policy 0, policy_version 448888 (0.0014) [2024-06-15 17:18:35,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 919339008. Throughput: 0: 11229.9. Samples: 229884416. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:18:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:18:36,767][1653645] Updated weights for policy 0, policy_version 448914 (0.0011) [2024-06-15 17:18:39,573][1653645] Updated weights for policy 0, policy_version 448982 (0.0014) [2024-06-15 17:18:40,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 919601152. Throughput: 0: 11173.0. Samples: 229950976. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:18:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:18:42,927][1653645] Updated weights for policy 0, policy_version 449043 (0.0014) [2024-06-15 17:18:43,334][1651596] Signal inference workers to stop experience collection... (23300 times) [2024-06-15 17:18:43,369][1653645] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-06-15 17:18:43,515][1651596] Signal inference workers to resume experience collection... 
(23300 times) [2024-06-15 17:18:43,516][1653645] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-06-15 17:18:43,809][1653645] Updated weights for policy 0, policy_version 449085 (0.0011) [2024-06-15 17:18:45,910][1653645] Updated weights for policy 0, policy_version 449146 (0.0012) [2024-06-15 17:18:45,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 919830528. Throughput: 0: 11286.8. Samples: 230018048. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:18:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:18:49,993][1653645] Updated weights for policy 0, policy_version 449213 (0.0013) [2024-06-15 17:18:50,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 44236.8, 300 sec: 45653.1). Total num frames: 920027136. Throughput: 0: 11275.4. Samples: 230053888. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:18:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:18:51,873][1653645] Updated weights for policy 0, policy_version 449272 (0.0011) [2024-06-15 17:18:54,636][1653645] Updated weights for policy 0, policy_version 449316 (0.0012) [2024-06-15 17:18:55,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 920256512. Throughput: 0: 11138.8. Samples: 230118912. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:18:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:18:55,974][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000449344_920256512.pth... [2024-06-15 17:18:56,046][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000443952_909213696.pth [2024-06-15 17:18:57,805][1653645] Updated weights for policy 0, policy_version 449382 (0.0014) [2024-06-15 17:19:00,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44237.0, 300 sec: 45430.9). Total num frames: 920420352. Throughput: 0: 11241.3. Samples: 230185472. 
Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:19:01,081][1653645] Updated weights for policy 0, policy_version 449426 (0.0014) [2024-06-15 17:19:02,848][1653645] Updated weights for policy 0, policy_version 449489 (0.0036) [2024-06-15 17:19:05,476][1653645] Updated weights for policy 0, policy_version 449539 (0.0014) [2024-06-15 17:19:05,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 920682496. Throughput: 0: 11127.5. Samples: 230216704. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:19:06,828][1653645] Updated weights for policy 0, policy_version 449600 (0.0012) [2024-06-15 17:19:10,018][1653645] Updated weights for policy 0, policy_version 449663 (0.0014) [2024-06-15 17:19:10,958][1648982] Fps is (10 sec: 49150.1, 60 sec: 45328.8, 300 sec: 45764.1). Total num frames: 920911872. Throughput: 0: 11195.7. Samples: 230287360. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:19:13,123][1653645] Updated weights for policy 0, policy_version 449720 (0.0013) [2024-06-15 17:19:15,245][1653645] Updated weights for policy 0, policy_version 449776 (0.0012) [2024-06-15 17:19:15,958][1648982] Fps is (10 sec: 49147.7, 60 sec: 45328.6, 300 sec: 45875.1). Total num frames: 921174016. Throughput: 0: 11138.6. Samples: 230353920. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:15,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:19:17,904][1653645] Updated weights for policy 0, policy_version 449824 (0.0014) [2024-06-15 17:19:19,943][1653645] Updated weights for policy 0, policy_version 449872 (0.0017) [2024-06-15 17:19:20,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 46421.2, 300 sec: 45653.0). Total num frames: 921403392. 
Throughput: 0: 11184.3. Samples: 230387712. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:19:24,175][1653645] Updated weights for policy 0, policy_version 449936 (0.0013) [2024-06-15 17:19:25,275][1653645] Updated weights for policy 0, policy_version 449981 (0.0011) [2024-06-15 17:19:25,958][1648982] Fps is (10 sec: 39325.0, 60 sec: 43690.8, 300 sec: 45764.2). Total num frames: 921567232. Throughput: 0: 11172.9. Samples: 230453760. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:19:27,139][1653645] Updated weights for policy 0, policy_version 450042 (0.0012) [2024-06-15 17:19:30,029][1653645] Updated weights for policy 0, policy_version 450101 (0.0138) [2024-06-15 17:19:30,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45874.9, 300 sec: 45541.9). Total num frames: 921829376. Throughput: 0: 11184.3. Samples: 230521344. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:30,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:19:32,358][1651596] Signal inference workers to stop experience collection... (23350 times) [2024-06-15 17:19:32,449][1653645] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-06-15 17:19:32,451][1653645] Updated weights for policy 0, policy_version 450133 (0.0036) [2024-06-15 17:19:32,574][1651596] Signal inference workers to resume experience collection... (23350 times) [2024-06-15 17:19:32,575][1653645] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-06-15 17:19:35,583][1653645] Updated weights for policy 0, policy_version 450178 (0.0015) [2024-06-15 17:19:35,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 45764.3). Total num frames: 921993216. Throughput: 0: 11093.4. Samples: 230553088. 
Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:19:36,942][1653645] Updated weights for policy 0, policy_version 450237 (0.0013) [2024-06-15 17:19:38,920][1653645] Updated weights for policy 0, policy_version 450300 (0.0014) [2024-06-15 17:19:40,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 44236.6, 300 sec: 45764.1). Total num frames: 922255360. Throughput: 0: 11229.9. Samples: 230624256. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:19:41,739][1653645] Updated weights for policy 0, policy_version 450362 (0.0012) [2024-06-15 17:19:44,987][1653645] Updated weights for policy 0, policy_version 450431 (0.0015) [2024-06-15 17:19:45,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44236.8, 300 sec: 45764.1). Total num frames: 922484736. Throughput: 0: 11150.2. Samples: 230687232. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:19:48,939][1653645] Updated weights for policy 0, policy_version 450496 (0.0012) [2024-06-15 17:19:50,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 46097.3). Total num frames: 922746880. Throughput: 0: 11320.8. Samples: 230726144. Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:19:51,980][1653645] Updated weights for policy 0, policy_version 450577 (0.0022) [2024-06-15 17:19:55,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 922877952. Throughput: 0: 11218.6. Samples: 230792192. 
Policy #0 lag: (min: 15.0, avg: 103.9, max: 271.0) [2024-06-15 17:19:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:19:56,005][1653645] Updated weights for policy 0, policy_version 450640 (0.0018) [2024-06-15 17:20:00,173][1653645] Updated weights for policy 0, policy_version 450721 (0.0119) [2024-06-15 17:20:00,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 923140096. Throughput: 0: 11150.5. Samples: 230855680. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:20:01,656][1653645] Updated weights for policy 0, policy_version 450784 (0.0050) [2024-06-15 17:20:04,385][1653645] Updated weights for policy 0, policy_version 450832 (0.0013) [2024-06-15 17:20:05,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45329.1, 300 sec: 45542.0). Total num frames: 923402240. Throughput: 0: 11025.1. Samples: 230883840. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:05,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:20:08,730][1653645] Updated weights for policy 0, policy_version 450912 (0.0013) [2024-06-15 17:20:10,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.8, 300 sec: 45542.0). Total num frames: 923533312. Throughput: 0: 10990.9. Samples: 230948352. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:20:12,555][1653645] Updated weights for policy 0, policy_version 450977 (0.0014) [2024-06-15 17:20:13,579][1653645] Updated weights for policy 0, policy_version 451013 (0.0010) [2024-06-15 17:20:14,857][1653645] Updated weights for policy 0, policy_version 451064 (0.0013) [2024-06-15 17:20:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43691.3, 300 sec: 45319.8). Total num frames: 923795456. Throughput: 0: 10956.9. Samples: 231014400. 
Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:20:17,699][1653645] Updated weights for policy 0, policy_version 451132 (0.0013) [2024-06-15 17:20:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 45430.9). Total num frames: 923959296. Throughput: 0: 10945.4. Samples: 231045632. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:20:21,376][1651596] Signal inference workers to stop experience collection... (23400 times) [2024-06-15 17:20:21,409][1653645] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-06-15 17:20:21,612][1651596] Signal inference workers to resume experience collection... (23400 times) [2024-06-15 17:20:21,613][1653645] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-06-15 17:20:21,801][1653645] Updated weights for policy 0, policy_version 451194 (0.0023) [2024-06-15 17:20:24,805][1653645] Updated weights for policy 0, policy_version 451252 (0.0018) [2024-06-15 17:20:25,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 44236.6, 300 sec: 45319.8). Total num frames: 924221440. Throughput: 0: 10808.8. Samples: 231110656. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:20:26,088][1653645] Updated weights for policy 0, policy_version 451285 (0.0012) [2024-06-15 17:20:27,127][1653645] Updated weights for policy 0, policy_version 451328 (0.0095) [2024-06-15 17:20:29,607][1653645] Updated weights for policy 0, policy_version 451392 (0.0013) [2024-06-15 17:20:30,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.8, 300 sec: 45319.8). Total num frames: 924450816. Throughput: 0: 10877.1. Samples: 231176704. 
Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:20:34,498][1653645] Updated weights for policy 0, policy_version 451456 (0.0017) [2024-06-15 17:20:35,960][1648982] Fps is (10 sec: 36045.6, 60 sec: 43144.5, 300 sec: 44986.6). Total num frames: 924581888. Throughput: 0: 10808.9. Samples: 231212544. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:35,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:20:37,087][1653645] Updated weights for policy 0, policy_version 451513 (0.0012) [2024-06-15 17:20:39,344][1653645] Updated weights for policy 0, policy_version 451580 (0.0012) [2024-06-15 17:20:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 924876800. Throughput: 0: 10558.6. Samples: 231267328. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:20:41,206][1653645] Updated weights for policy 0, policy_version 451618 (0.0012) [2024-06-15 17:20:45,700][1653645] Updated weights for policy 0, policy_version 451680 (0.0013) [2024-06-15 17:20:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 45097.7). Total num frames: 925040640. Throughput: 0: 10797.5. Samples: 231341568. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:20:49,079][1653645] Updated weights for policy 0, policy_version 451760 (0.0014) [2024-06-15 17:20:50,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 44653.3). Total num frames: 925302784. Throughput: 0: 10831.6. Samples: 231371264. 
Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:50,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:20:51,384][1653645] Updated weights for policy 0, policy_version 451835 (0.0119) [2024-06-15 17:20:53,936][1653645] Updated weights for policy 0, policy_version 451881 (0.0017) [2024-06-15 17:20:55,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 925499392. Throughput: 0: 10717.9. Samples: 231430656. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:20:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:20:56,003][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000451904_925499392.pth... [2024-06-15 17:20:56,052][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000446720_914882560.pth [2024-06-15 17:20:57,570][1653645] Updated weights for policy 0, policy_version 451920 (0.0013) [2024-06-15 17:20:58,643][1653645] Updated weights for policy 0, policy_version 451963 (0.0012) [2024-06-15 17:21:00,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 925761536. Throughput: 0: 10808.9. Samples: 231500800. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:21:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:21:02,250][1653645] Updated weights for policy 0, policy_version 452050 (0.0018) [2024-06-15 17:21:03,004][1653645] Updated weights for policy 0, policy_version 452096 (0.0012) [2024-06-15 17:21:05,658][1653645] Updated weights for policy 0, policy_version 452160 (0.0013) [2024-06-15 17:21:05,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.5, 300 sec: 44876.2). Total num frames: 926023680. Throughput: 0: 10888.5. Samples: 231535616. 
Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:21:05,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:21:10,627][1653645] Updated weights for policy 0, policy_version 452216 (0.0012) [2024-06-15 17:21:10,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 926154752. Throughput: 0: 11025.1. Samples: 231606784. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:21:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:21:12,040][1651596] Signal inference workers to stop experience collection... (23450 times) [2024-06-15 17:21:12,082][1653645] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-06-15 17:21:12,199][1651596] Signal inference workers to resume experience collection... (23450 times) [2024-06-15 17:21:12,200][1653645] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-06-15 17:21:12,558][1653645] Updated weights for policy 0, policy_version 452272 (0.0012) [2024-06-15 17:21:13,961][1653645] Updated weights for policy 0, policy_version 452308 (0.0012) [2024-06-15 17:21:15,649][1653645] Updated weights for policy 0, policy_version 452356 (0.0022) [2024-06-15 17:21:15,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 926449664. Throughput: 0: 10990.9. Samples: 231671296. Policy #0 lag: (min: 6.0, avg: 110.7, max: 262.0) [2024-06-15 17:21:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:21:16,696][1653645] Updated weights for policy 0, policy_version 452409 (0.0086) [2024-06-15 17:21:20,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 926613504. Throughput: 0: 11047.8. Samples: 231709696. 
Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:21:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:21:21,500][1653645] Updated weights for policy 0, policy_version 452468 (0.0011) [2024-06-15 17:21:23,859][1653645] Updated weights for policy 0, policy_version 452544 (0.0014) [2024-06-15 17:21:25,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 44236.7, 300 sec: 44655.9). Total num frames: 926875648. Throughput: 0: 11320.8. Samples: 231776768. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:21:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:21:26,264][1653645] Updated weights for policy 0, policy_version 452602 (0.0013) [2024-06-15 17:21:28,078][1653645] Updated weights for policy 0, policy_version 452665 (0.0012) [2024-06-15 17:21:30,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 44542.2). Total num frames: 927072256. Throughput: 0: 11081.9. Samples: 231840256. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:21:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:21:33,441][1653645] Updated weights for policy 0, policy_version 452720 (0.0013) [2024-06-15 17:21:34,844][1653645] Updated weights for policy 0, policy_version 452741 (0.0012) [2024-06-15 17:21:35,957][1653645] Updated weights for policy 0, policy_version 452798 (0.0012) [2024-06-15 17:21:35,958][1648982] Fps is (10 sec: 42600.2, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 927301632. Throughput: 0: 11241.3. Samples: 231877120. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:21:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:21:37,824][1653645] Updated weights for policy 0, policy_version 452848 (0.0012) [2024-06-15 17:21:39,891][1653645] Updated weights for policy 0, policy_version 452928 (0.0110) [2024-06-15 17:21:40,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 927596544. 
Throughput: 0: 11275.4. Samples: 231938048. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:21:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:21:45,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 927727616. Throughput: 0: 11275.4. Samples: 232008192. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:21:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:21:47,282][1653645] Updated weights for policy 0, policy_version 453011 (0.0145) [2024-06-15 17:21:48,284][1653645] Updated weights for policy 0, policy_version 453055 (0.0011) [2024-06-15 17:21:50,958][1648982] Fps is (10 sec: 36043.2, 60 sec: 44236.6, 300 sec: 44320.1). Total num frames: 927956992. Throughput: 0: 11172.9. Samples: 232038400. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:21:50,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:21:51,325][1653645] Updated weights for policy 0, policy_version 453122 (0.0013) [2024-06-15 17:21:52,702][1653645] Updated weights for policy 0, policy_version 453184 (0.0014) [2024-06-15 17:21:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 928120832. Throughput: 0: 10911.3. Samples: 232097792. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:21:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:21:58,124][1653645] Updated weights for policy 0, policy_version 453240 (0.0015) [2024-06-15 17:21:59,574][1651596] Signal inference workers to stop experience collection... (23500 times) [2024-06-15 17:21:59,609][1653645] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-06-15 17:21:59,814][1651596] Signal inference workers to resume experience collection... 
(23500 times) [2024-06-15 17:21:59,815][1653645] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-06-15 17:22:00,118][1653645] Updated weights for policy 0, policy_version 453296 (0.0014) [2024-06-15 17:22:00,958][1648982] Fps is (10 sec: 42599.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 928382976. Throughput: 0: 10956.8. Samples: 232164352. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:22:02,594][1653645] Updated weights for policy 0, policy_version 453369 (0.0013) [2024-06-15 17:22:03,906][1653645] Updated weights for policy 0, policy_version 453408 (0.0011) [2024-06-15 17:22:05,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 928645120. Throughput: 0: 10865.8. Samples: 232198656. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:22:09,756][1653645] Updated weights for policy 0, policy_version 453495 (0.0012) [2024-06-15 17:22:10,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 928776192. Throughput: 0: 10774.8. Samples: 232261632. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:22:12,471][1653645] Updated weights for policy 0, policy_version 453558 (0.0124) [2024-06-15 17:22:15,730][1653645] Updated weights for policy 0, policy_version 453603 (0.0012) [2024-06-15 17:22:15,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 929005568. Throughput: 0: 10729.2. Samples: 232323072. 
Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:22:17,765][1653645] Updated weights for policy 0, policy_version 453683 (0.0120) [2024-06-15 17:22:20,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 43542.5). Total num frames: 929169408. Throughput: 0: 10467.5. Samples: 232348160. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:20,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:22:22,192][1653645] Updated weights for policy 0, policy_version 453752 (0.0013) [2024-06-15 17:22:25,274][1653645] Updated weights for policy 0, policy_version 453817 (0.0016) [2024-06-15 17:22:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 42598.6, 300 sec: 43986.9). Total num frames: 929431552. Throughput: 0: 10740.6. Samples: 232421376. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:22:28,292][1653645] Updated weights for policy 0, policy_version 453888 (0.0103) [2024-06-15 17:22:29,435][1653645] Updated weights for policy 0, policy_version 453946 (0.0012) [2024-06-15 17:22:30,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 929693696. Throughput: 0: 10547.2. Samples: 232482816. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:22:34,316][1653645] Updated weights for policy 0, policy_version 454016 (0.0051) [2024-06-15 17:22:35,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 43875.8). Total num frames: 929824768. Throughput: 0: 10752.1. Samples: 232522240. 
Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:22:37,267][1653645] Updated weights for policy 0, policy_version 454080 (0.0013) [2024-06-15 17:22:40,003][1653645] Updated weights for policy 0, policy_version 454132 (0.0131) [2024-06-15 17:22:40,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 44320.1). Total num frames: 930152448. Throughput: 0: 10729.3. Samples: 232580608. Policy #0 lag: (min: 15.0, avg: 108.6, max: 271.0) [2024-06-15 17:22:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:22:41,399][1653645] Updated weights for policy 0, policy_version 454207 (0.0011) [2024-06-15 17:22:45,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 42598.2, 300 sec: 43764.7). Total num frames: 930283520. Throughput: 0: 10899.8. Samples: 232654848. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:22:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:22:46,170][1653645] Updated weights for policy 0, policy_version 454265 (0.0081) [2024-06-15 17:22:47,207][1651596] Signal inference workers to stop experience collection... (23550 times) [2024-06-15 17:22:47,257][1653645] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-06-15 17:22:47,436][1651596] Signal inference workers to resume experience collection... (23550 times) [2024-06-15 17:22:47,436][1653645] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-06-15 17:22:48,334][1653645] Updated weights for policy 0, policy_version 454331 (0.0013) [2024-06-15 17:22:50,484][1653645] Updated weights for policy 0, policy_version 454375 (0.0012) [2024-06-15 17:22:50,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 44237.0, 300 sec: 44431.2). Total num frames: 930611200. Throughput: 0: 10888.5. Samples: 232688640. 
Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:22:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:22:52,304][1653645] Updated weights for policy 0, policy_version 454453 (0.0012) [2024-06-15 17:22:55,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 43690.2, 300 sec: 43986.8). Total num frames: 930742272. Throughput: 0: 10979.4. Samples: 232755712. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:22:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:22:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000454464_930742272.pth... [2024-06-15 17:22:56,011][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000449344_920256512.pth [2024-06-15 17:22:57,346][1653645] Updated weights for policy 0, policy_version 454496 (0.0014) [2024-06-15 17:22:59,230][1653645] Updated weights for policy 0, policy_version 454544 (0.0027) [2024-06-15 17:23:00,957][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 931004416. Throughput: 0: 11127.5. Samples: 232823808. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:23:01,801][1653645] Updated weights for policy 0, policy_version 454624 (0.0017) [2024-06-15 17:23:03,763][1653645] Updated weights for policy 0, policy_version 454704 (0.0082) [2024-06-15 17:23:05,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 931266560. Throughput: 0: 11184.3. Samples: 232851456. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:23:08,960][1653645] Updated weights for policy 0, policy_version 454752 (0.0023) [2024-06-15 17:23:10,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 931397632. Throughput: 0: 11298.2. Samples: 232929792. 
Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:10,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:23:11,142][1653645] Updated weights for policy 0, policy_version 454803 (0.0016) [2024-06-15 17:23:12,162][1653645] Updated weights for policy 0, policy_version 454848 (0.0035) [2024-06-15 17:23:14,485][1653645] Updated weights for policy 0, policy_version 454918 (0.0024) [2024-06-15 17:23:15,958][1653645] Updated weights for policy 0, policy_version 454976 (0.0023) [2024-06-15 17:23:15,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 46421.3, 300 sec: 44653.3). Total num frames: 931790848. Throughput: 0: 11264.0. Samples: 232989696. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:15,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:23:20,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 931856384. Throughput: 0: 11229.8. Samples: 233027584. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:23:21,564][1653645] Updated weights for policy 0, policy_version 455039 (0.0100) [2024-06-15 17:23:23,645][1653645] Updated weights for policy 0, policy_version 455104 (0.0013) [2024-06-15 17:23:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 932151296. Throughput: 0: 11446.0. Samples: 233095680. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:23:26,270][1653645] Updated weights for policy 0, policy_version 455170 (0.0010) [2024-06-15 17:23:27,629][1653645] Updated weights for policy 0, policy_version 455228 (0.0101) [2024-06-15 17:23:30,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 932315136. Throughput: 0: 11184.4. Samples: 233158144. 
Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:23:33,003][1651596] Signal inference workers to stop experience collection... (23600 times) [2024-06-15 17:23:33,057][1653645] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-06-15 17:23:33,265][1651596] Signal inference workers to resume experience collection... (23600 times) [2024-06-15 17:23:33,266][1653645] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-06-15 17:23:33,930][1653645] Updated weights for policy 0, policy_version 455291 (0.0058) [2024-06-15 17:23:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.0, 300 sec: 43875.8). Total num frames: 932544512. Throughput: 0: 11241.2. Samples: 233194496. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:35,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:23:36,099][1653645] Updated weights for policy 0, policy_version 455351 (0.0014) [2024-06-15 17:23:37,825][1653645] Updated weights for policy 0, policy_version 455408 (0.0126) [2024-06-15 17:23:39,301][1653645] Updated weights for policy 0, policy_version 455472 (0.0016) [2024-06-15 17:23:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44782.9, 300 sec: 44098.0). Total num frames: 932839424. Throughput: 0: 10911.5. Samples: 233246720. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:23:45,157][1653645] Updated weights for policy 0, policy_version 455506 (0.0013) [2024-06-15 17:23:45,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44237.0, 300 sec: 43764.7). Total num frames: 932937728. Throughput: 0: 11059.2. Samples: 233321472. 
Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:45,960][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:23:47,377][1653645] Updated weights for policy 0, policy_version 455570 (0.0100) [2024-06-15 17:23:48,655][1653645] Updated weights for policy 0, policy_version 455617 (0.0021) [2024-06-15 17:23:50,673][1653645] Updated weights for policy 0, policy_version 455715 (0.0013) [2024-06-15 17:23:50,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 933330944. Throughput: 0: 11150.3. Samples: 233353216. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 17:23:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43691.1, 300 sec: 43875.8). Total num frames: 933363712. Throughput: 0: 10968.2. Samples: 233423360. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:23:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:23:56,526][1653645] Updated weights for policy 0, policy_version 455776 (0.0016) [2024-06-15 17:23:59,897][1653645] Updated weights for policy 0, policy_version 455842 (0.0013) [2024-06-15 17:24:00,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 933658624. Throughput: 0: 11070.6. Samples: 233487872. Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:24:00,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 17:24:01,395][1653645] Updated weights for policy 0, policy_version 455906 (0.0014) [2024-06-15 17:24:03,553][1653645] Updated weights for policy 0, policy_version 455991 (0.0031) [2024-06-15 17:24:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 933888000. Throughput: 0: 10672.4. Samples: 233507840. 
Policy #0 lag: (min: 3.0, avg: 90.2, max: 259.0) [2024-06-15 17:24:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:24:10,216][1653645] Updated weights for policy 0, policy_version 456048 (0.0134) [2024-06-15 17:24:10,960][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.6, 300 sec: 43542.7). Total num frames: 934019072. Throughput: 0: 10843.0. Samples: 233583616. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:10,961][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:24:12,019][1653645] Updated weights for policy 0, policy_version 456099 (0.0012) [2024-06-15 17:24:13,159][1653645] Updated weights for policy 0, policy_version 456151 (0.0011) [2024-06-15 17:24:13,441][1651596] Signal inference workers to stop experience collection... (23650 times) [2024-06-15 17:24:13,494][1653645] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-06-15 17:24:13,667][1651596] Signal inference workers to resume experience collection... (23650 times) [2024-06-15 17:24:13,668][1653645] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-06-15 17:24:14,661][1653645] Updated weights for policy 0, policy_version 456211 (0.0012) [2024-06-15 17:24:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 934412288. Throughput: 0: 10717.9. Samples: 233640448. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:24:20,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 43542.6). Total num frames: 934412288. Throughput: 0: 10763.4. Samples: 233678848. 
Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:24:21,912][1653645] Updated weights for policy 0, policy_version 456304 (0.0016) [2024-06-15 17:24:24,592][1653645] Updated weights for policy 0, policy_version 456370 (0.0013) [2024-06-15 17:24:25,665][1653645] Updated weights for policy 0, policy_version 456432 (0.0025) [2024-06-15 17:24:25,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 934772736. Throughput: 0: 11070.5. Samples: 233744896. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:24:27,278][1653645] Updated weights for policy 0, policy_version 456496 (0.0012) [2024-06-15 17:24:30,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 934936576. Throughput: 0: 10899.9. Samples: 233811968. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:24:33,368][1653645] Updated weights for policy 0, policy_version 456547 (0.0012) [2024-06-15 17:24:35,748][1653645] Updated weights for policy 0, policy_version 456608 (0.0013) [2024-06-15 17:24:35,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 43144.6, 300 sec: 43653.7). Total num frames: 935133184. Throughput: 0: 11047.8. Samples: 233850368. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:24:36,921][1653645] Updated weights for policy 0, policy_version 456657 (0.0060) [2024-06-15 17:24:38,694][1653645] Updated weights for policy 0, policy_version 456725 (0.0080) [2024-06-15 17:24:39,620][1653645] Updated weights for policy 0, policy_version 456767 (0.0011) [2024-06-15 17:24:40,959][1648982] Fps is (10 sec: 52424.0, 60 sec: 43689.9, 300 sec: 43986.7). Total num frames: 935460864. 
Throughput: 0: 10774.5. Samples: 233908224. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:40,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:24:45,536][1653645] Updated weights for policy 0, policy_version 456827 (0.0012) [2024-06-15 17:24:45,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 935591936. Throughput: 0: 10934.1. Samples: 233979904. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:24:47,908][1653645] Updated weights for policy 0, policy_version 456889 (0.0142) [2024-06-15 17:24:49,507][1653645] Updated weights for policy 0, policy_version 456944 (0.0020) [2024-06-15 17:24:50,958][1648982] Fps is (10 sec: 45879.6, 60 sec: 43144.6, 300 sec: 44209.0). Total num frames: 935919616. Throughput: 0: 11184.3. Samples: 234011136. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:50,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:24:51,403][1653645] Updated weights for policy 0, policy_version 457018 (0.0105) [2024-06-15 17:24:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 935985152. Throughput: 0: 10956.8. Samples: 234076672. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:24:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:24:56,310][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000457040_936017920.pth... [2024-06-15 17:24:56,444][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000451904_925499392.pth [2024-06-15 17:24:57,047][1653645] Updated weights for policy 0, policy_version 457072 (0.0014) [2024-06-15 17:24:58,738][1651596] Signal inference workers to stop experience collection... 
(23700 times) [2024-06-15 17:24:58,794][1653645] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-06-15 17:24:59,035][1651596] Signal inference workers to resume experience collection... (23700 times) [2024-06-15 17:24:59,036][1653645] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-06-15 17:24:59,265][1653645] Updated weights for policy 0, policy_version 457125 (0.0013) [2024-06-15 17:25:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.9, 300 sec: 43764.7). Total num frames: 936312832. Throughput: 0: 11173.0. Samples: 234143232. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:25:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:25:01,226][1653645] Updated weights for policy 0, policy_version 457206 (0.0015) [2024-06-15 17:25:02,974][1653645] Updated weights for policy 0, policy_version 457280 (0.0154) [2024-06-15 17:25:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 936509440. Throughput: 0: 10820.3. Samples: 234165760. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:25:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:25:09,079][1653645] Updated weights for policy 0, policy_version 457344 (0.0014) [2024-06-15 17:25:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 936706048. Throughput: 0: 11173.0. Samples: 234247680. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:25:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:25:11,262][1653645] Updated weights for policy 0, policy_version 457399 (0.0013) [2024-06-15 17:25:13,265][1653645] Updated weights for policy 0, policy_version 457456 (0.0012) [2024-06-15 17:25:15,960][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 937033728. Throughput: 0: 10831.6. Samples: 234299392. 
Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:25:15,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:25:20,251][1653645] Updated weights for policy 0, policy_version 457541 (0.0013) [2024-06-15 17:25:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44782.9, 300 sec: 43653.7). Total num frames: 937099264. Throughput: 0: 10888.5. Samples: 234340352. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:25:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:25:21,528][1653645] Updated weights for policy 0, policy_version 457599 (0.0013) [2024-06-15 17:25:23,854][1653645] Updated weights for policy 0, policy_version 457661 (0.0014) [2024-06-15 17:25:25,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44237.0, 300 sec: 43986.9). Total num frames: 937426944. Throughput: 0: 11036.7. Samples: 234404864. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:25:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:25:26,338][1653645] Updated weights for policy 0, policy_version 457746 (0.0021) [2024-06-15 17:25:30,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 937558016. Throughput: 0: 10979.5. Samples: 234473984. Policy #0 lag: (min: 15.0, avg: 93.2, max: 271.0) [2024-06-15 17:25:30,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:25:32,072][1653645] Updated weights for policy 0, policy_version 457809 (0.0025) [2024-06-15 17:25:33,131][1653645] Updated weights for policy 0, policy_version 457852 (0.0038) [2024-06-15 17:25:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 937820160. Throughput: 0: 11093.3. Samples: 234510336. 
Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:25:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:25:36,490][1653645] Updated weights for policy 0, policy_version 457924 (0.0108) [2024-06-15 17:25:37,803][1653645] Updated weights for policy 0, policy_version 457984 (0.0012) [2024-06-15 17:25:39,096][1653645] Updated weights for policy 0, policy_version 458046 (0.0025) [2024-06-15 17:25:40,958][1648982] Fps is (10 sec: 52426.6, 60 sec: 43691.0, 300 sec: 44209.0). Total num frames: 938082304. Throughput: 0: 10877.0. Samples: 234566144. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:25:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:25:43,825][1651596] Signal inference workers to stop experience collection... (23750 times) [2024-06-15 17:25:43,883][1653645] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-06-15 17:25:44,144][1651596] Signal inference workers to resume experience collection... (23750 times) [2024-06-15 17:25:44,145][1653645] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-06-15 17:25:44,298][1653645] Updated weights for policy 0, policy_version 458103 (0.0013) [2024-06-15 17:25:45,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 938246144. Throughput: 0: 11104.7. Samples: 234642944. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:25:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:25:46,821][1653645] Updated weights for policy 0, policy_version 458160 (0.0013) [2024-06-15 17:25:49,197][1653645] Updated weights for policy 0, policy_version 458230 (0.0014) [2024-06-15 17:25:50,958][1648982] Fps is (10 sec: 49154.7, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 938573824. Throughput: 0: 11286.8. Samples: 234673664. 
Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:25:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:25:50,999][1653645] Updated weights for policy 0, policy_version 458304 (0.0013) [2024-06-15 17:25:55,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 938704896. Throughput: 0: 10934.1. Samples: 234739712. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:25:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:25:56,005][1653645] Updated weights for policy 0, policy_version 458359 (0.0013) [2024-06-15 17:25:58,712][1653645] Updated weights for policy 0, policy_version 458428 (0.0012) [2024-06-15 17:26:00,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 938967040. Throughput: 0: 11298.1. Samples: 234807808. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:26:01,097][1653645] Updated weights for policy 0, policy_version 458492 (0.0012) [2024-06-15 17:26:02,558][1653645] Updated weights for policy 0, policy_version 458544 (0.0028) [2024-06-15 17:26:05,957][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 939130880. Throughput: 0: 11013.7. Samples: 234835968. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:26:06,420][1653645] Updated weights for policy 0, policy_version 458579 (0.0035) [2024-06-15 17:26:09,432][1653645] Updated weights for policy 0, policy_version 458646 (0.0014) [2024-06-15 17:26:10,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 939393024. Throughput: 0: 11172.9. Samples: 234907648. 
Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:10,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 17:26:11,971][1653645] Updated weights for policy 0, policy_version 458704 (0.0016) [2024-06-15 17:26:13,652][1653645] Updated weights for policy 0, policy_version 458768 (0.0013) [2024-06-15 17:26:15,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 939655168. Throughput: 0: 11104.7. Samples: 234973696. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:26:18,007][1653645] Updated weights for policy 0, policy_version 458848 (0.0014) [2024-06-15 17:26:20,479][1653645] Updated weights for policy 0, policy_version 458881 (0.0012) [2024-06-15 17:26:20,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 45329.1, 300 sec: 43875.9). Total num frames: 939819008. Throughput: 0: 11138.9. Samples: 235011584. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:20,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 17:26:21,483][1653645] Updated weights for policy 0, policy_version 458935 (0.0013) [2024-06-15 17:26:24,642][1653645] Updated weights for policy 0, policy_version 459008 (0.0014) [2024-06-15 17:26:25,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 940146688. Throughput: 0: 11434.8. Samples: 235080704. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:26:26,075][1653645] Updated weights for policy 0, policy_version 459070 (0.0048) [2024-06-15 17:26:29,746][1651596] Signal inference workers to stop experience collection... (23800 times) [2024-06-15 17:26:29,797][1653645] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-06-15 17:26:29,948][1651596] Signal inference workers to resume experience collection... 
(23800 times) [2024-06-15 17:26:29,949][1653645] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-06-15 17:26:30,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45875.3, 300 sec: 44098.0). Total num frames: 940310528. Throughput: 0: 11127.5. Samples: 235143680. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:26:32,390][1653645] Updated weights for policy 0, policy_version 459137 (0.0014) [2024-06-15 17:26:33,720][1653645] Updated weights for policy 0, policy_version 459200 (0.0015) [2024-06-15 17:26:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 940539904. Throughput: 0: 11320.9. Samples: 235183104. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:35,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:26:35,979][1653645] Updated weights for policy 0, policy_version 459260 (0.0013) [2024-06-15 17:26:37,146][1653645] Updated weights for policy 0, policy_version 459304 (0.0014) [2024-06-15 17:26:40,320][1653645] Updated weights for policy 0, policy_version 459346 (0.0019) [2024-06-15 17:26:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.4, 300 sec: 44320.1). Total num frames: 940802048. Throughput: 0: 11275.4. Samples: 235247104. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:40,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:26:41,183][1653645] Updated weights for policy 0, policy_version 459391 (0.0038) [2024-06-15 17:26:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.2, 300 sec: 44098.0). Total num frames: 940965888. Throughput: 0: 11332.3. Samples: 235317760. 
Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:26:47,589][1653645] Updated weights for policy 0, policy_version 459488 (0.0016) [2024-06-15 17:26:49,426][1653645] Updated weights for policy 0, policy_version 459552 (0.0014) [2024-06-15 17:26:50,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 941228032. Throughput: 0: 11389.1. Samples: 235348480. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:26:52,732][1653645] Updated weights for policy 0, policy_version 459616 (0.0012) [2024-06-15 17:26:55,960][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 941359104. Throughput: 0: 11059.2. Samples: 235405312. Policy #0 lag: (min: 8.0, avg: 104.3, max: 264.0) [2024-06-15 17:26:55,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:26:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000459648_941359104.pth... [2024-06-15 17:26:56,033][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000454464_930742272.pth [2024-06-15 17:26:57,030][1653645] Updated weights for policy 0, policy_version 459680 (0.0016) [2024-06-15 17:26:57,760][1653645] Updated weights for policy 0, policy_version 459712 (0.0012) [2024-06-15 17:27:00,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 941588480. Throughput: 0: 11161.6. Samples: 235475968. 
Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:27:01,306][1653645] Updated weights for policy 0, policy_version 459781 (0.0014) [2024-06-15 17:27:02,355][1653645] Updated weights for policy 0, policy_version 459839 (0.0021) [2024-06-15 17:27:04,936][1653645] Updated weights for policy 0, policy_version 459899 (0.0050) [2024-06-15 17:27:05,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45875.0, 300 sec: 44431.2). Total num frames: 941883392. Throughput: 0: 11025.0. Samples: 235507712. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:05,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:27:08,774][1653645] Updated weights for policy 0, policy_version 459957 (0.0013) [2024-06-15 17:27:10,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 942014464. Throughput: 0: 11195.7. Samples: 235584512. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:27:11,854][1653645] Updated weights for policy 0, policy_version 460016 (0.0013) [2024-06-15 17:27:15,329][1651596] Signal inference workers to stop experience collection... (23850 times) [2024-06-15 17:27:15,356][1653645] Updated weights for policy 0, policy_version 460098 (0.0013) [2024-06-15 17:27:15,395][1653645] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-06-15 17:27:15,558][1651596] Signal inference workers to resume experience collection... (23850 times) [2024-06-15 17:27:15,559][1653645] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-06-15 17:27:15,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 942342144. Throughput: 0: 11207.1. Samples: 235648000. 
Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:27:16,415][1653645] Updated weights for policy 0, policy_version 460153 (0.0018) [2024-06-15 17:27:19,369][1653645] Updated weights for policy 0, policy_version 460198 (0.0012) [2024-06-15 17:27:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 942538752. Throughput: 0: 11150.2. Samples: 235684864. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:27:22,221][1653645] Updated weights for policy 0, policy_version 460228 (0.0012) [2024-06-15 17:27:23,532][1653645] Updated weights for policy 0, policy_version 460288 (0.0012) [2024-06-15 17:27:24,972][1653645] Updated weights for policy 0, policy_version 460344 (0.0017) [2024-06-15 17:27:25,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 942800896. Throughput: 0: 11275.4. Samples: 235754496. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:27:27,372][1653645] Updated weights for policy 0, policy_version 460373 (0.0012) [2024-06-15 17:27:28,225][1653645] Updated weights for policy 0, policy_version 460415 (0.0011) [2024-06-15 17:27:30,110][1653645] Updated weights for policy 0, policy_version 460470 (0.0014) [2024-06-15 17:27:30,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45875.0, 300 sec: 44875.5). Total num frames: 943063040. Throughput: 0: 11286.7. Samples: 235825664. 
Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:30,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:27:33,317][1653645] Updated weights for policy 0, policy_version 460499 (0.0013) [2024-06-15 17:27:34,634][1653645] Updated weights for policy 0, policy_version 460560 (0.0012) [2024-06-15 17:27:35,828][1653645] Updated weights for policy 0, policy_version 460602 (0.0012) [2024-06-15 17:27:35,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 46421.3, 300 sec: 44653.3). Total num frames: 943325184. Throughput: 0: 11582.6. Samples: 235869696. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:27:39,233][1653645] Updated weights for policy 0, policy_version 460656 (0.0013) [2024-06-15 17:27:40,792][1653645] Updated weights for policy 0, policy_version 460704 (0.0026) [2024-06-15 17:27:40,962][1648982] Fps is (10 sec: 45855.0, 60 sec: 45325.6, 300 sec: 44874.8). Total num frames: 943521792. Throughput: 0: 11661.1. Samples: 235930112. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:40,963][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:27:45,131][1653645] Updated weights for policy 0, policy_version 460741 (0.0011) [2024-06-15 17:27:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 943685632. Throughput: 0: 11696.4. Samples: 236002304. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:27:46,908][1653645] Updated weights for policy 0, policy_version 460816 (0.0129) [2024-06-15 17:27:50,549][1653645] Updated weights for policy 0, policy_version 460865 (0.0019) [2024-06-15 17:27:50,958][1648982] Fps is (10 sec: 36060.7, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 943882240. Throughput: 0: 11514.3. Samples: 236025856. 
Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:50,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:27:51,999][1653645] Updated weights for policy 0, policy_version 460924 (0.0033) [2024-06-15 17:27:53,394][1653645] Updated weights for policy 0, policy_version 460976 (0.0013) [2024-06-15 17:27:55,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 944111616. Throughput: 0: 11252.6. Samples: 236090880. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:27:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:27:57,699][1653645] Updated weights for policy 0, policy_version 461008 (0.0015) [2024-06-15 17:27:59,121][1651596] Signal inference workers to stop experience collection... (23900 times) [2024-06-15 17:27:59,175][1653645] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-06-15 17:27:59,345][1651596] Signal inference workers to resume experience collection... (23900 times) [2024-06-15 17:27:59,354][1653645] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-06-15 17:28:00,261][1653645] Updated weights for policy 0, policy_version 461114 (0.0126) [2024-06-15 17:28:00,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 944373760. Throughput: 0: 11173.0. Samples: 236150784. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:28:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:28:04,140][1653645] Updated weights for policy 0, policy_version 461169 (0.0060) [2024-06-15 17:28:05,919][1653645] Updated weights for policy 0, policy_version 461243 (0.0023) [2024-06-15 17:28:05,959][1648982] Fps is (10 sec: 49146.7, 60 sec: 45328.4, 300 sec: 44764.3). Total num frames: 944603136. Throughput: 0: 11241.0. Samples: 236190720. 
Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:28:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:28:10,958][1648982] Fps is (10 sec: 29490.8, 60 sec: 44236.6, 300 sec: 43653.6). Total num frames: 944668672. Throughput: 0: 11002.3. Samples: 236249600. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:28:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:28:12,380][1653645] Updated weights for policy 0, policy_version 461329 (0.0015) [2024-06-15 17:28:13,554][1653645] Updated weights for policy 0, policy_version 461374 (0.0012) [2024-06-15 17:28:15,958][1648982] Fps is (10 sec: 39325.8, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 944996352. Throughput: 0: 10911.3. Samples: 236316672. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:28:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:28:16,270][1653645] Updated weights for policy 0, policy_version 461440 (0.0013) [2024-06-15 17:28:17,714][1653645] Updated weights for policy 0, policy_version 461500 (0.0019) [2024-06-15 17:28:20,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 945160192. Throughput: 0: 10513.1. Samples: 236342784. Policy #0 lag: (min: 15.0, avg: 98.1, max: 239.0) [2024-06-15 17:28:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:28:25,682][1653645] Updated weights for policy 0, policy_version 461600 (0.0071) [2024-06-15 17:28:25,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 945356800. Throughput: 0: 10605.2. Samples: 236407296. 
Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:28:25,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:28:28,131][1653645] Updated weights for policy 0, policy_version 461648 (0.0013) [2024-06-15 17:28:30,404][1653645] Updated weights for policy 0, policy_version 461744 (0.0030) [2024-06-15 17:28:30,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 945684480. Throughput: 0: 10149.0. Samples: 236459008. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:28:30,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:28:35,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 39321.6, 300 sec: 43542.5). Total num frames: 945684480. Throughput: 0: 10422.1. Samples: 236494848. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:28:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:28:37,146][1653645] Updated weights for policy 0, policy_version 461793 (0.0013) [2024-06-15 17:28:38,982][1653645] Updated weights for policy 0, policy_version 461858 (0.0012) [2024-06-15 17:28:40,958][1648982] Fps is (10 sec: 26214.4, 60 sec: 40417.0, 300 sec: 44098.0). Total num frames: 945946624. Throughput: 0: 10319.7. Samples: 236555264. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:28:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:28:41,223][1653645] Updated weights for policy 0, policy_version 461906 (0.0013) [2024-06-15 17:28:43,052][1651596] Signal inference workers to stop experience collection... (23950 times) [2024-06-15 17:28:43,128][1653645] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-06-15 17:28:43,131][1653645] Updated weights for policy 0, policy_version 461972 (0.0149) [2024-06-15 17:28:43,312][1651596] Signal inference workers to resume experience collection... 
(23950 times) [2024-06-15 17:28:43,313][1653645] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-06-15 17:28:45,961][1648982] Fps is (10 sec: 52409.5, 60 sec: 42049.6, 300 sec: 43653.1). Total num frames: 946208768. Throughput: 0: 10455.3. Samples: 236621312. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:28:45,962][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:28:48,479][1653645] Updated weights for policy 0, policy_version 462035 (0.0012) [2024-06-15 17:28:50,086][1653645] Updated weights for policy 0, policy_version 462112 (0.0012) [2024-06-15 17:28:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43144.7, 300 sec: 44431.2). Total num frames: 946470912. Throughput: 0: 10445.1. Samples: 236660736. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:28:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:28:50,958][1653645] Updated weights for policy 0, policy_version 462144 (0.0011) [2024-06-15 17:28:54,148][1653645] Updated weights for policy 0, policy_version 462198 (0.0021) [2024-06-15 17:28:55,958][1648982] Fps is (10 sec: 49170.2, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 946700288. Throughput: 0: 10558.6. Samples: 236724736. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:28:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:28:56,062][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000462272_946733056.pth... [2024-06-15 17:28:56,125][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000457040_936017920.pth [2024-06-15 17:28:59,421][1653645] Updated weights for policy 0, policy_version 462274 (0.0013) [2024-06-15 17:29:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 43986.9). Total num frames: 946864128. Throughput: 0: 10501.7. Samples: 236789248. 
Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:29:01,340][1653645] Updated weights for policy 0, policy_version 462340 (0.0013) [2024-06-15 17:29:05,562][1653645] Updated weights for policy 0, policy_version 462432 (0.0099) [2024-06-15 17:29:05,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 41506.9, 300 sec: 44320.1). Total num frames: 947093504. Throughput: 0: 10672.4. Samples: 236823040. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:29:07,403][1653645] Updated weights for policy 0, policy_version 462512 (0.0017) [2024-06-15 17:29:10,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.7, 300 sec: 43542.6). Total num frames: 947257344. Throughput: 0: 10706.5. Samples: 236889088. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:29:13,460][1653645] Updated weights for policy 0, policy_version 462597 (0.0148) [2024-06-15 17:29:14,693][1653645] Updated weights for policy 0, policy_version 462649 (0.0013) [2024-06-15 17:29:15,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 44431.2). Total num frames: 947519488. Throughput: 0: 10865.8. Samples: 236947968. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:29:18,386][1653645] Updated weights for policy 0, policy_version 462704 (0.0012) [2024-06-15 17:29:20,260][1653645] Updated weights for policy 0, policy_version 462782 (0.0017) [2024-06-15 17:29:20,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 947781632. Throughput: 0: 10888.5. Samples: 236984832. 
Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:20,959][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 17:29:25,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 43144.4, 300 sec: 44097.9). Total num frames: 947945472. Throughput: 0: 11059.1. Samples: 237052928. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:29:26,047][1653645] Updated weights for policy 0, policy_version 462868 (0.0083) [2024-06-15 17:29:26,973][1653645] Updated weights for policy 0, policy_version 462911 (0.0011) [2024-06-15 17:29:29,452][1651596] Signal inference workers to stop experience collection... (24000 times) [2024-06-15 17:29:29,515][1653645] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-06-15 17:29:29,742][1651596] Signal inference workers to resume experience collection... (24000 times) [2024-06-15 17:29:29,743][1653645] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-06-15 17:29:30,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 44209.0). Total num frames: 948174848. Throughput: 0: 10878.1. Samples: 237110784. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:29:31,046][1653645] Updated weights for policy 0, policy_version 462979 (0.0014) [2024-06-15 17:29:32,062][1653645] Updated weights for policy 0, policy_version 463030 (0.0014) [2024-06-15 17:29:35,958][1648982] Fps is (10 sec: 36046.1, 60 sec: 43690.8, 300 sec: 43542.7). Total num frames: 948305920. Throughput: 0: 10706.5. Samples: 237142528. 
Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:29:36,554][1653645] Updated weights for policy 0, policy_version 463072 (0.0012) [2024-06-15 17:29:38,921][1653645] Updated weights for policy 0, policy_version 463157 (0.0016) [2024-06-15 17:29:40,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 948568064. Throughput: 0: 10729.2. Samples: 237207552. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:40,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 17:29:42,154][1653645] Updated weights for policy 0, policy_version 463206 (0.0013) [2024-06-15 17:29:43,655][1653645] Updated weights for policy 0, policy_version 463265 (0.0012) [2024-06-15 17:29:45,957][1648982] Fps is (10 sec: 52429.0, 60 sec: 43693.5, 300 sec: 43764.7). Total num frames: 948830208. Throughput: 0: 10774.8. Samples: 237274112. Policy #0 lag: (min: 44.0, avg: 108.8, max: 268.0) [2024-06-15 17:29:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:29:48,146][1653645] Updated weights for policy 0, policy_version 463328 (0.0013) [2024-06-15 17:29:50,450][1653645] Updated weights for policy 0, policy_version 463408 (0.0014) [2024-06-15 17:29:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 949092352. Throughput: 0: 10956.8. Samples: 237316096. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:29:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:29:54,604][1653645] Updated weights for policy 0, policy_version 463488 (0.0124) [2024-06-15 17:29:55,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 42598.3, 300 sec: 43875.8). Total num frames: 949256192. Throughput: 0: 10706.5. Samples: 237370880. 
Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:29:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:29:59,762][1653645] Updated weights for policy 0, policy_version 463553 (0.0012) [2024-06-15 17:30:00,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 949452800. Throughput: 0: 11013.7. Samples: 237443584. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:30:01,545][1653645] Updated weights for policy 0, policy_version 463620 (0.0012) [2024-06-15 17:30:05,643][1653645] Updated weights for policy 0, policy_version 463682 (0.0014) [2024-06-15 17:30:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 43875.8). Total num frames: 949649408. Throughput: 0: 10774.7. Samples: 237469696. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:05,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:30:07,479][1653645] Updated weights for policy 0, policy_version 463747 (0.0014) [2024-06-15 17:30:10,958][1648982] Fps is (10 sec: 42596.5, 60 sec: 43690.4, 300 sec: 43542.5). Total num frames: 949878784. Throughput: 0: 10604.1. Samples: 237530112. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:30:12,120][1653645] Updated weights for policy 0, policy_version 463809 (0.0026) [2024-06-15 17:30:13,693][1651596] Signal inference workers to stop experience collection... (24050 times) [2024-06-15 17:30:13,745][1653645] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-06-15 17:30:13,918][1651596] Signal inference workers to resume experience collection... 
(24050 times) [2024-06-15 17:30:13,919][1653645] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-06-15 17:30:13,921][1653645] Updated weights for policy 0, policy_version 463888 (0.0038) [2024-06-15 17:30:15,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 950140928. Throughput: 0: 10729.3. Samples: 237593600. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:30:19,061][1653645] Updated weights for policy 0, policy_version 463984 (0.0019) [2024-06-15 17:30:20,663][1653645] Updated weights for policy 0, policy_version 464032 (0.0011) [2024-06-15 17:30:20,958][1648982] Fps is (10 sec: 45877.2, 60 sec: 42598.5, 300 sec: 43764.7). Total num frames: 950337536. Throughput: 0: 10934.0. Samples: 237634560. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:30:21,360][1653645] Updated weights for policy 0, policy_version 464062 (0.0012) [2024-06-15 17:30:25,218][1653645] Updated weights for policy 0, policy_version 464144 (0.0032) [2024-06-15 17:30:25,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44237.0, 300 sec: 44209.0). Total num frames: 950599680. Throughput: 0: 10934.1. Samples: 237699584. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:25,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:30:26,326][1653645] Updated weights for policy 0, policy_version 464186 (0.0016) [2024-06-15 17:30:30,320][1653645] Updated weights for policy 0, policy_version 464240 (0.0012) [2024-06-15 17:30:30,958][1648982] Fps is (10 sec: 45873.3, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 950796288. Throughput: 0: 11013.6. Samples: 237769728. 
Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:30,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:30:31,550][1653645] Updated weights for policy 0, policy_version 464276 (0.0010) [2024-06-15 17:30:34,983][1653645] Updated weights for policy 0, policy_version 464336 (0.0011) [2024-06-15 17:30:35,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 43987.0). Total num frames: 951058432. Throughput: 0: 10831.7. Samples: 237803520. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:30:37,055][1653645] Updated weights for policy 0, policy_version 464437 (0.0116) [2024-06-15 17:30:40,958][1648982] Fps is (10 sec: 39323.4, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 951189504. Throughput: 0: 11047.9. Samples: 237868032. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:30:42,724][1653645] Updated weights for policy 0, policy_version 464496 (0.0012) [2024-06-15 17:30:44,459][1653645] Updated weights for policy 0, policy_version 464560 (0.0132) [2024-06-15 17:30:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 951451648. Throughput: 0: 10865.8. Samples: 237932544. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 17:30:47,831][1653645] Updated weights for policy 0, policy_version 464624 (0.0013) [2024-06-15 17:30:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 951713792. Throughput: 0: 10956.8. Samples: 237962752. 
Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:30:54,213][1653645] Updated weights for policy 0, policy_version 464706 (0.0011) [2024-06-15 17:30:55,749][1653645] Updated weights for policy 0, policy_version 464771 (0.0090) [2024-06-15 17:30:55,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 951844864. Throughput: 0: 11377.8. Samples: 238042112. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:30:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:30:56,363][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000464800_951910400.pth... [2024-06-15 17:30:56,525][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000459648_941359104.pth [2024-06-15 17:30:56,529][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000464800_951910400.pth [2024-06-15 17:30:58,438][1651596] Signal inference workers to stop experience collection... (24100 times) [2024-06-15 17:30:58,480][1653645] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-06-15 17:30:58,691][1651596] Signal inference workers to resume experience collection... (24100 times) [2024-06-15 17:30:58,692][1653645] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-06-15 17:30:58,694][1653645] Updated weights for policy 0, policy_version 464848 (0.0014) [2024-06-15 17:31:00,025][1653645] Updated weights for policy 0, policy_version 464901 (0.0012) [2024-06-15 17:31:00,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 952205312. Throughput: 0: 11264.0. Samples: 238100480. 
Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:31:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:31:01,200][1653645] Updated weights for policy 0, policy_version 464956 (0.0087) [2024-06-15 17:31:05,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 43690.8, 300 sec: 43653.7). Total num frames: 952270848. Throughput: 0: 11161.6. Samples: 238136832. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:31:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:31:06,492][1653645] Updated weights for policy 0, policy_version 465013 (0.0012) [2024-06-15 17:31:08,221][1653645] Updated weights for policy 0, policy_version 465088 (0.0030) [2024-06-15 17:31:10,584][1653645] Updated weights for policy 0, policy_version 465152 (0.0014) [2024-06-15 17:31:10,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45875.5, 300 sec: 43986.9). Total num frames: 952631296. Throughput: 0: 11275.4. Samples: 238206976. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:31:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:31:12,093][1653645] Updated weights for policy 0, policy_version 465213 (0.0137) [2024-06-15 17:31:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 952762368. Throughput: 0: 11230.0. Samples: 238275072. Policy #0 lag: (min: 63.0, avg: 139.0, max: 303.0) [2024-06-15 17:31:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:31:18,701][1653645] Updated weights for policy 0, policy_version 465266 (0.0012) [2024-06-15 17:31:20,269][1653645] Updated weights for policy 0, policy_version 465344 (0.0012) [2024-06-15 17:31:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44782.9, 300 sec: 43653.6). Total num frames: 953024512. Throughput: 0: 11355.0. Samples: 238314496. 
Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:31:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:31:22,779][1653645] Updated weights for policy 0, policy_version 465424 (0.0011) [2024-06-15 17:31:25,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 44782.7, 300 sec: 43986.8). Total num frames: 953286656. Throughput: 0: 11081.9. Samples: 238366720. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:31:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:31:30,060][1653645] Updated weights for policy 0, policy_version 465489 (0.0013) [2024-06-15 17:31:30,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43144.8, 300 sec: 43542.6). Total num frames: 953384960. Throughput: 0: 11332.3. Samples: 238442496. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:31:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:31:32,033][1653645] Updated weights for policy 0, policy_version 465571 (0.0014) [2024-06-15 17:31:33,193][1653645] Updated weights for policy 0, policy_version 465607 (0.0012) [2024-06-15 17:31:34,753][1653645] Updated weights for policy 0, policy_version 465669 (0.0012) [2024-06-15 17:31:35,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 45874.9, 300 sec: 44097.9). Total num frames: 953810944. Throughput: 0: 11320.8. Samples: 238472192. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:31:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:31:40,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 953810944. Throughput: 0: 11173.0. Samples: 238544896. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:31:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:31:41,475][1653645] Updated weights for policy 0, policy_version 465732 (0.0012) [2024-06-15 17:31:41,736][1651596] Signal inference workers to stop experience collection... 
(24150 times) [2024-06-15 17:31:41,813][1653645] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-06-15 17:31:41,985][1651596] Signal inference workers to resume experience collection... (24150 times) [2024-06-15 17:31:41,986][1653645] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-06-15 17:31:42,634][1653645] Updated weights for policy 0, policy_version 465792 (0.0012) [2024-06-15 17:31:44,310][1653645] Updated weights for policy 0, policy_version 465862 (0.0014) [2024-06-15 17:31:45,702][1653645] Updated weights for policy 0, policy_version 465920 (0.0010) [2024-06-15 17:31:45,958][1648982] Fps is (10 sec: 39323.3, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 954204160. Throughput: 0: 11286.8. Samples: 238608384. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:31:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:31:50,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 954335232. Throughput: 0: 11195.6. Samples: 238640640. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:31:50,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:31:53,634][1653645] Updated weights for policy 0, policy_version 466000 (0.0013) [2024-06-15 17:31:55,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 954531840. Throughput: 0: 11229.9. Samples: 238712320. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:31:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:31:55,994][1653645] Updated weights for policy 0, policy_version 466096 (0.0014) [2024-06-15 17:31:58,404][1653645] Updated weights for policy 0, policy_version 466192 (0.0012) [2024-06-15 17:31:59,354][1653645] Updated weights for policy 0, policy_version 466237 (0.0037) [2024-06-15 17:32:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44236.6, 300 sec: 43986.9). Total num frames: 954859520. 
Throughput: 0: 10797.5. Samples: 238760960. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:00,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 17:32:05,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 43144.4, 300 sec: 43542.5). Total num frames: 954859520. Throughput: 0: 10820.2. Samples: 238801408. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:32:07,612][1653645] Updated weights for policy 0, policy_version 466282 (0.0016) [2024-06-15 17:32:08,924][1653645] Updated weights for policy 0, policy_version 466337 (0.0014) [2024-06-15 17:32:10,522][1653645] Updated weights for policy 0, policy_version 466404 (0.0012) [2024-06-15 17:32:10,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 955219968. Throughput: 0: 11093.4. Samples: 238865920. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:32:12,491][1653645] Updated weights for policy 0, policy_version 466484 (0.0013) [2024-06-15 17:32:15,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 955383808. Throughput: 0: 10831.6. Samples: 238929920. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:32:19,666][1653645] Updated weights for policy 0, policy_version 466544 (0.0014) [2024-06-15 17:32:20,960][1648982] Fps is (10 sec: 39315.3, 60 sec: 43143.3, 300 sec: 43431.2). Total num frames: 955613184. Throughput: 0: 11024.7. Samples: 238968320. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:20,960][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:32:21,073][1651596] Signal inference workers to stop experience collection... 
(24200 times) [2024-06-15 17:32:21,091][1653645] Updated weights for policy 0, policy_version 466609 (0.0020) [2024-06-15 17:32:21,107][1653645] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-06-15 17:32:21,299][1651596] Signal inference workers to resume experience collection... (24200 times) [2024-06-15 17:32:21,300][1653645] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-06-15 17:32:22,837][1653645] Updated weights for policy 0, policy_version 466688 (0.0022) [2024-06-15 17:32:24,280][1653645] Updated weights for policy 0, policy_version 466752 (0.0013) [2024-06-15 17:32:25,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.9, 300 sec: 43542.6). Total num frames: 955908096. Throughput: 0: 10615.5. Samples: 239022592. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:32:30,958][1648982] Fps is (10 sec: 32773.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 955940864. Throughput: 0: 10968.2. Samples: 239101952. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:30,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 17:32:32,229][1653645] Updated weights for policy 0, policy_version 466848 (0.0014) [2024-06-15 17:32:33,854][1653645] Updated weights for policy 0, policy_version 466913 (0.0105) [2024-06-15 17:32:35,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43144.8, 300 sec: 43654.3). Total num frames: 956399616. Throughput: 0: 10820.3. Samples: 239127552. Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:32:36,040][1653645] Updated weights for policy 0, policy_version 466998 (0.0046) [2024-06-15 17:32:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 956432384. Throughput: 0: 10763.4. Samples: 239196672. 
Policy #0 lag: (min: 15.0, avg: 70.0, max: 250.0) [2024-06-15 17:32:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:32:42,990][1653645] Updated weights for policy 0, policy_version 467030 (0.0010) [2024-06-15 17:32:44,662][1653645] Updated weights for policy 0, policy_version 467107 (0.0019) [2024-06-15 17:32:45,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 42052.3, 300 sec: 43542.6). Total num frames: 956727296. Throughput: 0: 11093.4. Samples: 239260160. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:32:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:32:46,408][1653645] Updated weights for policy 0, policy_version 467171 (0.0012) [2024-06-15 17:32:48,062][1653645] Updated weights for policy 0, policy_version 467235 (0.0013) [2024-06-15 17:32:50,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 956956672. Throughput: 0: 10683.7. Samples: 239282176. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:32:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:32:54,965][1653645] Updated weights for policy 0, policy_version 467265 (0.0014) [2024-06-15 17:32:55,958][1648982] Fps is (10 sec: 29490.9, 60 sec: 41506.2, 300 sec: 42876.1). Total num frames: 957022208. Throughput: 0: 10968.2. Samples: 239359488. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:32:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:32:56,523][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000467328_957087744.pth... 
[2024-06-15 17:32:56,732][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000462272_946733056.pth [2024-06-15 17:32:57,705][1653645] Updated weights for policy 0, policy_version 467360 (0.0116) [2024-06-15 17:32:59,580][1653645] Updated weights for policy 0, policy_version 467425 (0.0014) [2024-06-15 17:33:00,462][1651596] Signal inference workers to stop experience collection... (24250 times) [2024-06-15 17:33:00,610][1653645] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-06-15 17:33:00,849][1651596] Signal inference workers to resume experience collection... (24250 times) [2024-06-15 17:33:00,850][1653645] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-06-15 17:33:00,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 42052.4, 300 sec: 43320.6). Total num frames: 957382656. Throughput: 0: 10513.1. Samples: 239403008. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:33:01,421][1653645] Updated weights for policy 0, policy_version 467493 (0.0012) [2024-06-15 17:33:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 957480960. Throughput: 0: 10422.5. Samples: 239437312. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:33:09,013][1653645] Updated weights for policy 0, policy_version 467539 (0.0012) [2024-06-15 17:33:10,819][1653645] Updated weights for policy 0, policy_version 467601 (0.0012) [2024-06-15 17:33:10,958][1648982] Fps is (10 sec: 26214.5, 60 sec: 40414.0, 300 sec: 42876.1). Total num frames: 957644800. Throughput: 0: 10717.9. Samples: 239504896. 
Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:33:12,413][1653645] Updated weights for policy 0, policy_version 467673 (0.0012) [2024-06-15 17:33:13,827][1653645] Updated weights for policy 0, policy_version 467744 (0.0011) [2024-06-15 17:33:15,960][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 958005248. Throughput: 0: 10149.0. Samples: 239558656. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:15,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:33:20,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 39868.9, 300 sec: 42876.1). Total num frames: 958005248. Throughput: 0: 10456.2. Samples: 239598080. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:33:21,358][1653645] Updated weights for policy 0, policy_version 467795 (0.0012) [2024-06-15 17:33:23,603][1653645] Updated weights for policy 0, policy_version 467894 (0.0104) [2024-06-15 17:33:25,336][1653645] Updated weights for policy 0, policy_version 467971 (0.0013) [2024-06-15 17:33:25,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 43209.3). Total num frames: 958431232. Throughput: 0: 10217.3. Samples: 239656448. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:33:30,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 43144.2, 300 sec: 43542.5). Total num frames: 958529536. Throughput: 0: 10114.7. Samples: 239715328. 
Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:33:33,449][1653645] Updated weights for policy 0, policy_version 468048 (0.0196) [2024-06-15 17:33:34,852][1653645] Updated weights for policy 0, policy_version 468115 (0.0112) [2024-06-15 17:33:35,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 39867.8, 300 sec: 43542.6). Total num frames: 958791680. Throughput: 0: 10615.5. Samples: 239759872. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:33:36,413][1653645] Updated weights for policy 0, policy_version 468192 (0.0013) [2024-06-15 17:33:38,626][1653645] Updated weights for policy 0, policy_version 468278 (0.0029) [2024-06-15 17:33:40,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.6, 300 sec: 43543.1). Total num frames: 959053824. Throughput: 0: 10069.3. Samples: 239812608. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:33:45,131][1651596] Signal inference workers to stop experience collection... (24300 times) [2024-06-15 17:33:45,164][1653645] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-06-15 17:33:45,190][1653645] Updated weights for policy 0, policy_version 468326 (0.0013) [2024-06-15 17:33:45,287][1651596] Signal inference workers to resume experience collection... (24300 times) [2024-06-15 17:33:45,288][1653645] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-06-15 17:33:45,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 43098.3). Total num frames: 959184896. Throughput: 0: 10888.6. Samples: 239892992. 
Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:33:46,464][1653645] Updated weights for policy 0, policy_version 468389 (0.0012) [2024-06-15 17:33:48,486][1653645] Updated weights for policy 0, policy_version 468480 (0.0015) [2024-06-15 17:33:50,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 959578112. Throughput: 0: 10740.5. Samples: 239920640. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:50,959][1648982] Avg episode reward: [(0, '37.540')] [2024-06-15 17:33:55,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 43098.2). Total num frames: 959578112. Throughput: 0: 10797.5. Samples: 239990784. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:33:55,958][1648982] Avg episode reward: [(0, '37.550')] [2024-06-15 17:33:56,774][1653645] Updated weights for policy 0, policy_version 468576 (0.0011) [2024-06-15 17:33:58,311][1653645] Updated weights for policy 0, policy_version 468640 (0.0013) [2024-06-15 17:34:00,084][1653645] Updated weights for policy 0, policy_version 468720 (0.0014) [2024-06-15 17:34:00,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 43690.5, 300 sec: 43764.7). Total num frames: 960004096. Throughput: 0: 11013.6. Samples: 240054272. Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:34:00,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:34:01,673][1653645] Updated weights for policy 0, policy_version 468791 (0.0015) [2024-06-15 17:34:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 960102400. Throughput: 0: 10934.1. Samples: 240090112. 
Policy #0 lag: (min: 15.0, avg: 61.8, max: 271.0) [2024-06-15 17:34:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:34:07,975][1653645] Updated weights for policy 0, policy_version 468855 (0.0013) [2024-06-15 17:34:09,063][1653645] Updated weights for policy 0, policy_version 468899 (0.0013) [2024-06-15 17:34:10,045][1653645] Updated weights for policy 0, policy_version 468931 (0.0013) [2024-06-15 17:34:10,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 46421.3, 300 sec: 43764.7). Total num frames: 960430080. Throughput: 0: 11264.0. Samples: 240163328. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:34:11,984][1653645] Updated weights for policy 0, policy_version 469015 (0.0013) [2024-06-15 17:34:15,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 960626688. Throughput: 0: 11423.4. Samples: 240229376. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:15,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:34:19,362][1653645] Updated weights for policy 0, policy_version 469088 (0.0015) [2024-06-15 17:34:20,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 46421.4, 300 sec: 43542.6). Total num frames: 960790528. Throughput: 0: 11309.5. Samples: 240268800. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:34:21,008][1653645] Updated weights for policy 0, policy_version 469152 (0.0013) [2024-06-15 17:34:21,471][1651596] Signal inference workers to stop experience collection... (24350 times) [2024-06-15 17:34:21,519][1653645] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-06-15 17:34:21,710][1651596] Signal inference workers to resume experience collection... 
(24350 times) [2024-06-15 17:34:21,712][1653645] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-06-15 17:34:22,201][1653645] Updated weights for policy 0, policy_version 469208 (0.0012) [2024-06-15 17:34:24,588][1653645] Updated weights for policy 0, policy_version 469305 (0.0013) [2024-06-15 17:34:25,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 961150976. Throughput: 0: 11286.8. Samples: 240320512. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:34:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44237.1, 300 sec: 43653.6). Total num frames: 961183744. Throughput: 0: 11309.5. Samples: 240401920. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:34:31,572][1653645] Updated weights for policy 0, policy_version 469370 (0.0013) [2024-06-15 17:34:33,315][1653645] Updated weights for policy 0, policy_version 469440 (0.0012) [2024-06-15 17:34:35,618][1653645] Updated weights for policy 0, policy_version 469523 (0.0014) [2024-06-15 17:34:35,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 46967.4, 300 sec: 44209.0). Total num frames: 961609728. Throughput: 0: 11264.1. Samples: 240427520. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:34:40,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 961675264. Throughput: 0: 11127.5. Samples: 240491520. 
Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:34:43,291][1653645] Updated weights for policy 0, policy_version 469589 (0.0013) [2024-06-15 17:34:45,486][1653645] Updated weights for policy 0, policy_version 469680 (0.0014) [2024-06-15 17:34:45,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 45329.0, 300 sec: 43431.5). Total num frames: 961904640. Throughput: 0: 11161.7. Samples: 240556544. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:34:46,943][1653645] Updated weights for policy 0, policy_version 469728 (0.0012) [2024-06-15 17:34:49,292][1653645] Updated weights for policy 0, policy_version 469812 (0.0013) [2024-06-15 17:34:50,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 962199552. Throughput: 0: 10808.8. Samples: 240576512. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:34:55,959][1648982] Fps is (10 sec: 32767.9, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 962232320. Throughput: 0: 10899.9. Samples: 240653824. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:34:55,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:34:56,330][1653645] Updated weights for policy 0, policy_version 469863 (0.0014) [2024-06-15 17:34:56,477][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000469872_962297856.pth... 
[2024-06-15 17:34:56,665][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000464800_951910400.pth [2024-06-15 17:34:58,642][1653645] Updated weights for policy 0, policy_version 469939 (0.0015) [2024-06-15 17:35:00,093][1653645] Updated weights for policy 0, policy_version 469986 (0.0094) [2024-06-15 17:35:00,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 962592768. Throughput: 0: 10478.9. Samples: 240700928. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:35:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:35:01,767][1651596] Signal inference workers to stop experience collection... (24400 times) [2024-06-15 17:35:01,810][1653645] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-06-15 17:35:01,814][1653645] Updated weights for policy 0, policy_version 470050 (0.0016) [2024-06-15 17:35:01,970][1651596] Signal inference workers to resume experience collection... (24400 times) [2024-06-15 17:35:01,971][1653645] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-06-15 17:35:05,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.5, 300 sec: 43542.6). Total num frames: 962723840. Throughput: 0: 10342.4. Samples: 240734208. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:35:05,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:35:07,940][1653645] Updated weights for policy 0, policy_version 470096 (0.0012) [2024-06-15 17:35:09,874][1653645] Updated weights for policy 0, policy_version 470164 (0.0024) [2024-06-15 17:35:10,958][1648982] Fps is (10 sec: 36046.0, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 962953216. Throughput: 0: 10911.3. Samples: 240811520. 
Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:35:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:35:11,715][1653645] Updated weights for policy 0, policy_version 470227 (0.0012) [2024-06-15 17:35:13,339][1653645] Updated weights for policy 0, policy_version 470304 (0.0011) [2024-06-15 17:35:15,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.8, 300 sec: 43764.7). Total num frames: 963248128. Throughput: 0: 10399.3. Samples: 240869888. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:35:15,958][1648982] Avg episode reward: [(0, '36.980')] [2024-06-15 17:35:20,960][1648982] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 43209.3). Total num frames: 963346432. Throughput: 0: 10695.1. Samples: 240908800. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:35:20,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:35:21,130][1653645] Updated weights for policy 0, policy_version 470392 (0.0013) [2024-06-15 17:35:23,078][1653645] Updated weights for policy 0, policy_version 470448 (0.0012) [2024-06-15 17:35:24,217][1653645] Updated weights for policy 0, policy_version 470500 (0.0012) [2024-06-15 17:35:25,646][1653645] Updated weights for policy 0, policy_version 470576 (0.0045) [2024-06-15 17:35:25,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 43144.4, 300 sec: 43875.8). Total num frames: 963739648. Throughput: 0: 10490.3. Samples: 240963584. Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:35:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:35:30,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 963772416. Throughput: 0: 10717.9. Samples: 241038848. 
Policy #0 lag: (min: 47.0, avg: 107.1, max: 303.0) [2024-06-15 17:35:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:35:32,483][1653645] Updated weights for policy 0, policy_version 470627 (0.0013) [2024-06-15 17:35:34,141][1653645] Updated weights for policy 0, policy_version 470658 (0.0011) [2024-06-15 17:35:35,551][1653645] Updated weights for policy 0, policy_version 470724 (0.0076) [2024-06-15 17:35:35,958][1648982] Fps is (10 sec: 32768.5, 60 sec: 40960.0, 300 sec: 43653.6). Total num frames: 964067328. Throughput: 0: 10956.8. Samples: 241069568. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:35:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:35:37,060][1653645] Updated weights for policy 0, policy_version 470789 (0.0026) [2024-06-15 17:35:40,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 964296704. Throughput: 0: 10695.1. Samples: 241135104. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:35:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:35:43,352][1653645] Updated weights for policy 0, policy_version 470864 (0.0013) [2024-06-15 17:35:45,549][1653645] Updated weights for policy 0, policy_version 470913 (0.0019) [2024-06-15 17:35:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 43209.3). Total num frames: 964460544. Throughput: 0: 11229.9. Samples: 241206272. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:35:45,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:35:46,666][1651596] Signal inference workers to stop experience collection... (24450 times) [2024-06-15 17:35:46,827][1653645] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-06-15 17:35:47,107][1651596] Signal inference workers to resume experience collection... 
(24450 times) [2024-06-15 17:35:47,118][1653645] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-06-15 17:35:47,517][1653645] Updated weights for policy 0, policy_version 470992 (0.0104) [2024-06-15 17:35:49,322][1653645] Updated weights for policy 0, policy_version 471073 (0.0013) [2024-06-15 17:35:50,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 964820992. Throughput: 0: 10968.2. Samples: 241227776. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:35:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:35:55,360][1653645] Updated weights for policy 0, policy_version 471120 (0.0013) [2024-06-15 17:35:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 42987.2). Total num frames: 964886528. Throughput: 0: 10922.7. Samples: 241303040. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:35:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:35:56,360][1653645] Updated weights for policy 0, policy_version 471158 (0.0035) [2024-06-15 17:35:58,254][1653645] Updated weights for policy 0, policy_version 471217 (0.0010) [2024-06-15 17:35:59,700][1653645] Updated weights for policy 0, policy_version 471292 (0.0013) [2024-06-15 17:36:00,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44783.1, 300 sec: 44097.9). Total num frames: 965279744. Throughput: 0: 11059.2. Samples: 241367552. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:00,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:36:01,399][1653645] Updated weights for policy 0, policy_version 471358 (0.0015) [2024-06-15 17:36:05,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 965345280. Throughput: 0: 10911.2. Samples: 241399808. 
Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:36:08,145][1653645] Updated weights for policy 0, policy_version 471421 (0.0011) [2024-06-15 17:36:10,356][1653645] Updated weights for policy 0, policy_version 471474 (0.0116) [2024-06-15 17:36:10,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 44782.7, 300 sec: 43653.6). Total num frames: 965640192. Throughput: 0: 11275.3. Samples: 241470976. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:36:11,828][1653645] Updated weights for policy 0, policy_version 471545 (0.0013) [2024-06-15 17:36:13,310][1653645] Updated weights for policy 0, policy_version 471609 (0.0013) [2024-06-15 17:36:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 965869568. Throughput: 0: 10911.2. Samples: 241529856. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:15,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 17:36:20,332][1653645] Updated weights for policy 0, policy_version 471677 (0.0021) [2024-06-15 17:36:20,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 44783.0, 300 sec: 43209.4). Total num frames: 966033408. Throughput: 0: 11116.1. Samples: 241569792. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:36:21,405][1653645] Updated weights for policy 0, policy_version 471715 (0.0013) [2024-06-15 17:36:22,809][1653645] Updated weights for policy 0, policy_version 471778 (0.0014) [2024-06-15 17:36:24,928][1653645] Updated weights for policy 0, policy_version 471840 (0.0014) [2024-06-15 17:36:25,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 966393856. Throughput: 0: 11025.0. Samples: 241631232. 
Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:36:30,907][1653645] Updated weights for policy 0, policy_version 471904 (0.0015) [2024-06-15 17:36:30,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44782.7, 300 sec: 42876.1). Total num frames: 966459392. Throughput: 0: 11161.5. Samples: 241708544. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:30,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:36:31,058][1651596] Signal inference workers to stop experience collection... (24500 times) [2024-06-15 17:36:31,155][1653645] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-06-15 17:36:31,245][1651596] Signal inference workers to resume experience collection... (24500 times) [2024-06-15 17:36:31,246][1653645] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-06-15 17:36:31,505][1653645] Updated weights for policy 0, policy_version 471933 (0.0025) [2024-06-15 17:36:33,124][1653645] Updated weights for policy 0, policy_version 471986 (0.0013) [2024-06-15 17:36:34,380][1653645] Updated weights for policy 0, policy_version 472050 (0.0011) [2024-06-15 17:36:35,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 966787072. Throughput: 0: 11275.4. Samples: 241735168. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:36:36,668][1653645] Updated weights for policy 0, policy_version 472099 (0.0018) [2024-06-15 17:36:40,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 966918144. Throughput: 0: 11093.3. Samples: 241802240. 
Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:40,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:36:42,274][1653645] Updated weights for policy 0, policy_version 472176 (0.0014) [2024-06-15 17:36:45,002][1653645] Updated weights for policy 0, policy_version 472230 (0.0115) [2024-06-15 17:36:45,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.2, 300 sec: 43653.7). Total num frames: 967213056. Throughput: 0: 11207.1. Samples: 241871872. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:36:46,701][1653645] Updated weights for policy 0, policy_version 472310 (0.0013) [2024-06-15 17:36:48,406][1653645] Updated weights for policy 0, policy_version 472374 (0.0013) [2024-06-15 17:36:50,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 967442432. Throughput: 0: 11036.5. Samples: 241896448. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:36:54,760][1653645] Updated weights for policy 0, policy_version 472432 (0.0034) [2024-06-15 17:36:55,958][1648982] Fps is (10 sec: 36043.1, 60 sec: 44782.6, 300 sec: 43098.2). Total num frames: 967573504. Throughput: 0: 11150.2. Samples: 241972736. Policy #0 lag: (min: 50.0, avg: 115.0, max: 306.0) [2024-06-15 17:36:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:36:56,291][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000472464_967606272.pth... 
[2024-06-15 17:36:56,420][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000467328_957087744.pth [2024-06-15 17:36:56,804][1653645] Updated weights for policy 0, policy_version 472483 (0.0033) [2024-06-15 17:36:58,310][1653645] Updated weights for policy 0, policy_version 472551 (0.0014) [2024-06-15 17:36:59,680][1653645] Updated weights for policy 0, policy_version 472608 (0.0011) [2024-06-15 17:37:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 967966720. Throughput: 0: 11195.8. Samples: 242033664. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:37:05,211][1653645] Updated weights for policy 0, policy_version 472642 (0.0013) [2024-06-15 17:37:05,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 44782.9, 300 sec: 43431.5). Total num frames: 968032256. Throughput: 0: 11207.0. Samples: 242074112. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:05,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:37:07,328][1653645] Updated weights for policy 0, policy_version 472705 (0.0013) [2024-06-15 17:37:09,162][1653645] Updated weights for policy 0, policy_version 472800 (0.0105) [2024-06-15 17:37:10,725][1653645] Updated weights for policy 0, policy_version 472836 (0.0076) [2024-06-15 17:37:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.4, 300 sec: 44098.0). Total num frames: 968392704. Throughput: 0: 11241.3. Samples: 242137088. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:10,958][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 17:37:12,032][1653645] Updated weights for policy 0, policy_version 472887 (0.0018) [2024-06-15 17:37:15,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 43690.8, 300 sec: 43653.9). Total num frames: 968491008. Throughput: 0: 11207.2. Samples: 242212864. 
Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:37:16,819][1651596] Signal inference workers to stop experience collection... (24550 times) [2024-06-15 17:37:16,901][1653645] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-06-15 17:37:17,089][1651596] Signal inference workers to resume experience collection... (24550 times) [2024-06-15 17:37:17,090][1653645] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-06-15 17:37:17,535][1653645] Updated weights for policy 0, policy_version 472928 (0.0013) [2024-06-15 17:37:18,568][1653645] Updated weights for policy 0, policy_version 472960 (0.0012) [2024-06-15 17:37:20,450][1653645] Updated weights for policy 0, policy_version 473009 (0.0014) [2024-06-15 17:37:20,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 43542.6). Total num frames: 968753152. Throughput: 0: 11309.5. Samples: 242244096. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:20,958][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 17:37:21,877][1653645] Updated weights for policy 0, policy_version 473072 (0.0012) [2024-06-15 17:37:23,660][1653645] Updated weights for policy 0, policy_version 473142 (0.0026) [2024-06-15 17:37:25,964][1648982] Fps is (10 sec: 52396.9, 60 sec: 43686.3, 300 sec: 44319.2). Total num frames: 969015296. Throughput: 0: 11137.3. Samples: 242303488. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:25,968][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 17:37:29,340][1653645] Updated weights for policy 0, policy_version 473169 (0.0012) [2024-06-15 17:37:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44783.1, 300 sec: 43209.3). Total num frames: 969146368. Throughput: 0: 11161.6. Samples: 242374144. 
Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:37:32,814][1653645] Updated weights for policy 0, policy_version 473264 (0.0013) [2024-06-15 17:37:34,007][1653645] Updated weights for policy 0, policy_version 473314 (0.0013) [2024-06-15 17:37:35,958][1648982] Fps is (10 sec: 45903.6, 60 sec: 44783.0, 300 sec: 44209.1). Total num frames: 969474048. Throughput: 0: 11207.1. Samples: 242400768. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:35,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:37:36,061][1653645] Updated weights for policy 0, policy_version 473392 (0.0013) [2024-06-15 17:37:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 969539584. Throughput: 0: 10877.3. Samples: 242462208. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:37:42,703][1653645] Updated weights for policy 0, policy_version 473467 (0.0013) [2024-06-15 17:37:45,263][1653645] Updated weights for policy 0, policy_version 473520 (0.0013) [2024-06-15 17:37:45,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 969801728. Throughput: 0: 11093.3. Samples: 242532864. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:37:47,153][1653645] Updated weights for policy 0, policy_version 473585 (0.0024) [2024-06-15 17:37:48,648][1653645] Updated weights for policy 0, policy_version 473648 (0.0101) [2024-06-15 17:37:50,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 970063872. Throughput: 0: 10706.5. Samples: 242555904. 
Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:50,961][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:37:53,991][1653645] Updated weights for policy 0, policy_version 473680 (0.0019) [2024-06-15 17:37:55,786][1653645] Updated weights for policy 0, policy_version 473730 (0.0014) [2024-06-15 17:37:55,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43690.8, 300 sec: 43431.4). Total num frames: 970194944. Throughput: 0: 11025.0. Samples: 242633216. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:37:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:37:57,672][1651596] Signal inference workers to stop experience collection... (24600 times) [2024-06-15 17:37:57,690][1653645] Updated weights for policy 0, policy_version 473810 (0.0086) [2024-06-15 17:37:57,718][1653645] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-06-15 17:37:57,979][1651596] Signal inference workers to resume experience collection... (24600 times) [2024-06-15 17:37:57,980][1653645] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-06-15 17:37:58,901][1653645] Updated weights for policy 0, policy_version 473858 (0.0011) [2024-06-15 17:38:00,371][1653645] Updated weights for policy 0, policy_version 473915 (0.0013) [2024-06-15 17:38:00,957][1648982] Fps is (10 sec: 52431.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 970588160. Throughput: 0: 10490.4. Samples: 242684928. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:38:00,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 17:38:05,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 42598.6, 300 sec: 43875.8). Total num frames: 970588160. Throughput: 0: 10717.9. Samples: 242726400. 
Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:38:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:38:06,804][1653645] Updated weights for policy 0, policy_version 473968 (0.0014) [2024-06-15 17:38:08,585][1653645] Updated weights for policy 0, policy_version 474019 (0.0012) [2024-06-15 17:38:10,383][1653645] Updated weights for policy 0, policy_version 474087 (0.0011) [2024-06-15 17:38:10,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 970981376. Throughput: 0: 10912.8. Samples: 242794496. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:38:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:38:12,396][1653645] Updated weights for policy 0, policy_version 474176 (0.0016) [2024-06-15 17:38:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 971112448. Throughput: 0: 10683.7. Samples: 242854912. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:38:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:38:20,441][1653645] Updated weights for policy 0, policy_version 474256 (0.0078) [2024-06-15 17:38:20,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 42598.4, 300 sec: 43653.6). Total num frames: 971309056. Throughput: 0: 10945.4. Samples: 242893312. Policy #0 lag: (min: 80.0, avg: 159.9, max: 322.0) [2024-06-15 17:38:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:38:22,186][1653645] Updated weights for policy 0, policy_version 474306 (0.0046) [2024-06-15 17:38:23,621][1653645] Updated weights for policy 0, policy_version 474369 (0.0088) [2024-06-15 17:38:24,908][1653645] Updated weights for policy 0, policy_version 474432 (0.0089) [2024-06-15 17:38:25,958][1648982] Fps is (10 sec: 52425.5, 60 sec: 43694.6, 300 sec: 44431.2). Total num frames: 971636736. Throughput: 0: 10660.8. Samples: 242941952. 
Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:38:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:38:30,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 41506.1, 300 sec: 43542.5). Total num frames: 971636736. Throughput: 0: 10763.4. Samples: 243017216. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:38:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:38:33,175][1653645] Updated weights for policy 0, policy_version 474528 (0.0021) [2024-06-15 17:38:33,943][1653645] Updated weights for policy 0, policy_version 474560 (0.0012) [2024-06-15 17:38:35,778][1653645] Updated weights for policy 0, policy_version 474640 (0.0045) [2024-06-15 17:38:35,958][1648982] Fps is (10 sec: 42601.3, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 972062720. Throughput: 0: 10786.2. Samples: 243041280. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:38:35,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:38:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 972161024. Throughput: 0: 10558.6. Samples: 243108352. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:38:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:38:43,276][1653645] Updated weights for policy 0, policy_version 474708 (0.0153) [2024-06-15 17:38:43,576][1651596] Signal inference workers to stop experience collection... (24650 times) [2024-06-15 17:38:43,618][1653645] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-06-15 17:38:43,866][1651596] Signal inference workers to resume experience collection... (24650 times) [2024-06-15 17:38:43,866][1653645] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-06-15 17:38:45,482][1653645] Updated weights for policy 0, policy_version 474812 (0.0084) [2024-06-15 17:38:45,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 43542.6). 
Total num frames: 972423168. Throughput: 0: 10968.1. Samples: 243178496. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:38:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:38:46,937][1653645] Updated weights for policy 0, policy_version 474880 (0.0012) [2024-06-15 17:38:48,238][1653645] Updated weights for policy 0, policy_version 474941 (0.0017) [2024-06-15 17:38:50,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 972685312. Throughput: 0: 10752.0. Samples: 243210240. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:38:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:38:54,871][1653645] Updated weights for policy 0, policy_version 475003 (0.0013) [2024-06-15 17:38:55,958][1648982] Fps is (10 sec: 45873.5, 60 sec: 44782.8, 300 sec: 43653.6). Total num frames: 972881920. Throughput: 0: 11047.7. Samples: 243291648. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:38:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:38:56,366][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000475056_972914688.pth... [2024-06-15 17:38:56,508][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000469872_962297856.pth [2024-06-15 17:38:57,143][1653645] Updated weights for policy 0, policy_version 475088 (0.0013) [2024-06-15 17:38:59,103][1653645] Updated weights for policy 0, policy_version 475171 (0.0012) [2024-06-15 17:39:00,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.3, 300 sec: 44431.1). Total num frames: 973209600. Throughput: 0: 10786.1. Samples: 243340288. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:00,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 17:39:05,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 44783.0, 300 sec: 43542.6). Total num frames: 973275136. Throughput: 0: 10990.9. Samples: 243387904. 
Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:39:06,100][1653645] Updated weights for policy 0, policy_version 475236 (0.0013) [2024-06-15 17:39:07,682][1653645] Updated weights for policy 0, policy_version 475318 (0.0012) [2024-06-15 17:39:09,674][1653645] Updated weights for policy 0, policy_version 475392 (0.0012) [2024-06-15 17:39:10,958][1648982] Fps is (10 sec: 49153.0, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 973701120. Throughput: 0: 11298.3. Samples: 243450368. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:39:11,000][1653645] Updated weights for policy 0, policy_version 475453 (0.0025) [2024-06-15 17:39:15,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 973733888. Throughput: 0: 11355.0. Samples: 243528192. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:15,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:39:17,498][1653645] Updated weights for policy 0, policy_version 475504 (0.0015) [2024-06-15 17:39:19,179][1653645] Updated weights for policy 0, policy_version 475569 (0.0022) [2024-06-15 17:39:20,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 45875.1, 300 sec: 43764.7). Total num frames: 974061568. Throughput: 0: 11571.1. Samples: 243561984. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:39:21,651][1651596] Signal inference workers to stop experience collection... (24700 times) [2024-06-15 17:39:21,719][1653645] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-06-15 17:39:21,846][1651596] Signal inference workers to resume experience collection... 
(24700 times) [2024-06-15 17:39:21,847][1653645] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-06-15 17:39:21,849][1653645] Updated weights for policy 0, policy_version 475664 (0.0013) [2024-06-15 17:39:22,693][1653645] Updated weights for policy 0, policy_version 475712 (0.0013) [2024-06-15 17:39:25,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43691.1, 300 sec: 44320.1). Total num frames: 974258176. Throughput: 0: 11446.0. Samples: 243623424. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:25,958][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 17:39:29,863][1653645] Updated weights for policy 0, policy_version 475792 (0.0030) [2024-06-15 17:39:30,697][1653645] Updated weights for policy 0, policy_version 475836 (0.0012) [2024-06-15 17:39:30,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 48059.8, 300 sec: 43764.7). Total num frames: 974520320. Throughput: 0: 11582.6. Samples: 243699712. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:30,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 17:39:32,469][1653645] Updated weights for policy 0, policy_version 475890 (0.0014) [2024-06-15 17:39:33,633][1653645] Updated weights for policy 0, policy_version 475947 (0.0013) [2024-06-15 17:39:35,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 974782464. Throughput: 0: 11468.8. Samples: 243726336. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:39:39,721][1653645] Updated weights for policy 0, policy_version 476004 (0.0123) [2024-06-15 17:39:40,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 46967.3, 300 sec: 44320.1). Total num frames: 974979072. Throughput: 0: 11594.0. Samples: 243813376. 
Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:40,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:39:41,270][1653645] Updated weights for policy 0, policy_version 476080 (0.0020) [2024-06-15 17:39:43,045][1653645] Updated weights for policy 0, policy_version 476144 (0.0012) [2024-06-15 17:39:44,469][1653645] Updated weights for policy 0, policy_version 476209 (0.0012) [2024-06-15 17:39:45,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 48059.7, 300 sec: 44431.2). Total num frames: 975306752. Throughput: 0: 11673.7. Samples: 243865600. Policy #0 lag: (min: 42.0, avg: 171.3, max: 314.0) [2024-06-15 17:39:45,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:39:50,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 975372288. Throughput: 0: 11776.0. Samples: 243917824. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:39:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:39:51,012][1653645] Updated weights for policy 0, policy_version 476257 (0.0013) [2024-06-15 17:39:52,731][1653645] Updated weights for policy 0, policy_version 476324 (0.0017) [2024-06-15 17:39:54,483][1653645] Updated weights for policy 0, policy_version 476410 (0.0022) [2024-06-15 17:39:55,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 48060.0, 300 sec: 44653.4). Total num frames: 975765504. Throughput: 0: 11628.1. Samples: 243973632. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:39:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:40:00,970][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 975831040. Throughput: 0: 11798.8. Samples: 244059136. 
Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:00,971][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:40:02,011][1653645] Updated weights for policy 0, policy_version 476512 (0.0016) [2024-06-15 17:40:03,916][1653645] Updated weights for policy 0, policy_version 476577 (0.0013) [2024-06-15 17:40:04,361][1651596] Signal inference workers to stop experience collection... (24750 times) [2024-06-15 17:40:04,416][1653645] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-06-15 17:40:04,492][1651596] Signal inference workers to resume experience collection... (24750 times) [2024-06-15 17:40:04,493][1653645] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-06-15 17:40:05,679][1653645] Updated weights for policy 0, policy_version 476656 (0.0012) [2024-06-15 17:40:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 49152.0, 300 sec: 44986.6). Total num frames: 976224256. Throughput: 0: 11741.9. Samples: 244090368. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:40:07,217][1653645] Updated weights for policy 0, policy_version 476728 (0.0087) [2024-06-15 17:40:10,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 976355328. Throughput: 0: 11798.8. Samples: 244154368. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:10,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 17:40:13,655][1653645] Updated weights for policy 0, policy_version 476794 (0.0021) [2024-06-15 17:40:15,427][1653645] Updated weights for policy 0, policy_version 476851 (0.0013) [2024-06-15 17:40:15,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 48606.1, 300 sec: 45097.7). Total num frames: 976650240. Throughput: 0: 11741.9. Samples: 244228096. 
Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:40:16,907][1653645] Updated weights for policy 0, policy_version 476928 (0.0014) [2024-06-15 17:40:18,391][1653645] Updated weights for policy 0, policy_version 476987 (0.0031) [2024-06-15 17:40:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 46967.6, 300 sec: 44542.3). Total num frames: 976879616. Throughput: 0: 11696.3. Samples: 244252672. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:40:25,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 45329.2, 300 sec: 44764.4). Total num frames: 976977920. Throughput: 0: 11537.2. Samples: 244332544. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:40:26,279][1653645] Updated weights for policy 0, policy_version 477059 (0.0126) [2024-06-15 17:40:28,252][1653645] Updated weights for policy 0, policy_version 477152 (0.0014) [2024-06-15 17:40:29,663][1653645] Updated weights for policy 0, policy_version 477202 (0.0012) [2024-06-15 17:40:30,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 48059.8, 300 sec: 45208.7). Total num frames: 977403904. Throughput: 0: 11468.8. Samples: 244381696. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:40:35,962][1648982] Fps is (10 sec: 42578.2, 60 sec: 43687.2, 300 sec: 44430.5). Total num frames: 977403904. Throughput: 0: 11251.4. Samples: 244424192. 
Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:35,963][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:40:38,086][1653645] Updated weights for policy 0, policy_version 477280 (0.0013) [2024-06-15 17:40:39,858][1653645] Updated weights for policy 0, policy_version 477360 (0.0013) [2024-06-15 17:40:40,958][1648982] Fps is (10 sec: 29491.0, 60 sec: 45329.3, 300 sec: 44875.5). Total num frames: 977698816. Throughput: 0: 11423.3. Samples: 244487680. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:40:41,633][1653645] Updated weights for policy 0, policy_version 477427 (0.0015) [2024-06-15 17:40:43,266][1653645] Updated weights for policy 0, policy_version 477494 (0.0011) [2024-06-15 17:40:45,958][1648982] Fps is (10 sec: 52452.0, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 977928192. Throughput: 0: 10808.8. Samples: 244545536. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:45,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:40:50,262][1651596] Signal inference workers to stop experience collection... (24800 times) [2024-06-15 17:40:50,296][1653645] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-06-15 17:40:50,576][1651596] Signal inference workers to resume experience collection... (24800 times) [2024-06-15 17:40:50,577][1653645] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-06-15 17:40:50,580][1653645] Updated weights for policy 0, policy_version 477536 (0.0017) [2024-06-15 17:40:50,958][1648982] Fps is (10 sec: 29491.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 977993728. Throughput: 0: 10911.3. Samples: 244581376. 
Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:40:52,403][1653645] Updated weights for policy 0, policy_version 477600 (0.0025) [2024-06-15 17:40:54,204][1653645] Updated weights for policy 0, policy_version 477664 (0.0012) [2024-06-15 17:40:55,968][1648982] Fps is (10 sec: 42562.4, 60 sec: 43138.3, 300 sec: 44318.8). Total num frames: 978354176. Throughput: 0: 10715.8. Samples: 244636672. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:40:55,976][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:40:56,427][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000477744_978419712.pth... [2024-06-15 17:40:56,540][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000472464_967606272.pth [2024-06-15 17:40:56,646][1653645] Updated weights for policy 0, policy_version 477751 (0.0106) [2024-06-15 17:41:00,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 978452480. Throughput: 0: 10399.3. Samples: 244696064. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:41:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:41:04,393][1653645] Updated weights for policy 0, policy_version 477808 (0.0130) [2024-06-15 17:41:05,958][1648982] Fps is (10 sec: 29516.2, 60 sec: 40413.7, 300 sec: 44098.0). Total num frames: 978649088. Throughput: 0: 10763.3. Samples: 244737024. Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:41:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:41:06,311][1653645] Updated weights for policy 0, policy_version 477876 (0.0010) [2024-06-15 17:41:08,157][1653645] Updated weights for policy 0, policy_version 477968 (0.0012) [2024-06-15 17:41:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 978976768. Throughput: 0: 10023.8. Samples: 244783616. 
Policy #0 lag: (min: 2.0, avg: 48.9, max: 258.0) [2024-06-15 17:41:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:41:15,308][1653645] Updated weights for policy 0, policy_version 478048 (0.0014) [2024-06-15 17:41:15,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 40413.8, 300 sec: 44209.0). Total num frames: 979075072. Throughput: 0: 10706.5. Samples: 244863488. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:15,962][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:41:16,705][1653645] Updated weights for policy 0, policy_version 478081 (0.0014) [2024-06-15 17:41:18,124][1653645] Updated weights for policy 0, policy_version 478134 (0.0012) [2024-06-15 17:41:19,714][1653645] Updated weights for policy 0, policy_version 478209 (0.0014) [2024-06-15 17:41:20,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 979468288. Throughput: 0: 10354.8. Samples: 244890112. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:41:25,960][1648982] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 44209.0). Total num frames: 979501056. Throughput: 0: 10410.7. Samples: 244956160. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:25,961][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 17:41:26,422][1653645] Updated weights for policy 0, policy_version 478273 (0.0024) [2024-06-15 17:41:27,868][1653645] Updated weights for policy 0, policy_version 478332 (0.0015) [2024-06-15 17:41:30,418][1651596] Signal inference workers to stop experience collection... (24850 times) [2024-06-15 17:41:30,443][1653645] Updated weights for policy 0, policy_version 478401 (0.0027) [2024-06-15 17:41:30,461][1653645] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-06-15 17:41:30,658][1651596] Signal inference workers to resume experience collection... 
(24850 times) [2024-06-15 17:41:30,659][1653645] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-06-15 17:41:30,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 40413.8, 300 sec: 44209.0). Total num frames: 979828736. Throughput: 0: 10592.8. Samples: 245022208. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:41:32,457][1653645] Updated weights for policy 0, policy_version 478496 (0.0013) [2024-06-15 17:41:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43694.0, 300 sec: 44431.2). Total num frames: 980025344. Throughput: 0: 10274.1. Samples: 245043712. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:41:39,420][1653645] Updated weights for policy 0, policy_version 478581 (0.0015) [2024-06-15 17:41:40,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 40959.9, 300 sec: 43875.8). Total num frames: 980156416. Throughput: 0: 10799.6. Samples: 245122560. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:40,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:41:41,723][1653645] Updated weights for policy 0, policy_version 478640 (0.0012) [2024-06-15 17:41:44,576][1653645] Updated weights for policy 0, policy_version 478739 (0.0013) [2024-06-15 17:41:45,959][1648982] Fps is (10 sec: 52423.2, 60 sec: 43690.0, 300 sec: 44431.0). Total num frames: 980549632. Throughput: 0: 10535.6. Samples: 245170176. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:45,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:41:50,216][1653645] Updated weights for policy 0, policy_version 478800 (0.0013) [2024-06-15 17:41:50,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44236.5, 300 sec: 44320.1). Total num frames: 980647936. Throughput: 0: 10706.5. Samples: 245218816. 
Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:41:53,592][1653645] Updated weights for policy 0, policy_version 478896 (0.0028) [2024-06-15 17:41:55,958][1648982] Fps is (10 sec: 39324.9, 60 sec: 43150.6, 300 sec: 43986.8). Total num frames: 980942848. Throughput: 0: 11138.8. Samples: 245284864. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:41:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:41:56,355][1653645] Updated weights for policy 0, policy_version 478992 (0.0012) [2024-06-15 17:42:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.4, 300 sec: 44209.0). Total num frames: 981073920. Throughput: 0: 10615.4. Samples: 245341184. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:00,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:42:03,056][1653645] Updated weights for policy 0, policy_version 479056 (0.0013) [2024-06-15 17:42:05,106][1653645] Updated weights for policy 0, policy_version 479110 (0.0013) [2024-06-15 17:42:05,958][1648982] Fps is (10 sec: 32768.7, 60 sec: 43690.8, 300 sec: 43653.6). Total num frames: 981270528. Throughput: 0: 10934.0. Samples: 245382144. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:42:06,551][1653645] Updated weights for policy 0, policy_version 479168 (0.0035) [2024-06-15 17:42:08,422][1653645] Updated weights for policy 0, policy_version 479238 (0.0012) [2024-06-15 17:42:09,474][1653645] Updated weights for policy 0, policy_version 479287 (0.0039) [2024-06-15 17:42:10,959][1648982] Fps is (10 sec: 52421.6, 60 sec: 43689.5, 300 sec: 44430.9). Total num frames: 981598208. Throughput: 0: 10660.6. Samples: 245435904. 
Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:10,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:42:14,623][1651596] Signal inference workers to stop experience collection... (24900 times) [2024-06-15 17:42:14,714][1653645] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-06-15 17:42:14,874][1651596] Signal inference workers to resume experience collection... (24900 times) [2024-06-15 17:42:14,874][1653645] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-06-15 17:42:15,284][1653645] Updated weights for policy 0, policy_version 479344 (0.0019) [2024-06-15 17:42:15,974][1648982] Fps is (10 sec: 45800.0, 60 sec: 44224.6, 300 sec: 43984.4). Total num frames: 981729280. Throughput: 0: 10998.3. Samples: 245517312. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:15,975][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:42:17,931][1653645] Updated weights for policy 0, policy_version 479410 (0.0022) [2024-06-15 17:42:19,074][1653645] Updated weights for policy 0, policy_version 479459 (0.0012) [2024-06-15 17:42:20,725][1653645] Updated weights for policy 0, policy_version 479537 (0.0015) [2024-06-15 17:42:20,958][1648982] Fps is (10 sec: 52438.1, 60 sec: 44236.9, 300 sec: 44432.1). Total num frames: 982122496. Throughput: 0: 11173.0. Samples: 245546496. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:42:25,958][1648982] Fps is (10 sec: 39385.2, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 982122496. Throughput: 0: 10956.8. Samples: 245615616. 
Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:25,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:42:26,403][1653645] Updated weights for policy 0, policy_version 479569 (0.0011) [2024-06-15 17:42:28,504][1653645] Updated weights for policy 0, policy_version 479648 (0.0119) [2024-06-15 17:42:30,719][1653645] Updated weights for policy 0, policy_version 479735 (0.0014) [2024-06-15 17:42:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 982515712. Throughput: 0: 11309.8. Samples: 245679104. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:42:32,385][1653645] Updated weights for policy 0, policy_version 479801 (0.0033) [2024-06-15 17:42:35,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 982646784. Throughput: 0: 10843.0. Samples: 245706752. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:35,961][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:42:38,882][1653645] Updated weights for policy 0, policy_version 479843 (0.0015) [2024-06-15 17:42:40,567][1653645] Updated weights for policy 0, policy_version 479889 (0.0021) [2024-06-15 17:42:40,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 982843392. Throughput: 0: 11104.8. Samples: 245784576. Policy #0 lag: (min: 47.0, avg: 103.6, max: 303.0) [2024-06-15 17:42:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:42:42,913][1653645] Updated weights for policy 0, policy_version 479985 (0.0107) [2024-06-15 17:42:44,538][1653645] Updated weights for policy 0, policy_version 480058 (0.0016) [2024-06-15 17:42:45,960][1648982] Fps is (10 sec: 52429.7, 60 sec: 43691.4, 300 sec: 44431.2). Total num frames: 983171072. Throughput: 0: 10991.0. Samples: 245835776. 
Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:42:45,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:42:50,266][1653645] Updated weights for policy 0, policy_version 480096 (0.0014) [2024-06-15 17:42:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 983269376. Throughput: 0: 11184.4. Samples: 245885440. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:42:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:42:52,141][1653645] Updated weights for policy 0, policy_version 480149 (0.0013) [2024-06-15 17:42:53,132][1653645] Updated weights for policy 0, policy_version 480193 (0.0018) [2024-06-15 17:42:53,965][1651596] Signal inference workers to stop experience collection... (24950 times) [2024-06-15 17:42:53,994][1653645] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-06-15 17:42:54,274][1651596] Signal inference workers to resume experience collection... (24950 times) [2024-06-15 17:42:54,276][1653645] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-06-15 17:42:55,340][1653645] Updated weights for policy 0, policy_version 480288 (0.0171) [2024-06-15 17:42:55,960][1648982] Fps is (10 sec: 49151.9, 60 sec: 45329.2, 300 sec: 44320.1). Total num frames: 983662592. Throughput: 0: 11378.2. Samples: 245947904. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:42:55,963][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:42:56,120][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000480320_983695360.pth... [2024-06-15 17:42:56,186][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000475056_972914688.pth [2024-06-15 17:43:00,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 983695360. Throughput: 0: 11222.6. Samples: 246022144. 
Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:43:01,712][1653645] Updated weights for policy 0, policy_version 480352 (0.0015) [2024-06-15 17:43:04,372][1653645] Updated weights for policy 0, policy_version 480416 (0.0015) [2024-06-15 17:43:05,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 45329.0, 300 sec: 44097.9). Total num frames: 983990272. Throughput: 0: 11389.1. Samples: 246059008. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:43:06,190][1653645] Updated weights for policy 0, policy_version 480485 (0.0013) [2024-06-15 17:43:07,866][1653645] Updated weights for policy 0, policy_version 480560 (0.0013) [2024-06-15 17:43:10,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43691.9, 300 sec: 44431.2). Total num frames: 984219648. Throughput: 0: 11093.4. Samples: 246114816. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:43:12,645][1653645] Updated weights for policy 0, policy_version 480612 (0.0012) [2024-06-15 17:43:15,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 44795.3, 300 sec: 44431.2). Total num frames: 984416256. Throughput: 0: 11491.6. Samples: 246196224. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:43:16,111][1653645] Updated weights for policy 0, policy_version 480677 (0.0013) [2024-06-15 17:43:17,558][1653645] Updated weights for policy 0, policy_version 480740 (0.0012) [2024-06-15 17:43:20,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 44431.3). Total num frames: 984743936. Throughput: 0: 11355.1. Samples: 246217728. 
Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:43:23,368][1653645] Updated weights for policy 0, policy_version 480834 (0.0193) [2024-06-15 17:43:25,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 984875008. Throughput: 0: 11320.9. Samples: 246294016. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:43:27,299][1653645] Updated weights for policy 0, policy_version 480912 (0.0011) [2024-06-15 17:43:29,119][1653645] Updated weights for policy 0, policy_version 480993 (0.0014) [2024-06-15 17:43:30,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 45328.8, 300 sec: 44653.3). Total num frames: 985235456. Throughput: 0: 11457.4. Samples: 246351360. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:30,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:43:31,133][1653645] Updated weights for policy 0, policy_version 481083 (0.0014) [2024-06-15 17:43:35,617][1653645] Updated weights for policy 0, policy_version 481136 (0.0103) [2024-06-15 17:43:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 985399296. Throughput: 0: 11241.2. Samples: 246391296. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:43:39,307][1651596] Signal inference workers to stop experience collection... (25000 times) [2024-06-15 17:43:39,379][1653645] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-06-15 17:43:39,557][1651596] Signal inference workers to resume experience collection... 
(25000 times) [2024-06-15 17:43:39,557][1653645] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-06-15 17:43:40,527][1653645] Updated weights for policy 0, policy_version 481204 (0.0089) [2024-06-15 17:43:40,958][1648982] Fps is (10 sec: 29492.3, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 985530368. Throughput: 0: 11423.3. Samples: 246461952. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:40,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:43:42,237][1653645] Updated weights for policy 0, policy_version 481280 (0.0013) [2024-06-15 17:43:43,570][1653645] Updated weights for policy 0, policy_version 481334 (0.0013) [2024-06-15 17:43:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 985792512. Throughput: 0: 11127.5. Samples: 246522880. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:43:47,592][1653645] Updated weights for policy 0, policy_version 481378 (0.0015) [2024-06-15 17:43:50,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44236.8, 300 sec: 44209.1). Total num frames: 985923584. Throughput: 0: 11047.9. Samples: 246556160. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:43:51,703][1653645] Updated weights for policy 0, policy_version 481444 (0.0012) [2024-06-15 17:43:53,829][1653645] Updated weights for policy 0, policy_version 481536 (0.0012) [2024-06-15 17:43:55,239][1653645] Updated weights for policy 0, policy_version 481592 (0.0014) [2024-06-15 17:43:55,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 986316800. Throughput: 0: 11184.3. Samples: 246618112. 
Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:43:55,959][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 17:44:00,873][1653645] Updated weights for policy 0, policy_version 481660 (0.0012) [2024-06-15 17:44:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.3, 300 sec: 44653.3). Total num frames: 986447872. Throughput: 0: 10922.7. Samples: 246687744. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:44:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:44:04,299][1653645] Updated weights for policy 0, policy_version 481728 (0.0012) [2024-06-15 17:44:05,816][1653645] Updated weights for policy 0, policy_version 481782 (0.0014) [2024-06-15 17:44:05,958][1648982] Fps is (10 sec: 36045.7, 60 sec: 44783.1, 300 sec: 43986.9). Total num frames: 986677248. Throughput: 0: 11241.3. Samples: 246723584. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:44:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:44:07,224][1653645] Updated weights for policy 0, policy_version 481851 (0.0014) [2024-06-15 17:44:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 986841088. Throughput: 0: 10899.9. Samples: 246784512. Policy #0 lag: (min: 63.0, avg: 151.2, max: 319.0) [2024-06-15 17:44:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:44:15,074][1653645] Updated weights for policy 0, policy_version 481921 (0.0013) [2024-06-15 17:44:15,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 987037696. Throughput: 0: 11195.8. Samples: 246855168. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:15,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:44:16,609][1653645] Updated weights for policy 0, policy_version 481988 (0.0128) [2024-06-15 17:44:18,199][1651596] Signal inference workers to stop experience collection... 
(25050 times) [2024-06-15 17:44:18,224][1653645] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-06-15 17:44:18,539][1651596] Signal inference workers to resume experience collection... (25050 times) [2024-06-15 17:44:18,540][1653645] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-06-15 17:44:19,601][1653645] Updated weights for policy 0, policy_version 482105 (0.0144) [2024-06-15 17:44:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 987365376. Throughput: 0: 10820.3. Samples: 246878208. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:44:25,429][1653645] Updated weights for policy 0, policy_version 482160 (0.0011) [2024-06-15 17:44:25,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 987496448. Throughput: 0: 10683.7. Samples: 246942720. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:44:28,039][1653645] Updated weights for policy 0, policy_version 482209 (0.0014) [2024-06-15 17:44:29,335][1653645] Updated weights for policy 0, policy_version 482259 (0.0011) [2024-06-15 17:44:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 42052.5, 300 sec: 43986.9). Total num frames: 987758592. Throughput: 0: 10547.2. Samples: 246997504. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:44:32,233][1653645] Updated weights for policy 0, policy_version 482361 (0.0113) [2024-06-15 17:44:35,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 43764.8). Total num frames: 987889664. Throughput: 0: 10285.5. Samples: 247019008. 
Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:44:38,261][1653645] Updated weights for policy 0, policy_version 482405 (0.0011) [2024-06-15 17:44:40,575][1653645] Updated weights for policy 0, policy_version 482436 (0.0031) [2024-06-15 17:44:40,962][1648982] Fps is (10 sec: 29479.3, 60 sec: 42049.4, 300 sec: 43208.7). Total num frames: 988053504. Throughput: 0: 10557.7. Samples: 247093248. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:40,963][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:44:42,699][1653645] Updated weights for policy 0, policy_version 482532 (0.0013) [2024-06-15 17:44:44,523][1653645] Updated weights for policy 0, policy_version 482608 (0.0011) [2024-06-15 17:44:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 988413952. Throughput: 0: 10296.9. Samples: 247151104. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:44:49,598][1653645] Updated weights for policy 0, policy_version 482656 (0.0011) [2024-06-15 17:44:50,958][1648982] Fps is (10 sec: 49171.2, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 988545024. Throughput: 0: 10353.7. Samples: 247189504. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:44:53,056][1653645] Updated weights for policy 0, policy_version 482720 (0.0013) [2024-06-15 17:44:55,298][1653645] Updated weights for policy 0, policy_version 482800 (0.0024) [2024-06-15 17:44:55,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 41506.0, 300 sec: 43986.8). Total num frames: 988807168. Throughput: 0: 10387.8. Samples: 247251968. 
Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:44:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:44:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000482816_988807168.pth... [2024-06-15 17:44:56,169][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000477744_978419712.pth [2024-06-15 17:44:57,067][1653645] Updated weights for policy 0, policy_version 482873 (0.0078) [2024-06-15 17:45:00,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 43209.3). Total num frames: 988971008. Throughput: 0: 10285.5. Samples: 247318016. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:45:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:45:01,370][1653645] Updated weights for policy 0, policy_version 482913 (0.0013) [2024-06-15 17:45:04,497][1653645] Updated weights for policy 0, policy_version 482947 (0.0012) [2024-06-15 17:45:05,955][1651596] Signal inference workers to stop experience collection... (25100 times) [2024-06-15 17:45:05,958][1648982] Fps is (10 sec: 36046.3, 60 sec: 41506.1, 300 sec: 43431.5). Total num frames: 989167616. Throughput: 0: 10615.5. Samples: 247355904. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:45:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:45:06,041][1653645] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-06-15 17:45:06,043][1653645] Updated weights for policy 0, policy_version 483007 (0.0011) [2024-06-15 17:45:06,049][1651596] Signal inference workers to resume experience collection... 
(25100 times) [2024-06-15 17:45:06,060][1653645] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-06-15 17:45:07,120][1653645] Updated weights for policy 0, policy_version 483057 (0.0012) [2024-06-15 17:45:08,473][1653645] Updated weights for policy 0, policy_version 483120 (0.0013) [2024-06-15 17:45:10,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 989462528. Throughput: 0: 10649.6. Samples: 247421952. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:45:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:45:11,927][1653645] Updated weights for policy 0, policy_version 483156 (0.0012) [2024-06-15 17:45:15,558][1653645] Updated weights for policy 0, policy_version 483201 (0.0010) [2024-06-15 17:45:15,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 989626368. Throughput: 0: 11184.4. Samples: 247500800. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:45:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:45:16,857][1653645] Updated weights for policy 0, policy_version 483261 (0.0013) [2024-06-15 17:45:18,242][1653645] Updated weights for policy 0, policy_version 483313 (0.0015) [2024-06-15 17:45:19,539][1653645] Updated weights for policy 0, policy_version 483386 (0.0013) [2024-06-15 17:45:20,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 989986816. Throughput: 0: 11355.0. Samples: 247529984. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:45:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:45:22,968][1653645] Updated weights for policy 0, policy_version 483411 (0.0014) [2024-06-15 17:45:25,958][1648982] Fps is (10 sec: 49150.5, 60 sec: 43690.4, 300 sec: 43098.2). Total num frames: 990117888. Throughput: 0: 11321.8. Samples: 247602688. 
Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:45:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:45:26,969][1653645] Updated weights for policy 0, policy_version 483459 (0.0016) [2024-06-15 17:45:28,551][1653645] Updated weights for policy 0, policy_version 483522 (0.0015) [2024-06-15 17:45:29,704][1653645] Updated weights for policy 0, policy_version 483584 (0.0134) [2024-06-15 17:45:30,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 45329.1, 300 sec: 44320.8). Total num frames: 990478336. Throughput: 0: 11559.9. Samples: 247671296. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:45:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:45:31,117][1653645] Updated weights for policy 0, policy_version 483647 (0.0012) [2024-06-15 17:45:34,460][1653645] Updated weights for policy 0, policy_version 483696 (0.0012) [2024-06-15 17:45:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.0, 300 sec: 43875.8). Total num frames: 990642176. Throughput: 0: 11525.6. Samples: 247708160. Policy #0 lag: (min: 15.0, avg: 99.4, max: 271.0) [2024-06-15 17:45:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:45:39,333][1653645] Updated weights for policy 0, policy_version 483752 (0.0012) [2024-06-15 17:45:40,342][1653645] Updated weights for policy 0, policy_version 483794 (0.0014) [2024-06-15 17:45:40,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 46970.6, 300 sec: 43875.8). Total num frames: 990871552. Throughput: 0: 11730.6. Samples: 247779840. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:45:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:45:42,130][1653645] Updated weights for policy 0, policy_version 483873 (0.0107) [2024-06-15 17:45:45,142][1651596] Signal inference workers to stop experience collection... 
(25150 times) [2024-06-15 17:45:45,212][1653645] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-06-15 17:45:45,224][1653645] Updated weights for policy 0, policy_version 483939 (0.0013) [2024-06-15 17:45:45,459][1651596] Signal inference workers to resume experience collection... (25150 times) [2024-06-15 17:45:45,460][1653645] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-06-15 17:45:45,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 991166464. Throughput: 0: 11605.3. Samples: 247840256. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:45:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:45:49,544][1653645] Updated weights for policy 0, policy_version 483986 (0.0013) [2024-06-15 17:45:51,004][1648982] Fps is (10 sec: 42403.9, 60 sec: 45840.2, 300 sec: 43870.3). Total num frames: 991297536. Throughput: 0: 11775.4. Samples: 247886336. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:45:51,004][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:45:51,466][1653645] Updated weights for policy 0, policy_version 484034 (0.0012) [2024-06-15 17:45:52,636][1653645] Updated weights for policy 0, policy_version 484086 (0.0011) [2024-06-15 17:45:54,294][1653645] Updated weights for policy 0, policy_version 484152 (0.0012) [2024-06-15 17:45:55,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 46421.6, 300 sec: 44542.3). Total num frames: 991592448. Throughput: 0: 11650.8. Samples: 247946240. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:45:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:45:56,140][1653645] Updated weights for policy 0, policy_version 484181 (0.0011) [2024-06-15 17:46:00,958][1648982] Fps is (10 sec: 39502.0, 60 sec: 45329.0, 300 sec: 44209.0). Total num frames: 991690752. Throughput: 0: 11525.6. Samples: 248019456. 
Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:46:01,600][1653645] Updated weights for policy 0, policy_version 484263 (0.0012) [2024-06-15 17:46:03,712][1653645] Updated weights for policy 0, policy_version 484309 (0.0014) [2024-06-15 17:46:04,931][1653645] Updated weights for policy 0, policy_version 484355 (0.0013) [2024-06-15 17:46:05,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 47513.5, 300 sec: 44209.0). Total num frames: 992018432. Throughput: 0: 11685.0. Samples: 248055808. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:46:07,027][1653645] Updated weights for policy 0, policy_version 484418 (0.0011) [2024-06-15 17:46:08,517][1653645] Updated weights for policy 0, policy_version 484477 (0.0171) [2024-06-15 17:46:10,960][1648982] Fps is (10 sec: 52418.2, 60 sec: 45873.5, 300 sec: 44541.9). Total num frames: 992215040. Throughput: 0: 11252.1. Samples: 248109056. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:10,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:46:14,737][1653645] Updated weights for policy 0, policy_version 484536 (0.0025) [2024-06-15 17:46:15,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 45875.2, 300 sec: 43764.7). Total num frames: 992378880. Throughput: 0: 11389.1. Samples: 248183808. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:46:17,095][1653645] Updated weights for policy 0, policy_version 484608 (0.0013) [2024-06-15 17:46:18,420][1653645] Updated weights for policy 0, policy_version 484670 (0.0014) [2024-06-15 17:46:20,755][1653645] Updated weights for policy 0, policy_version 484725 (0.0013) [2024-06-15 17:46:20,958][1648982] Fps is (10 sec: 52440.8, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 992739328. Throughput: 0: 11104.8. 
Samples: 248207872. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:46:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44237.0, 300 sec: 43875.8). Total num frames: 992772096. Throughput: 0: 11093.3. Samples: 248279040. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:46:26,560][1653645] Updated weights for policy 0, policy_version 484784 (0.0012) [2024-06-15 17:46:28,658][1653645] Updated weights for policy 0, policy_version 484848 (0.0013) [2024-06-15 17:46:30,294][1653645] Updated weights for policy 0, policy_version 484921 (0.0015) [2024-06-15 17:46:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 993132544. Throughput: 0: 11002.4. Samples: 248335360. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:46:31,773][1651596] Signal inference workers to stop experience collection... (25200 times) [2024-06-15 17:46:31,796][1653645] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-06-15 17:46:32,056][1651596] Signal inference workers to resume experience collection... (25200 times) [2024-06-15 17:46:32,057][1653645] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-06-15 17:46:32,904][1653645] Updated weights for policy 0, policy_version 484981 (0.0012) [2024-06-15 17:46:35,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 993263616. Throughput: 0: 10797.1. Samples: 248371712. 
Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:46:38,948][1653645] Updated weights for policy 0, policy_version 485056 (0.0012) [2024-06-15 17:46:40,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 43690.5, 300 sec: 43875.9). Total num frames: 993492992. Throughput: 0: 10934.0. Samples: 248438272. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:46:41,221][1653645] Updated weights for policy 0, policy_version 485124 (0.0012) [2024-06-15 17:46:44,023][1653645] Updated weights for policy 0, policy_version 485187 (0.0013) [2024-06-15 17:46:45,382][1653645] Updated weights for policy 0, policy_version 485248 (0.0016) [2024-06-15 17:46:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 993787904. Throughput: 0: 10592.7. Samples: 248496128. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:46:50,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 43177.5, 300 sec: 43875.8). Total num frames: 993886208. Throughput: 0: 10638.2. Samples: 248534528. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:46:51,040][1653645] Updated weights for policy 0, policy_version 485311 (0.0103) [2024-06-15 17:46:52,699][1653645] Updated weights for policy 0, policy_version 485360 (0.0012) [2024-06-15 17:46:54,108][1653645] Updated weights for policy 0, policy_version 485431 (0.0012) [2024-06-15 17:46:55,824][1653645] Updated weights for policy 0, policy_version 485472 (0.0012) [2024-06-15 17:46:55,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 44236.5, 300 sec: 44653.3). Total num frames: 994246656. Throughput: 0: 10889.0. Samples: 248599040. 
Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:46:55,959][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 17:46:56,189][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000485488_994279424.pth... [2024-06-15 17:46:56,229][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000480320_983695360.pth [2024-06-15 17:47:00,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 994312192. Throughput: 0: 10808.8. Samples: 248670208. Policy #0 lag: (min: 9.0, avg: 79.4, max: 265.0) [2024-06-15 17:47:00,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:47:02,192][1653645] Updated weights for policy 0, policy_version 485520 (0.0035) [2024-06-15 17:47:03,463][1653645] Updated weights for policy 0, policy_version 485569 (0.0029) [2024-06-15 17:47:04,788][1653645] Updated weights for policy 0, policy_version 485630 (0.0027) [2024-06-15 17:47:05,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 43690.7, 300 sec: 44209.3). Total num frames: 994639872. Throughput: 0: 11002.3. Samples: 248702976. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:05,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 17:47:07,289][1653645] Updated weights for policy 0, policy_version 485712 (0.0138) [2024-06-15 17:47:08,337][1653645] Updated weights for policy 0, policy_version 485758 (0.0012) [2024-06-15 17:47:10,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43692.3, 300 sec: 44433.7). Total num frames: 994836480. Throughput: 0: 10740.6. Samples: 248762368. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:47:15,228][1653645] Updated weights for policy 0, policy_version 485808 (0.0011) [2024-06-15 17:47:15,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.8, 300 sec: 43653.7). Total num frames: 995000320. Throughput: 0: 11104.7. 
Samples: 248835072. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:47:16,684][1653645] Updated weights for policy 0, policy_version 485888 (0.0075) [2024-06-15 17:47:16,780][1651596] Signal inference workers to stop experience collection... (25250 times) [2024-06-15 17:47:16,850][1653645] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-06-15 17:47:17,026][1651596] Signal inference workers to resume experience collection... (25250 times) [2024-06-15 17:47:17,026][1653645] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-06-15 17:47:18,178][1653645] Updated weights for policy 0, policy_version 485952 (0.0012) [2024-06-15 17:47:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 995360768. Throughput: 0: 10899.9. Samples: 248862208. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:47:25,958][1648982] Fps is (10 sec: 36042.7, 60 sec: 43144.2, 300 sec: 43542.5). Total num frames: 995360768. Throughput: 0: 11081.9. Samples: 248936960. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:25,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:47:26,572][1653645] Updated weights for policy 0, policy_version 486037 (0.0012) [2024-06-15 17:47:27,774][1653645] Updated weights for policy 0, policy_version 486101 (0.0012) [2024-06-15 17:47:29,409][1653645] Updated weights for policy 0, policy_version 486176 (0.0011) [2024-06-15 17:47:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 995753984. Throughput: 0: 11184.4. Samples: 248999424. 
Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:47:31,506][1653645] Updated weights for policy 0, policy_version 486240 (0.0014) [2024-06-15 17:47:35,958][1648982] Fps is (10 sec: 52431.3, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 995885056. Throughput: 0: 11059.2. Samples: 249032192. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:47:38,153][1653645] Updated weights for policy 0, policy_version 486293 (0.0026) [2024-06-15 17:47:39,844][1653645] Updated weights for policy 0, policy_version 486368 (0.0012) [2024-06-15 17:47:40,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44783.0, 300 sec: 44097.9). Total num frames: 996179968. Throughput: 0: 11150.3. Samples: 249100800. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:40,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:47:41,098][1653645] Updated weights for policy 0, policy_version 486420 (0.0011) [2024-06-15 17:47:42,135][1653645] Updated weights for policy 0, policy_version 486466 (0.0024) [2024-06-15 17:47:43,481][1653645] Updated weights for policy 0, policy_version 486527 (0.0146) [2024-06-15 17:47:45,959][1648982] Fps is (10 sec: 52422.5, 60 sec: 43689.9, 300 sec: 44542.1). Total num frames: 996409344. Throughput: 0: 11127.2. Samples: 249170944. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:45,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 17:47:50,127][1653645] Updated weights for policy 0, policy_version 486582 (0.0017) [2024-06-15 17:47:50,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 996573184. Throughput: 0: 11241.2. Samples: 249208832. 
Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:50,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 17:47:51,575][1653645] Updated weights for policy 0, policy_version 486651 (0.0012) [2024-06-15 17:47:53,189][1653645] Updated weights for policy 0, policy_version 486707 (0.0012) [2024-06-15 17:47:54,755][1653645] Updated weights for policy 0, policy_version 486777 (0.0015) [2024-06-15 17:47:55,958][1648982] Fps is (10 sec: 52435.0, 60 sec: 44783.2, 300 sec: 44875.5). Total num frames: 996933632. Throughput: 0: 11173.0. Samples: 249265152. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:47:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:48:00,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 996933632. Throughput: 0: 11218.4. Samples: 249339904. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:48:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:48:01,797][1651596] Signal inference workers to stop experience collection... (25300 times) [2024-06-15 17:48:01,846][1653645] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-06-15 17:48:02,065][1651596] Signal inference workers to resume experience collection... (25300 times) [2024-06-15 17:48:02,070][1653645] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-06-15 17:48:03,415][1653645] Updated weights for policy 0, policy_version 486864 (0.0014) [2024-06-15 17:48:04,686][1653645] Updated weights for policy 0, policy_version 486913 (0.0026) [2024-06-15 17:48:05,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 997294080. Throughput: 0: 11161.6. Samples: 249364480. 
Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:48:05,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 17:48:06,464][1653645] Updated weights for policy 0, policy_version 486992 (0.0012) [2024-06-15 17:48:07,461][1653645] Updated weights for policy 0, policy_version 487040 (0.0012) [2024-06-15 17:48:10,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 997457920. Throughput: 0: 10854.5. Samples: 249425408. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:48:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:48:15,284][1653645] Updated weights for policy 0, policy_version 487104 (0.0012) [2024-06-15 17:48:15,960][1648982] Fps is (10 sec: 32760.1, 60 sec: 43688.8, 300 sec: 43653.3). Total num frames: 997621760. Throughput: 0: 11024.5. Samples: 249495552. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:48:15,962][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:48:16,704][1653645] Updated weights for policy 0, policy_version 487164 (0.0012) [2024-06-15 17:48:18,197][1653645] Updated weights for policy 0, policy_version 487217 (0.0013) [2024-06-15 17:48:19,792][1653645] Updated weights for policy 0, policy_version 487280 (0.0018) [2024-06-15 17:48:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 997982208. Throughput: 0: 10979.5. Samples: 249526272. Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:48:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:48:25,958][1648982] Fps is (10 sec: 39331.0, 60 sec: 44237.1, 300 sec: 43320.5). Total num frames: 998014976. Throughput: 0: 10945.5. Samples: 249593344. 
Policy #0 lag: (min: 71.0, avg: 160.8, max: 319.0) [2024-06-15 17:48:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:48:26,499][1653645] Updated weights for policy 0, policy_version 487344 (0.0012) [2024-06-15 17:48:27,676][1653645] Updated weights for policy 0, policy_version 487398 (0.0013) [2024-06-15 17:48:30,231][1653645] Updated weights for policy 0, policy_version 487458 (0.0012) [2024-06-15 17:48:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 998375424. Throughput: 0: 10786.4. Samples: 249656320. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:48:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:48:31,714][1653645] Updated weights for policy 0, policy_version 487524 (0.0015) [2024-06-15 17:48:35,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 998506496. Throughput: 0: 10638.2. Samples: 249687552. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:48:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:48:37,789][1653645] Updated weights for policy 0, policy_version 487572 (0.0015) [2024-06-15 17:48:39,311][1653645] Updated weights for policy 0, policy_version 487632 (0.0090) [2024-06-15 17:48:40,362][1653645] Updated weights for policy 0, policy_version 487679 (0.0013) [2024-06-15 17:48:40,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 998801408. Throughput: 0: 10956.8. Samples: 249758208. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:48:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:48:41,375][1651596] Signal inference workers to stop experience collection... (25350 times) [2024-06-15 17:48:41,411][1653645] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-06-15 17:48:41,560][1651596] Signal inference workers to resume experience collection... 
(25350 times) [2024-06-15 17:48:41,561][1653645] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-06-15 17:48:41,729][1653645] Updated weights for policy 0, policy_version 487731 (0.0012) [2024-06-15 17:48:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43691.5, 300 sec: 44431.2). Total num frames: 999030784. Throughput: 0: 10615.5. Samples: 249817600. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:48:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:48:49,817][1653645] Updated weights for policy 0, policy_version 487810 (0.0090) [2024-06-15 17:48:50,958][1648982] Fps is (10 sec: 32768.5, 60 sec: 42598.4, 300 sec: 43431.5). Total num frames: 999129088. Throughput: 0: 10934.1. Samples: 249856512. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:48:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:48:52,096][1653645] Updated weights for policy 0, policy_version 487908 (0.0013) [2024-06-15 17:48:53,561][1653645] Updated weights for policy 0, policy_version 487956 (0.0016) [2024-06-15 17:48:54,879][1653645] Updated weights for policy 0, policy_version 488016 (0.0132) [2024-06-15 17:48:55,920][1653645] Updated weights for policy 0, policy_version 488060 (0.0011) [2024-06-15 17:48:55,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 999522304. Throughput: 0: 10854.4. Samples: 249913856. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:48:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:48:56,026][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000488064_999555072.pth... [2024-06-15 17:48:56,069][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000482816_988807168.pth [2024-06-15 17:49:00,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 999555072. Throughput: 0: 11002.9. Samples: 249990656. 
Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:00,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:49:02,483][1653645] Updated weights for policy 0, policy_version 488115 (0.0012) [2024-06-15 17:49:05,069][1653645] Updated weights for policy 0, policy_version 488224 (0.0105) [2024-06-15 17:49:05,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 999948288. Throughput: 0: 11059.2. Samples: 250023936. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:49:06,808][1653645] Updated weights for policy 0, policy_version 488259 (0.0013) [2024-06-15 17:49:10,960][1648982] Fps is (10 sec: 52418.9, 60 sec: 43689.2, 300 sec: 44208.7). Total num frames: 1000079360. Throughput: 0: 10740.1. Samples: 250076672. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:10,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:49:13,230][1653645] Updated weights for policy 0, policy_version 488336 (0.0057) [2024-06-15 17:49:14,672][1653645] Updated weights for policy 0, policy_version 488400 (0.0012) [2024-06-15 17:49:15,768][1653645] Updated weights for policy 0, policy_version 488447 (0.0012) [2024-06-15 17:49:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45330.9, 300 sec: 43986.9). Total num frames: 1000341504. Throughput: 0: 10922.7. Samples: 250147840. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:49:17,870][1653645] Updated weights for policy 0, policy_version 488512 (0.0013) [2024-06-15 17:49:20,958][1648982] Fps is (10 sec: 52439.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1000603648. Throughput: 0: 10934.0. Samples: 250179584. 
Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:49:24,764][1653645] Updated weights for policy 0, policy_version 488577 (0.0013) [2024-06-15 17:49:25,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 44782.8, 300 sec: 43875.8). Total num frames: 1000701952. Throughput: 0: 11036.4. Samples: 250254848. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:25,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:49:26,311][1651596] Signal inference workers to stop experience collection... (25400 times) [2024-06-15 17:49:26,394][1653645] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-06-15 17:49:26,615][1651596] Signal inference workers to resume experience collection... (25400 times) [2024-06-15 17:49:26,617][1653645] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-06-15 17:49:26,770][1653645] Updated weights for policy 0, policy_version 488657 (0.0088) [2024-06-15 17:49:29,203][1653645] Updated weights for policy 0, policy_version 488706 (0.0018) [2024-06-15 17:49:30,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1000996864. Throughput: 0: 10956.8. Samples: 250310656. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:49:31,401][1653645] Updated weights for policy 0, policy_version 488800 (0.0027) [2024-06-15 17:49:35,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 44320.7). Total num frames: 1001127936. Throughput: 0: 10740.6. Samples: 250339840. 
Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:35,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 17:49:37,795][1653645] Updated weights for policy 0, policy_version 488867 (0.0019) [2024-06-15 17:49:39,477][1653645] Updated weights for policy 0, policy_version 488949 (0.0024) [2024-06-15 17:49:40,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 1001390080. Throughput: 0: 11002.3. Samples: 250408960. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:49:42,088][1653645] Updated weights for policy 0, policy_version 488995 (0.0015) [2024-06-15 17:49:43,621][1653645] Updated weights for policy 0, policy_version 489072 (0.0014) [2024-06-15 17:49:45,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1001652224. Throughput: 0: 10786.2. Samples: 250476032. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:49:49,410][1653645] Updated weights for policy 0, policy_version 489136 (0.0070) [2024-06-15 17:49:50,902][1653645] Updated weights for policy 0, policy_version 489212 (0.0012) [2024-06-15 17:49:50,958][1648982] Fps is (10 sec: 49150.4, 60 sec: 45874.9, 300 sec: 44320.1). Total num frames: 1001881600. Throughput: 0: 10945.3. Samples: 250516480. Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:49:53,905][1653645] Updated weights for policy 0, policy_version 489273 (0.0014) [2024-06-15 17:49:55,073][1653645] Updated weights for policy 0, policy_version 489328 (0.0014) [2024-06-15 17:49:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1002176512. Throughput: 0: 11139.3. Samples: 250577920. 
Policy #0 lag: (min: 63.0, avg: 184.0, max: 319.0) [2024-06-15 17:49:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:50:00,450][1653645] Updated weights for policy 0, policy_version 489348 (0.0013) [2024-06-15 17:50:00,958][1648982] Fps is (10 sec: 32769.1, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 1002209280. Throughput: 0: 11207.1. Samples: 250652160. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:50:02,196][1653645] Updated weights for policy 0, policy_version 489424 (0.0112) [2024-06-15 17:50:04,840][1653645] Updated weights for policy 0, policy_version 489488 (0.0013) [2024-06-15 17:50:05,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 1002536960. Throughput: 0: 11093.4. Samples: 250678784. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:50:06,635][1651596] Signal inference workers to stop experience collection... (25450 times) [2024-06-15 17:50:06,684][1653645] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-06-15 17:50:06,685][1653645] Updated weights for policy 0, policy_version 489553 (0.0011) [2024-06-15 17:50:06,882][1651596] Signal inference workers to resume experience collection... (25450 times) [2024-06-15 17:50:06,888][1653645] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-06-15 17:50:10,958][1648982] Fps is (10 sec: 49150.7, 60 sec: 43692.0, 300 sec: 44320.1). Total num frames: 1002700800. Throughput: 0: 10888.5. Samples: 250744832. 
Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:10,961][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:50:12,906][1653645] Updated weights for policy 0, policy_version 489632 (0.0026) [2024-06-15 17:50:14,752][1653645] Updated weights for policy 0, policy_version 489727 (0.0103) [2024-06-15 17:50:15,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1002962944. Throughput: 0: 11093.3. Samples: 250809856. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:50:18,326][1653645] Updated weights for policy 0, policy_version 489792 (0.0014) [2024-06-15 17:50:19,850][1653645] Updated weights for policy 0, policy_version 489851 (0.0013) [2024-06-15 17:50:20,970][1648982] Fps is (10 sec: 52363.9, 60 sec: 43681.5, 300 sec: 44429.3). Total num frames: 1003225088. Throughput: 0: 11056.1. Samples: 250837504. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:20,971][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 17:50:25,550][1653645] Updated weights for policy 0, policy_version 489904 (0.0017) [2024-06-15 17:50:25,960][1648982] Fps is (10 sec: 39313.7, 60 sec: 44235.4, 300 sec: 43653.3). Total num frames: 1003356160. Throughput: 0: 11138.3. Samples: 250910208. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:25,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:50:26,868][1653645] Updated weights for policy 0, policy_version 489972 (0.0045) [2024-06-15 17:50:30,663][1653645] Updated weights for policy 0, policy_version 490048 (0.0015) [2024-06-15 17:50:30,958][1648982] Fps is (10 sec: 39371.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1003618304. Throughput: 0: 11036.4. Samples: 250972672. 
Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:50:31,940][1653645] Updated weights for policy 0, policy_version 490105 (0.0033) [2024-06-15 17:50:35,958][1648982] Fps is (10 sec: 39329.8, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 1003749376. Throughput: 0: 10843.1. Samples: 251004416. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:50:36,795][1653645] Updated weights for policy 0, policy_version 490146 (0.0013) [2024-06-15 17:50:38,464][1653645] Updated weights for policy 0, policy_version 490233 (0.0105) [2024-06-15 17:50:40,963][1648982] Fps is (10 sec: 39303.0, 60 sec: 43687.2, 300 sec: 43541.9). Total num frames: 1004011520. Throughput: 0: 10967.0. Samples: 251071488. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:40,965][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:50:42,624][1653645] Updated weights for policy 0, policy_version 490291 (0.0013) [2024-06-15 17:50:43,936][1653645] Updated weights for policy 0, policy_version 490352 (0.0011) [2024-06-15 17:50:45,959][1648982] Fps is (10 sec: 52420.6, 60 sec: 43689.5, 300 sec: 43993.5). Total num frames: 1004273664. Throughput: 0: 10990.5. Samples: 251146752. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:45,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:50:48,333][1653645] Updated weights for policy 0, policy_version 490432 (0.0014) [2024-06-15 17:50:49,192][1651596] Signal inference workers to stop experience collection... (25500 times) [2024-06-15 17:50:49,234][1653645] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-06-15 17:50:49,418][1651596] Signal inference workers to resume experience collection... 
(25500 times) [2024-06-15 17:50:49,419][1653645] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-06-15 17:50:49,647][1653645] Updated weights for policy 0, policy_version 490492 (0.0013) [2024-06-15 17:50:50,973][1648982] Fps is (10 sec: 52376.3, 60 sec: 44226.1, 300 sec: 43873.6). Total num frames: 1004535808. Throughput: 0: 11055.6. Samples: 251176448. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:50,976][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:50:53,937][1653645] Updated weights for policy 0, policy_version 490548 (0.0085) [2024-06-15 17:50:55,643][1653645] Updated weights for policy 0, policy_version 490624 (0.0013) [2024-06-15 17:50:55,958][1648982] Fps is (10 sec: 52436.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1004797952. Throughput: 0: 11082.0. Samples: 251243520. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:50:55,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:50:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000490624_1004797952.pth... [2024-06-15 17:50:56,022][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000485488_994279424.pth [2024-06-15 17:50:56,028][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000490624_1004797952.pth [2024-06-15 17:51:00,129][1653645] Updated weights for policy 0, policy_version 490687 (0.0016) [2024-06-15 17:51:00,958][1648982] Fps is (10 sec: 39379.0, 60 sec: 45328.9, 300 sec: 43764.7). Total num frames: 1004929024. Throughput: 0: 11047.8. Samples: 251307008. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:51:00,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:51:02,241][1653645] Updated weights for policy 0, policy_version 490736 (0.0012) [2024-06-15 17:51:05,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 43690.6, 300 sec: 43876.1). 
Total num frames: 1005158400. Throughput: 0: 11142.0. Samples: 251338752. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:51:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:51:06,257][1653645] Updated weights for policy 0, policy_version 490817 (0.0012) [2024-06-15 17:51:10,906][1653645] Updated weights for policy 0, policy_version 490882 (0.0011) [2024-06-15 17:51:10,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 1005322240. Throughput: 0: 11037.0. Samples: 251406848. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:51:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:51:12,011][1653645] Updated weights for policy 0, policy_version 490933 (0.0018) [2024-06-15 17:51:13,757][1653645] Updated weights for policy 0, policy_version 490982 (0.0103) [2024-06-15 17:51:15,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1005584384. Throughput: 0: 11116.1. Samples: 251472896. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:51:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:51:16,678][1653645] Updated weights for policy 0, policy_version 491043 (0.0013) [2024-06-15 17:51:18,420][1653645] Updated weights for policy 0, policy_version 491104 (0.0009) [2024-06-15 17:51:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43699.9, 300 sec: 44320.1). Total num frames: 1005846528. Throughput: 0: 11241.3. Samples: 251510272. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:51:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:51:22,664][1653645] Updated weights for policy 0, policy_version 491168 (0.0013) [2024-06-15 17:51:25,672][1653645] Updated weights for policy 0, policy_version 491232 (0.0088) [2024-06-15 17:51:25,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 45330.5, 300 sec: 43875.8). Total num frames: 1006075904. Throughput: 0: 11265.1. 
Samples: 251578368. Policy #0 lag: (min: 0.0, avg: 68.8, max: 256.0) [2024-06-15 17:51:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 17:51:28,296][1653645] Updated weights for policy 0, policy_version 491317 (0.0015) [2024-06-15 17:51:30,037][1653645] Updated weights for policy 0, policy_version 491348 (0.0023) [2024-06-15 17:51:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1006370816. Throughput: 0: 11116.5. Samples: 251646976. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:51:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 17:51:34,153][1653645] Updated weights for policy 0, policy_version 491440 (0.0013) [2024-06-15 17:51:35,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 45875.2, 300 sec: 44098.0). Total num frames: 1006501888. Throughput: 0: 11301.8. Samples: 251684864. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:51:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:51:36,758][1653645] Updated weights for policy 0, policy_version 491488 (0.0013) [2024-06-15 17:51:36,868][1651596] Signal inference workers to stop experience collection... (25550 times) [2024-06-15 17:51:36,952][1653645] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-06-15 17:51:37,061][1651596] Signal inference workers to resume experience collection... (25550 times) [2024-06-15 17:51:37,061][1653645] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-06-15 17:51:37,355][1653645] Updated weights for policy 0, policy_version 491520 (0.0015) [2024-06-15 17:51:40,620][1653645] Updated weights for policy 0, policy_version 491585 (0.0013) [2024-06-15 17:51:40,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 46425.0, 300 sec: 44098.0). Total num frames: 1006796800. Throughput: 0: 11207.1. Samples: 251747840. 
Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:51:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:51:41,837][1653645] Updated weights for policy 0, policy_version 491648 (0.0011) [2024-06-15 17:51:45,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45330.2, 300 sec: 44431.2). Total num frames: 1006993408. Throughput: 0: 11480.2. Samples: 251823616. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:51:45,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:51:45,960][1653645] Updated weights for policy 0, policy_version 491710 (0.0030) [2024-06-15 17:51:48,674][1653645] Updated weights for policy 0, policy_version 491771 (0.0013) [2024-06-15 17:51:50,198][1653645] Updated weights for policy 0, policy_version 491836 (0.0012) [2024-06-15 17:51:50,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 45886.3, 300 sec: 44209.0). Total num frames: 1007288320. Throughput: 0: 11491.5. Samples: 251855872. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:51:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:51:55,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1007419392. Throughput: 0: 11389.1. Samples: 251919360. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:51:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:51:56,509][1653645] Updated weights for policy 0, policy_version 491908 (0.0014) [2024-06-15 17:51:57,685][1653645] Updated weights for policy 0, policy_version 491967 (0.0014) [2024-06-15 17:52:00,658][1653645] Updated weights for policy 0, policy_version 492032 (0.0013) [2024-06-15 17:52:00,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 45875.3, 300 sec: 44209.0). Total num frames: 1007681536. Throughput: 0: 11502.9. Samples: 251990528. 
Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:52:01,931][1653645] Updated weights for policy 0, policy_version 492092 (0.0019) [2024-06-15 17:52:05,597][1653645] Updated weights for policy 0, policy_version 492153 (0.0013) [2024-06-15 17:52:05,958][1648982] Fps is (10 sec: 52430.7, 60 sec: 46421.4, 300 sec: 44431.2). Total num frames: 1007943680. Throughput: 0: 11411.9. Samples: 252023808. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:52:08,985][1653645] Updated weights for policy 0, policy_version 492220 (0.0037) [2024-06-15 17:52:10,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 45875.0, 300 sec: 44320.0). Total num frames: 1008074752. Throughput: 0: 11366.4. Samples: 252089856. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:52:11,875][1653645] Updated weights for policy 0, policy_version 492274 (0.0013) [2024-06-15 17:52:13,455][1653645] Updated weights for policy 0, policy_version 492342 (0.0021) [2024-06-15 17:52:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 1008336896. Throughput: 0: 11332.3. Samples: 252156928. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:52:17,830][1653645] Updated weights for policy 0, policy_version 492410 (0.0014) [2024-06-15 17:52:20,904][1653645] Updated weights for policy 0, policy_version 492464 (0.0012) [2024-06-15 17:52:20,961][1648982] Fps is (10 sec: 49138.4, 60 sec: 45326.7, 300 sec: 44764.0). Total num frames: 1008566272. Throughput: 0: 11183.6. Samples: 252188160. 
Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:20,962][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:52:23,311][1653645] Updated weights for policy 0, policy_version 492497 (0.0011) [2024-06-15 17:52:23,652][1651596] Signal inference workers to stop experience collection... (25600 times) [2024-06-15 17:52:23,683][1653645] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-06-15 17:52:23,952][1651596] Signal inference workers to resume experience collection... (25600 times) [2024-06-15 17:52:23,952][1653645] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-06-15 17:52:25,295][1653645] Updated weights for policy 0, policy_version 492581 (0.0012) [2024-06-15 17:52:25,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 46421.5, 300 sec: 44431.2). Total num frames: 1008861184. Throughput: 0: 11218.5. Samples: 252252672. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:52:29,111][1653645] Updated weights for policy 0, policy_version 492612 (0.0018) [2024-06-15 17:52:30,279][1653645] Updated weights for policy 0, policy_version 492669 (0.0011) [2024-06-15 17:52:30,958][1648982] Fps is (10 sec: 42611.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1008992256. Throughput: 0: 11002.3. Samples: 252318720. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:52:32,620][1653645] Updated weights for policy 0, policy_version 492736 (0.0012) [2024-06-15 17:52:35,808][1653645] Updated weights for policy 0, policy_version 492792 (0.0012) [2024-06-15 17:52:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 1009254400. Throughput: 0: 11047.9. Samples: 252353024. 
Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:52:37,772][1653645] Updated weights for policy 0, policy_version 492863 (0.0015) [2024-06-15 17:52:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 43987.0). Total num frames: 1009385472. Throughput: 0: 11070.7. Samples: 252417536. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:52:42,376][1653645] Updated weights for policy 0, policy_version 492917 (0.0015) [2024-06-15 17:52:44,273][1653645] Updated weights for policy 0, policy_version 492976 (0.0014) [2024-06-15 17:52:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 1009647616. Throughput: 0: 11047.8. Samples: 252487680. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:52:46,783][1653645] Updated weights for policy 0, policy_version 493024 (0.0010) [2024-06-15 17:52:49,228][1653645] Updated weights for policy 0, policy_version 493094 (0.0012) [2024-06-15 17:52:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1009909760. Throughput: 0: 11150.2. Samples: 252525568. Policy #0 lag: (min: 5.0, avg: 121.5, max: 261.0) [2024-06-15 17:52:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:52:52,685][1653645] Updated weights for policy 0, policy_version 493136 (0.0012) [2024-06-15 17:52:54,951][1653645] Updated weights for policy 0, policy_version 493188 (0.0013) [2024-06-15 17:52:55,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.2, 300 sec: 44764.4). Total num frames: 1010139136. Throughput: 0: 11047.9. Samples: 252587008. 
Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:52:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:52:56,098][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000493248_1010171904.pth... [2024-06-15 17:52:56,181][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000488064_999555072.pth [2024-06-15 17:52:57,773][1653645] Updated weights for policy 0, policy_version 493267 (0.0013) [2024-06-15 17:52:58,872][1653645] Updated weights for policy 0, policy_version 493312 (0.0010) [2024-06-15 17:53:00,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1010401280. Throughput: 0: 11184.4. Samples: 252660224. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:53:03,768][1653645] Updated weights for policy 0, policy_version 493377 (0.0012) [2024-06-15 17:53:05,499][1653645] Updated weights for policy 0, policy_version 493445 (0.0101) [2024-06-15 17:53:05,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1010597888. Throughput: 0: 11333.1. Samples: 252698112. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:53:06,659][1653645] Updated weights for policy 0, policy_version 493502 (0.0064) [2024-06-15 17:53:09,314][1651596] Signal inference workers to stop experience collection... (25650 times) [2024-06-15 17:53:09,365][1653645] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-06-15 17:53:09,574][1651596] Signal inference workers to resume experience collection... 
(25650 times) [2024-06-15 17:53:09,574][1653645] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-06-15 17:53:09,732][1653645] Updated weights for policy 0, policy_version 493557 (0.0017) [2024-06-15 17:53:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.4, 300 sec: 44764.8). Total num frames: 1010827264. Throughput: 0: 11343.6. Samples: 252763136. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:53:12,615][1653645] Updated weights for policy 0, policy_version 493616 (0.0019) [2024-06-15 17:53:15,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1010958336. Throughput: 0: 11457.4. Samples: 252834304. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:53:16,508][1653645] Updated weights for policy 0, policy_version 493664 (0.0241) [2024-06-15 17:53:18,680][1653645] Updated weights for policy 0, policy_version 493758 (0.0012) [2024-06-15 17:53:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44785.3, 300 sec: 44875.5). Total num frames: 1011253248. Throughput: 0: 11150.3. Samples: 252854784. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:53:21,749][1653645] Updated weights for policy 0, policy_version 493815 (0.0129) [2024-06-15 17:53:25,174][1653645] Updated weights for policy 0, policy_version 493858 (0.0015) [2024-06-15 17:53:25,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1011482624. Throughput: 0: 11525.7. Samples: 252936192. 
Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:53:28,583][1653645] Updated weights for policy 0, policy_version 493936 (0.0015) [2024-06-15 17:53:30,119][1653645] Updated weights for policy 0, policy_version 494009 (0.0118) [2024-06-15 17:53:30,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1011744768. Throughput: 0: 11195.7. Samples: 252991488. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:53:32,259][1653645] Updated weights for policy 0, policy_version 494064 (0.0011) [2024-06-15 17:53:35,965][1648982] Fps is (10 sec: 39294.9, 60 sec: 43685.7, 300 sec: 44319.1). Total num frames: 1011875840. Throughput: 0: 11285.0. Samples: 253033472. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:35,965][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:53:36,766][1653645] Updated weights for policy 0, policy_version 494128 (0.0019) [2024-06-15 17:53:39,364][1653645] Updated weights for policy 0, policy_version 494176 (0.0036) [2024-06-15 17:53:40,946][1653645] Updated weights for policy 0, policy_version 494240 (0.0014) [2024-06-15 17:53:40,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 46967.5, 300 sec: 44653.3). Total num frames: 1012203520. Throughput: 0: 11548.5. Samples: 253106688. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:53:42,042][1653645] Updated weights for policy 0, policy_version 494275 (0.0011) [2024-06-15 17:53:45,958][1648982] Fps is (10 sec: 52462.8, 60 sec: 45874.9, 300 sec: 44986.5). Total num frames: 1012400128. Throughput: 0: 11468.7. Samples: 253176320. 
Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:45,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:53:47,738][1653645] Updated weights for policy 0, policy_version 494352 (0.0013) [2024-06-15 17:53:50,167][1653645] Updated weights for policy 0, policy_version 494401 (0.0013) [2024-06-15 17:53:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1012596736. Throughput: 0: 11389.1. Samples: 253210624. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:53:52,101][1653645] Updated weights for policy 0, policy_version 494481 (0.0135) [2024-06-15 17:53:53,711][1651596] Signal inference workers to stop experience collection... (25700 times) [2024-06-15 17:53:53,794][1653645] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-06-15 17:53:53,983][1651596] Signal inference workers to resume experience collection... (25700 times) [2024-06-15 17:53:53,983][1653645] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-06-15 17:53:53,985][1653645] Updated weights for policy 0, policy_version 494544 (0.0013) [2024-06-15 17:53:55,046][1653645] Updated weights for policy 0, policy_version 494592 (0.0013) [2024-06-15 17:53:55,959][1648982] Fps is (10 sec: 52423.6, 60 sec: 46420.4, 300 sec: 45319.6). Total num frames: 1012924416. Throughput: 0: 11195.4. Samples: 253266944. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:53:55,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:54:00,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1013022720. Throughput: 0: 11252.6. Samples: 253340672. 
Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:54:00,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:54:03,619][1653645] Updated weights for policy 0, policy_version 494704 (0.0122) [2024-06-15 17:54:05,440][1653645] Updated weights for policy 0, policy_version 494778 (0.0015) [2024-06-15 17:54:05,958][1648982] Fps is (10 sec: 39326.0, 60 sec: 45328.8, 300 sec: 44875.8). Total num frames: 1013317632. Throughput: 0: 11434.6. Samples: 253369344. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:54:05,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:54:06,796][1653645] Updated weights for policy 0, policy_version 494832 (0.0011) [2024-06-15 17:54:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1013448704. Throughput: 0: 10934.0. Samples: 253428224. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:54:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:54:14,317][1653645] Updated weights for policy 0, policy_version 494911 (0.0012) [2024-06-15 17:54:15,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 1013678080. Throughput: 0: 11207.1. Samples: 253495808. Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:54:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:54:16,612][1653645] Updated weights for policy 0, policy_version 494983 (0.0013) [2024-06-15 17:54:17,647][1653645] Updated weights for policy 0, policy_version 495029 (0.0054) [2024-06-15 17:54:18,610][1653645] Updated weights for policy 0, policy_version 495072 (0.0035) [2024-06-15 17:54:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45328.9, 300 sec: 44986.6). Total num frames: 1013972992. Throughput: 0: 10867.4. Samples: 253522432. 
Policy #0 lag: (min: 63.0, avg: 167.8, max: 319.0) [2024-06-15 17:54:20,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:54:25,648][1653645] Updated weights for policy 0, policy_version 495137 (0.0014) [2024-06-15 17:54:25,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1014071296. Throughput: 0: 10968.2. Samples: 253600256. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:54:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:54:26,882][1653645] Updated weights for policy 0, policy_version 495169 (0.0023) [2024-06-15 17:54:28,763][1653645] Updated weights for policy 0, policy_version 495248 (0.0014) [2024-06-15 17:54:30,187][1653645] Updated weights for policy 0, policy_version 495305 (0.0032) [2024-06-15 17:54:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 1014431744. Throughput: 0: 10581.4. Samples: 253652480. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:54:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:54:31,330][1653645] Updated weights for policy 0, policy_version 495360 (0.0012) [2024-06-15 17:54:35,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43695.6, 300 sec: 44431.2). Total num frames: 1014497280. Throughput: 0: 10626.8. Samples: 253688832. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:54:35,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:54:37,814][1653645] Updated weights for policy 0, policy_version 495417 (0.0017) [2024-06-15 17:54:39,713][1651596] Signal inference workers to stop experience collection... (25750 times) [2024-06-15 17:54:39,762][1653645] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-06-15 17:54:40,025][1651596] Signal inference workers to resume experience collection... 
(25750 times) [2024-06-15 17:54:40,025][1653645] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-06-15 17:54:40,758][1653645] Updated weights for policy 0, policy_version 495493 (0.0012) [2024-06-15 17:54:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 44542.3). Total num frames: 1014792192. Throughput: 0: 10957.1. Samples: 253760000. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:54:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:54:42,027][1653645] Updated weights for policy 0, policy_version 495553 (0.0088) [2024-06-15 17:54:45,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.9, 300 sec: 44542.3). Total num frames: 1015021568. Throughput: 0: 10638.2. Samples: 253819392. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:54:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:54:49,131][1653645] Updated weights for policy 0, policy_version 495632 (0.0012) [2024-06-15 17:54:50,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 1015152640. Throughput: 0: 10979.6. Samples: 253863424. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:54:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:54:51,577][1653645] Updated weights for policy 0, policy_version 495683 (0.0011) [2024-06-15 17:54:52,900][1653645] Updated weights for policy 0, policy_version 495747 (0.0013) [2024-06-15 17:54:54,414][1653645] Updated weights for policy 0, policy_version 495810 (0.0011) [2024-06-15 17:54:55,857][1653645] Updated weights for policy 0, policy_version 495870 (0.0011) [2024-06-15 17:54:55,958][1648982] Fps is (10 sec: 52426.5, 60 sec: 43691.3, 300 sec: 45208.7). Total num frames: 1015545856. Throughput: 0: 10979.5. Samples: 253922304. 
Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:54:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:54:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000495872_1015545856.pth... [2024-06-15 17:54:56,018][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000490624_1004797952.pth [2024-06-15 17:55:00,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 42052.1, 300 sec: 44097.9). Total num frames: 1015545856. Throughput: 0: 11127.4. Samples: 253996544. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:00,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:55:02,177][1653645] Updated weights for policy 0, policy_version 495934 (0.0025) [2024-06-15 17:55:05,377][1653645] Updated weights for policy 0, policy_version 496032 (0.0092) [2024-06-15 17:55:05,958][1648982] Fps is (10 sec: 36046.0, 60 sec: 43144.7, 300 sec: 44764.4). Total num frames: 1015906304. Throughput: 0: 11218.5. Samples: 254027264. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:55:07,423][1653645] Updated weights for policy 0, policy_version 496128 (0.0088) [2024-06-15 17:55:10,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1016070144. Throughput: 0: 10683.7. Samples: 254081024. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:55:14,389][1653645] Updated weights for policy 0, policy_version 496190 (0.0014) [2024-06-15 17:55:15,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.6, 300 sec: 44210.9). Total num frames: 1016266752. Throughput: 0: 11161.6. Samples: 254154752. 
Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:55:16,525][1653645] Updated weights for policy 0, policy_version 496245 (0.0014) [2024-06-15 17:55:18,512][1653645] Updated weights for policy 0, policy_version 496340 (0.0124) [2024-06-15 17:55:18,742][1651596] Signal inference workers to stop experience collection... (25800 times) [2024-06-15 17:55:18,806][1653645] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-06-15 17:55:19,008][1651596] Signal inference workers to resume experience collection... (25800 times) [2024-06-15 17:55:19,008][1653645] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-06-15 17:55:19,353][1653645] Updated weights for policy 0, policy_version 496381 (0.0013) [2024-06-15 17:55:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44875.8). Total num frames: 1016594432. Throughput: 0: 10820.3. Samples: 254175744. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:55:25,239][1653645] Updated weights for policy 0, policy_version 496448 (0.0013) [2024-06-15 17:55:25,966][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1016725504. Throughput: 0: 11241.2. Samples: 254265856. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:25,983][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:55:28,133][1653645] Updated weights for policy 0, policy_version 496544 (0.0011) [2024-06-15 17:55:29,873][1653645] Updated weights for policy 0, policy_version 496624 (0.0014) [2024-06-15 17:55:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1017118720. Throughput: 0: 11150.2. Samples: 254321152. 
Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:55:35,962][1648982] Fps is (10 sec: 42579.3, 60 sec: 44233.5, 300 sec: 44542.3). Total num frames: 1017151488. Throughput: 0: 11069.5. Samples: 254361600. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:35,963][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:55:36,196][1653645] Updated weights for policy 0, policy_version 496672 (0.0031) [2024-06-15 17:55:38,731][1653645] Updated weights for policy 0, policy_version 496736 (0.0013) [2024-06-15 17:55:40,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44782.9, 300 sec: 44764.7). Total num frames: 1017479168. Throughput: 0: 11241.3. Samples: 254428160. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:55:41,172][1653645] Updated weights for policy 0, policy_version 496833 (0.0014) [2024-06-15 17:55:42,764][1653645] Updated weights for policy 0, policy_version 496892 (0.0011) [2024-06-15 17:55:45,958][1648982] Fps is (10 sec: 49174.3, 60 sec: 43690.6, 300 sec: 44433.4). Total num frames: 1017643008. Throughput: 0: 11093.4. Samples: 254495744. Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:45,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:55:48,984][1653645] Updated weights for policy 0, policy_version 496959 (0.0013) [2024-06-15 17:55:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1017872384. Throughput: 0: 11229.9. Samples: 254532608. 
Policy #0 lag: (min: 0.0, avg: 63.0, max: 256.0) [2024-06-15 17:55:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:55:51,985][1653645] Updated weights for policy 0, policy_version 497043 (0.0018) [2024-06-15 17:55:54,526][1653645] Updated weights for policy 0, policy_version 497139 (0.0078) [2024-06-15 17:55:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 1018167296. Throughput: 0: 10990.9. Samples: 254575616. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:55:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:56:00,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 44237.0, 300 sec: 44209.0). Total num frames: 1018200064. Throughput: 0: 11138.9. Samples: 254656000. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:56:01,291][1653645] Updated weights for policy 0, policy_version 497185 (0.0012) [2024-06-15 17:56:03,218][1653645] Updated weights for policy 0, policy_version 497235 (0.0014) [2024-06-15 17:56:04,570][1651596] Signal inference workers to stop experience collection... (25850 times) [2024-06-15 17:56:04,643][1653645] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-06-15 17:56:04,777][1651596] Signal inference workers to resume experience collection... (25850 times) [2024-06-15 17:56:04,777][1653645] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-06-15 17:56:04,958][1653645] Updated weights for policy 0, policy_version 497298 (0.0083) [2024-06-15 17:56:05,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1018560512. Throughput: 0: 11286.8. Samples: 254683648. 
Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 17:56:06,972][1653645] Updated weights for policy 0, policy_version 497377 (0.0012) [2024-06-15 17:56:10,958][1648982] Fps is (10 sec: 49150.1, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1018691584. Throughput: 0: 10558.5. Samples: 254740992. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:10,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 17:56:13,581][1653645] Updated weights for policy 0, policy_version 497449 (0.0069) [2024-06-15 17:56:14,947][1653645] Updated weights for policy 0, policy_version 497479 (0.0014) [2024-06-15 17:56:15,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 1018888192. Throughput: 0: 10979.6. Samples: 254815232. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:56:17,411][1653645] Updated weights for policy 0, policy_version 497568 (0.0013) [2024-06-15 17:56:19,296][1653645] Updated weights for policy 0, policy_version 497658 (0.0015) [2024-06-15 17:56:20,958][1648982] Fps is (10 sec: 52431.6, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1019215872. Throughput: 0: 10468.6. Samples: 254832640. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:56:25,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 1019281408. Throughput: 0: 10752.0. Samples: 254912000. 
Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 17:56:26,344][1653645] Updated weights for policy 0, policy_version 497722 (0.0015) [2024-06-15 17:56:27,726][1653645] Updated weights for policy 0, policy_version 497764 (0.0014) [2024-06-15 17:56:29,188][1653645] Updated weights for policy 0, policy_version 497824 (0.0026) [2024-06-15 17:56:30,948][1653645] Updated weights for policy 0, policy_version 497904 (0.0038) [2024-06-15 17:56:30,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43144.5, 300 sec: 44764.4). Total num frames: 1019707392. Throughput: 0: 10456.2. Samples: 254966272. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:30,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 17:56:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43147.8, 300 sec: 43875.8). Total num frames: 1019740160. Throughput: 0: 10501.7. Samples: 255005184. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:56:37,142][1653645] Updated weights for policy 0, policy_version 497955 (0.0013) [2024-06-15 17:56:39,312][1653645] Updated weights for policy 0, policy_version 498034 (0.0012) [2024-06-15 17:56:40,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1020100608. Throughput: 0: 11059.2. Samples: 255073280. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:56:41,511][1653645] Updated weights for policy 0, policy_version 498128 (0.0013) [2024-06-15 17:56:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1020264448. Throughput: 0: 10661.0. Samples: 255135744. 
Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 17:56:49,165][1653645] Updated weights for policy 0, policy_version 498177 (0.0012) [2024-06-15 17:56:49,595][1651596] Signal inference workers to stop experience collection... (25900 times) [2024-06-15 17:56:49,629][1653645] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-06-15 17:56:49,859][1651596] Signal inference workers to resume experience collection... (25900 times) [2024-06-15 17:56:49,859][1653645] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-06-15 17:56:50,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 42052.3, 300 sec: 43986.9). Total num frames: 1020395520. Throughput: 0: 10979.5. Samples: 255177728. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:56:51,481][1653645] Updated weights for policy 0, policy_version 498273 (0.0013) [2024-06-15 17:56:53,550][1653645] Updated weights for policy 0, policy_version 498354 (0.0011) [2024-06-15 17:56:55,085][1653645] Updated weights for policy 0, policy_version 498424 (0.0014) [2024-06-15 17:56:55,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1020788736. Throughput: 0: 10854.4. Samples: 255229440. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:56:55,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:56:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000498432_1020788736.pth... [2024-06-15 17:56:56,040][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000493248_1010171904.pth [2024-06-15 17:57:00,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 1020788736. Throughput: 0: 11036.4. Samples: 255311872. 
Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:57:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:57:02,676][1653645] Updated weights for policy 0, policy_version 498498 (0.0015) [2024-06-15 17:57:04,995][1653645] Updated weights for policy 0, policy_version 498592 (0.0014) [2024-06-15 17:57:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1021181952. Throughput: 0: 11229.8. Samples: 255337984. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:57:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:57:06,313][1653645] Updated weights for policy 0, policy_version 498656 (0.0026) [2024-06-15 17:57:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43691.0, 300 sec: 43986.9). Total num frames: 1021313024. Throughput: 0: 10865.8. Samples: 255400960. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:57:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:57:13,016][1653645] Updated weights for policy 0, policy_version 498704 (0.0012) [2024-06-15 17:57:14,148][1653645] Updated weights for policy 0, policy_version 498752 (0.0014) [2024-06-15 17:57:15,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 44782.9, 300 sec: 44098.4). Total num frames: 1021575168. Throughput: 0: 11173.0. Samples: 255469056. Policy #0 lag: (min: 74.0, avg: 159.3, max: 335.0) [2024-06-15 17:57:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:57:16,399][1653645] Updated weights for policy 0, policy_version 498832 (0.0053) [2024-06-15 17:57:18,836][1653645] Updated weights for policy 0, policy_version 498928 (0.0014) [2024-06-15 17:57:20,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1021837312. Throughput: 0: 10752.0. Samples: 255489024. 
Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:57:20,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:57:25,910][1653645] Updated weights for policy 0, policy_version 498964 (0.0014) [2024-06-15 17:57:25,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 1021870080. Throughput: 0: 10945.4. Samples: 255565824. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:57:26,007][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:57:27,484][1653645] Updated weights for policy 0, policy_version 499024 (0.0011) [2024-06-15 17:57:27,900][1651596] Signal inference workers to stop experience collection... (25950 times) [2024-06-15 17:57:27,963][1653645] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-06-15 17:57:28,258][1651596] Signal inference workers to resume experience collection... (25950 times) [2024-06-15 17:57:28,259][1653645] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-06-15 17:57:29,559][1653645] Updated weights for policy 0, policy_version 499104 (0.0156) [2024-06-15 17:57:30,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 43986.9). Total num frames: 1022230528. Throughput: 0: 10649.6. Samples: 255614976. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:57:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:57:31,885][1653645] Updated weights for policy 0, policy_version 499193 (0.0089) [2024-06-15 17:57:35,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1022361600. Throughput: 0: 10422.1. Samples: 255646720. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:57:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 17:57:39,025][1653645] Updated weights for policy 0, policy_version 499252 (0.0017) [2024-06-15 17:57:40,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 41506.1, 300 sec: 43875.8). 
Total num frames: 1022590976. Throughput: 0: 10991.0. Samples: 255724032. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:57:40,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:57:41,205][1653645] Updated weights for policy 0, policy_version 499328 (0.0012) [2024-06-15 17:57:43,845][1653645] Updated weights for policy 0, policy_version 499444 (0.0148) [2024-06-15 17:57:45,960][1648982] Fps is (10 sec: 52417.7, 60 sec: 43689.1, 300 sec: 43986.6). Total num frames: 1022885888. Throughput: 0: 10205.4. Samples: 255771136. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:57:45,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 17:57:50,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 42052.2, 300 sec: 43320.4). Total num frames: 1022918656. Throughput: 0: 10581.4. Samples: 255814144. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:57:50,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:57:51,973][1653645] Updated weights for policy 0, policy_version 499520 (0.0014) [2024-06-15 17:57:53,554][1653645] Updated weights for policy 0, policy_version 499600 (0.0013) [2024-06-15 17:57:55,221][1653645] Updated weights for policy 0, policy_version 499667 (0.0013) [2024-06-15 17:57:55,958][1648982] Fps is (10 sec: 49162.1, 60 sec: 43144.7, 300 sec: 43986.9). Total num frames: 1023377408. Throughput: 0: 10501.7. Samples: 255873536. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:57:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 17:58:00,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 43690.4, 300 sec: 43431.4). Total num frames: 1023410176. Throughput: 0: 10808.8. Samples: 255955456. 
Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:58:02,095][1653645] Updated weights for policy 0, policy_version 499728 (0.0013) [2024-06-15 17:58:04,451][1653645] Updated weights for policy 0, policy_version 499824 (0.0013) [2024-06-15 17:58:05,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 42598.6, 300 sec: 43764.7). Total num frames: 1023737856. Throughput: 0: 11036.5. Samples: 255985664. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:05,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 17:58:06,091][1651596] Signal inference workers to stop experience collection... (26000 times) [2024-06-15 17:58:06,139][1653645] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-06-15 17:58:06,376][1651596] Signal inference workers to resume experience collection... (26000 times) [2024-06-15 17:58:06,377][1653645] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-06-15 17:58:06,479][1653645] Updated weights for policy 0, policy_version 499904 (0.0112) [2024-06-15 17:58:07,778][1653645] Updated weights for policy 0, policy_version 499967 (0.0017) [2024-06-15 17:58:10,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1023934464. Throughput: 0: 10615.5. Samples: 256043520. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:58:15,050][1653645] Updated weights for policy 0, policy_version 500020 (0.0013) [2024-06-15 17:58:15,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 42052.1, 300 sec: 43542.5). Total num frames: 1024098304. Throughput: 0: 11207.0. Samples: 256119296. 
Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:15,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 17:58:16,916][1653645] Updated weights for policy 0, policy_version 500098 (0.0015) [2024-06-15 17:58:19,390][1653645] Updated weights for policy 0, policy_version 500195 (0.0016) [2024-06-15 17:58:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1024458752. Throughput: 0: 11002.3. Samples: 256141824. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:20,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 17:58:25,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.5, 300 sec: 43209.3). Total num frames: 1024491520. Throughput: 0: 10888.5. Samples: 256214016. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:25,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 17:58:25,969][1653645] Updated weights for policy 0, policy_version 500256 (0.0012) [2024-06-15 17:58:27,329][1653645] Updated weights for policy 0, policy_version 500305 (0.0015) [2024-06-15 17:58:28,942][1653645] Updated weights for policy 0, policy_version 500374 (0.0098) [2024-06-15 17:58:30,175][1653645] Updated weights for policy 0, policy_version 500432 (0.0012) [2024-06-15 17:58:30,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 45329.1, 300 sec: 44321.2). Total num frames: 1024950272. Throughput: 0: 11173.5. Samples: 256273920. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 17:58:31,231][1653645] Updated weights for policy 0, policy_version 500478 (0.0025) [2024-06-15 17:58:35,958][1648982] Fps is (10 sec: 49153.2, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 1024983040. Throughput: 0: 10968.2. Samples: 256307712. 
Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:58:39,168][1653645] Updated weights for policy 0, policy_version 500532 (0.0012) [2024-06-15 17:58:40,678][1653645] Updated weights for policy 0, policy_version 500593 (0.0012) [2024-06-15 17:58:40,958][1648982] Fps is (10 sec: 29489.8, 60 sec: 44236.6, 300 sec: 43542.6). Total num frames: 1025245184. Throughput: 0: 11286.7. Samples: 256381440. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:40,961][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 17:58:42,552][1653645] Updated weights for policy 0, policy_version 500672 (0.0013) [2024-06-15 17:58:43,944][1653645] Updated weights for policy 0, policy_version 500732 (0.0187) [2024-06-15 17:58:45,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43692.1, 300 sec: 43764.7). Total num frames: 1025507328. Throughput: 0: 10752.0. Samples: 256439296. Policy #0 lag: (min: 111.0, avg: 238.7, max: 367.0) [2024-06-15 17:58:45,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 17:58:50,267][1651596] Signal inference workers to stop experience collection... (26050 times) [2024-06-15 17:58:50,306][1653645] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-06-15 17:58:50,529][1651596] Signal inference workers to resume experience collection... (26050 times) [2024-06-15 17:58:50,530][1653645] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-06-15 17:58:50,958][1648982] Fps is (10 sec: 32769.0, 60 sec: 44236.9, 300 sec: 42876.3). Total num frames: 1025572864. Throughput: 0: 10865.8. Samples: 256474624. 
Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:58:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 17:58:51,564][1653645] Updated weights for policy 0, policy_version 500803 (0.0013) [2024-06-15 17:58:53,132][1653645] Updated weights for policy 0, policy_version 500866 (0.0012) [2024-06-15 17:58:55,182][1653645] Updated weights for policy 0, policy_version 500945 (0.0013) [2024-06-15 17:58:55,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 43875.8). Total num frames: 1025966080. Throughput: 0: 10911.3. Samples: 256534528. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:58:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 17:58:56,336][1653645] Updated weights for policy 0, policy_version 500991 (0.0015) [2024-06-15 17:58:56,343][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000500992_1026031616.pth... [2024-06-15 17:58:56,409][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000495872_1015545856.pth [2024-06-15 17:59:00,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 1026031616. Throughput: 0: 10717.9. Samples: 256601600. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:00,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 17:59:03,775][1653645] Updated weights for policy 0, policy_version 501046 (0.0132) [2024-06-15 17:59:05,579][1653645] Updated weights for policy 0, policy_version 501121 (0.0013) [2024-06-15 17:59:05,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 1026326528. Throughput: 0: 11002.3. Samples: 256636928. 
Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 17:59:07,218][1653645] Updated weights for policy 0, policy_version 501184 (0.0010) [2024-06-15 17:59:08,189][1653645] Updated weights for policy 0, policy_version 501221 (0.0013) [2024-06-15 17:59:10,958][1648982] Fps is (10 sec: 52430.8, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 1026555904. Throughput: 0: 10683.8. Samples: 256694784. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 17:59:14,591][1653645] Updated weights for policy 0, policy_version 501264 (0.0125) [2024-06-15 17:59:15,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43144.7, 300 sec: 43098.3). Total num frames: 1026686976. Throughput: 0: 10979.5. Samples: 256768000. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:15,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 17:59:16,344][1653645] Updated weights for policy 0, policy_version 501344 (0.0015) [2024-06-15 17:59:18,595][1653645] Updated weights for policy 0, policy_version 501424 (0.0104) [2024-06-15 17:59:20,135][1653645] Updated weights for policy 0, policy_version 501493 (0.0014) [2024-06-15 17:59:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1027080192. Throughput: 0: 10808.9. Samples: 256794112. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:20,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 17:59:25,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 1027080192. Throughput: 0: 10683.8. Samples: 256862208. 
Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 17:59:26,940][1653645] Updated weights for policy 0, policy_version 501536 (0.0013) [2024-06-15 17:59:27,837][1653645] Updated weights for policy 0, policy_version 501568 (0.0081) [2024-06-15 17:59:28,795][1651596] Signal inference workers to stop experience collection... (26100 times) [2024-06-15 17:59:28,839][1653645] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-06-15 17:59:29,028][1651596] Signal inference workers to resume experience collection... (26100 times) [2024-06-15 17:59:29,029][1653645] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-06-15 17:59:29,638][1653645] Updated weights for policy 0, policy_version 501648 (0.0013) [2024-06-15 17:59:30,959][1648982] Fps is (10 sec: 39317.0, 60 sec: 42051.4, 300 sec: 43986.7). Total num frames: 1027473408. Throughput: 0: 10785.9. Samples: 256924672. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:30,960][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 17:59:31,576][1653645] Updated weights for policy 0, policy_version 501730 (0.0134) [2024-06-15 17:59:35,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 1027604480. Throughput: 0: 10592.7. Samples: 256951296. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:59:38,984][1653645] Updated weights for policy 0, policy_version 501792 (0.0013) [2024-06-15 17:59:40,958][1648982] Fps is (10 sec: 32771.0, 60 sec: 42598.5, 300 sec: 43320.4). Total num frames: 1027801088. Throughput: 0: 10968.1. Samples: 257028096. 
Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:40,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 17:59:41,284][1653645] Updated weights for policy 0, policy_version 501877 (0.0015) [2024-06-15 17:59:43,709][1653645] Updated weights for policy 0, policy_version 501968 (0.0136) [2024-06-15 17:59:45,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1028128768. Throughput: 0: 10547.3. Samples: 257076224. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 17:59:50,407][1653645] Updated weights for policy 0, policy_version 502019 (0.0013) [2024-06-15 17:59:50,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 43690.7, 300 sec: 42876.2). Total num frames: 1028194304. Throughput: 0: 10672.4. Samples: 257117184. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:50,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 17:59:51,489][1653645] Updated weights for policy 0, policy_version 502078 (0.0012) [2024-06-15 17:59:54,254][1653645] Updated weights for policy 0, policy_version 502176 (0.0115) [2024-06-15 17:59:55,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 1028587520. Throughput: 0: 10956.8. Samples: 257187840. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 17:59:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 17:59:56,137][1653645] Updated weights for policy 0, policy_version 502256 (0.0030) [2024-06-15 18:00:00,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43691.0, 300 sec: 43209.3). Total num frames: 1028653056. Throughput: 0: 10797.5. Samples: 257253888. 
Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 18:00:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:00:02,963][1653645] Updated weights for policy 0, policy_version 502306 (0.0013) [2024-06-15 18:00:04,743][1653645] Updated weights for policy 0, policy_version 502369 (0.0023) [2024-06-15 18:00:05,958][1648982] Fps is (10 sec: 36043.5, 60 sec: 43690.4, 300 sec: 43653.6). Total num frames: 1028947968. Throughput: 0: 11059.1. Samples: 257291776. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 18:00:05,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:00:06,018][1653645] Updated weights for policy 0, policy_version 502432 (0.0012) [2024-06-15 18:00:07,640][1653645] Updated weights for policy 0, policy_version 502496 (0.0012) [2024-06-15 18:00:07,755][1651596] Signal inference workers to stop experience collection... (26150 times) [2024-06-15 18:00:07,806][1653645] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-06-15 18:00:08,032][1651596] Signal inference workers to resume experience collection... (26150 times) [2024-06-15 18:00:08,042][1653645] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-06-15 18:00:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1029177344. Throughput: 0: 10877.2. Samples: 257351680. Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 18:00:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:00:13,809][1653645] Updated weights for policy 0, policy_version 502544 (0.0105) [2024-06-15 18:00:14,768][1653645] Updated weights for policy 0, policy_version 502592 (0.0011) [2024-06-15 18:00:15,960][1648982] Fps is (10 sec: 39322.8, 60 sec: 44236.8, 300 sec: 43209.3). Total num frames: 1029341184. Throughput: 0: 11241.5. Samples: 257430528. 
Policy #0 lag: (min: 15.0, avg: 75.5, max: 271.0) [2024-06-15 18:00:15,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:00:17,067][1653645] Updated weights for policy 0, policy_version 502672 (0.0117) [2024-06-15 18:00:17,961][1653645] Updated weights for policy 0, policy_version 502710 (0.0017) [2024-06-15 18:00:19,620][1653645] Updated weights for policy 0, policy_version 502780 (0.0013) [2024-06-15 18:00:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1029701632. Throughput: 0: 11195.8. Samples: 257455104. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:00:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:00:25,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44236.6, 300 sec: 42765.0). Total num frames: 1029734400. Throughput: 0: 11104.7. Samples: 257527808. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:00:25,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:00:27,833][1653645] Updated weights for policy 0, policy_version 502850 (0.0126) [2024-06-15 18:00:29,370][1653645] Updated weights for policy 0, policy_version 502916 (0.0013) [2024-06-15 18:00:30,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43691.5, 300 sec: 43876.5). Total num frames: 1030094848. Throughput: 0: 11241.2. Samples: 257582080. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:00:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:00:31,564][1653645] Updated weights for policy 0, policy_version 502996 (0.0040) [2024-06-15 18:00:35,958][1648982] Fps is (10 sec: 49153.2, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 1030225920. Throughput: 0: 10899.9. Samples: 257607680. 
Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:00:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:00:38,715][1653645] Updated weights for policy 0, policy_version 503088 (0.0014) [2024-06-15 18:00:40,648][1653645] Updated weights for policy 0, policy_version 503152 (0.0012) [2024-06-15 18:00:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44236.9, 300 sec: 43431.5). Total num frames: 1030455296. Throughput: 0: 10979.5. Samples: 257681920. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:00:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:00:42,045][1653645] Updated weights for policy 0, policy_version 503203 (0.0014) [2024-06-15 18:00:44,369][1653645] Updated weights for policy 0, policy_version 503290 (0.0100) [2024-06-15 18:00:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 1030750208. Throughput: 0: 10740.6. Samples: 257737216. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:00:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:00:50,556][1653645] Updated weights for policy 0, policy_version 503323 (0.0013) [2024-06-15 18:00:50,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 1030815744. Throughput: 0: 10763.5. Samples: 257776128. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:00:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:00:52,098][1653645] Updated weights for policy 0, policy_version 503376 (0.0121) [2024-06-15 18:00:53,490][1651596] Signal inference workers to stop experience collection... (26200 times) [2024-06-15 18:00:53,539][1653645] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-06-15 18:00:53,762][1651596] Signal inference workers to resume experience collection... 
(26200 times) [2024-06-15 18:00:53,763][1653645] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-06-15 18:00:54,164][1653645] Updated weights for policy 0, policy_version 503456 (0.0011) [2024-06-15 18:00:55,535][1653645] Updated weights for policy 0, policy_version 503505 (0.0013) [2024-06-15 18:00:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 1031208960. Throughput: 0: 10820.2. Samples: 257838592. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:00:55,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:00:56,429][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000503552_1031274496.pth... [2024-06-15 18:00:56,475][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000498432_1020788736.pth [2024-06-15 18:01:00,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 1031274496. Throughput: 0: 10626.8. Samples: 257908736. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:00,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:01:02,082][1653645] Updated weights for policy 0, policy_version 503571 (0.0015) [2024-06-15 18:01:02,977][1653645] Updated weights for policy 0, policy_version 503614 (0.0027) [2024-06-15 18:01:05,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 1031536640. Throughput: 0: 10899.9. Samples: 257945600. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:01:05,980][1653645] Updated weights for policy 0, policy_version 503696 (0.0012) [2024-06-15 18:01:08,312][1653645] Updated weights for policy 0, policy_version 503798 (0.0203) [2024-06-15 18:01:10,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 1031798784. Throughput: 0: 10490.3. 
Samples: 257999872. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:01:15,113][1653645] Updated weights for policy 0, policy_version 503840 (0.0025) [2024-06-15 18:01:15,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 1031929856. Throughput: 0: 10888.5. Samples: 258072064. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:01:16,577][1653645] Updated weights for policy 0, policy_version 503874 (0.0028) [2024-06-15 18:01:18,551][1653645] Updated weights for policy 0, policy_version 503956 (0.0018) [2024-06-15 18:01:19,658][1653645] Updated weights for policy 0, policy_version 504016 (0.0012) [2024-06-15 18:01:20,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 1032323072. Throughput: 0: 10877.1. Samples: 258097152. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:01:25,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1032323072. Throughput: 0: 10763.3. Samples: 258166272. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:01:27,184][1653645] Updated weights for policy 0, policy_version 504098 (0.0108) [2024-06-15 18:01:30,254][1653645] Updated weights for policy 0, policy_version 504181 (0.0013) [2024-06-15 18:01:30,958][1648982] Fps is (10 sec: 29491.6, 60 sec: 42052.3, 300 sec: 43653.6). Total num frames: 1032617984. Throughput: 0: 10922.7. Samples: 258228736. 
Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:01:31,859][1653645] Updated weights for policy 0, policy_version 504243 (0.0012) [2024-06-15 18:01:32,971][1653645] Updated weights for policy 0, policy_version 504293 (0.0012) [2024-06-15 18:01:35,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.5, 300 sec: 43209.3). Total num frames: 1032847360. Throughput: 0: 10660.9. Samples: 258255872. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:01:38,767][1651596] Signal inference workers to stop experience collection... (26250 times) [2024-06-15 18:01:38,861][1653645] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-06-15 18:01:38,862][1653645] Updated weights for policy 0, policy_version 504338 (0.0048) [2024-06-15 18:01:39,026][1651596] Signal inference workers to resume experience collection... (26250 times) [2024-06-15 18:01:39,027][1653645] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-06-15 18:01:40,773][1653645] Updated weights for policy 0, policy_version 504391 (0.0013) [2024-06-15 18:01:40,958][1648982] Fps is (10 sec: 39320.0, 60 sec: 42598.2, 300 sec: 43209.3). Total num frames: 1033011200. Throughput: 0: 10979.5. Samples: 258332672. Policy #0 lag: (min: 76.0, avg: 176.9, max: 351.0) [2024-06-15 18:01:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:01:42,556][1653645] Updated weights for policy 0, policy_version 504455 (0.0012) [2024-06-15 18:01:44,046][1653645] Updated weights for policy 0, policy_version 504513 (0.0011) [2024-06-15 18:01:45,147][1653645] Updated weights for policy 0, policy_version 504576 (0.0013) [2024-06-15 18:01:45,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1033371648. Throughput: 0: 10763.4. Samples: 258393088. 
Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:01:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:01:50,958][1648982] Fps is (10 sec: 42599.8, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 1033437184. Throughput: 0: 10706.5. Samples: 258427392. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:01:50,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:01:51,331][1653645] Updated weights for policy 0, policy_version 504636 (0.0048) [2024-06-15 18:01:53,011][1653645] Updated weights for policy 0, policy_version 504677 (0.0018) [2024-06-15 18:01:54,702][1653645] Updated weights for policy 0, policy_version 504738 (0.0011) [2024-06-15 18:01:55,958][1648982] Fps is (10 sec: 45872.5, 60 sec: 43690.3, 300 sec: 44208.9). Total num frames: 1033830400. Throughput: 0: 10979.4. Samples: 258493952. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:01:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:01:56,287][1653645] Updated weights for policy 0, policy_version 504816 (0.0012) [2024-06-15 18:02:00,970][1648982] Fps is (10 sec: 45817.4, 60 sec: 43681.6, 300 sec: 43096.4). Total num frames: 1033895936. Throughput: 0: 10885.5. Samples: 258562048. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:00,971][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:02:02,747][1653645] Updated weights for policy 0, policy_version 504887 (0.0013) [2024-06-15 18:02:04,642][1653645] Updated weights for policy 0, policy_version 504928 (0.0030) [2024-06-15 18:02:05,750][1653645] Updated weights for policy 0, policy_version 504965 (0.0011) [2024-06-15 18:02:05,958][1648982] Fps is (10 sec: 36047.2, 60 sec: 44237.0, 300 sec: 43653.6). Total num frames: 1034190848. Throughput: 0: 11127.5. Samples: 258597888. 
Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:02:06,819][1653645] Updated weights for policy 0, policy_version 505027 (0.0102) [2024-06-15 18:02:10,961][1648982] Fps is (10 sec: 52495.1, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1034420224. Throughput: 0: 10968.2. Samples: 258659840. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:10,962][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:02:13,837][1653645] Updated weights for policy 0, policy_version 505104 (0.0017) [2024-06-15 18:02:14,996][1653645] Updated weights for policy 0, policy_version 505148 (0.0011) [2024-06-15 18:02:15,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 44236.7, 300 sec: 43209.3). Total num frames: 1034584064. Throughput: 0: 11229.8. Samples: 258734080. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:02:16,591][1653645] Updated weights for policy 0, policy_version 505200 (0.0012) [2024-06-15 18:02:17,893][1651596] Signal inference workers to stop experience collection... (26300 times) [2024-06-15 18:02:17,948][1653645] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-06-15 18:02:18,095][1651596] Signal inference workers to resume experience collection... (26300 times) [2024-06-15 18:02:18,097][1653645] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-06-15 18:02:18,099][1653645] Updated weights for policy 0, policy_version 505280 (0.0014) [2024-06-15 18:02:19,190][1653645] Updated weights for policy 0, policy_version 505340 (0.0012) [2024-06-15 18:02:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 1034944512. Throughput: 0: 11229.9. Samples: 258761216. 
Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:02:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 1034944512. Throughput: 0: 11150.2. Samples: 258834432. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:02:26,915][1653645] Updated weights for policy 0, policy_version 505392 (0.0026) [2024-06-15 18:02:27,849][1653645] Updated weights for policy 0, policy_version 505427 (0.0014) [2024-06-15 18:02:28,769][1653645] Updated weights for policy 0, policy_version 505476 (0.0089) [2024-06-15 18:02:30,332][1653645] Updated weights for policy 0, policy_version 505552 (0.0020) [2024-06-15 18:02:30,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 46421.3, 300 sec: 44209.0). Total num frames: 1035403264. Throughput: 0: 11207.1. Samples: 258897408. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:02:35,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.5, 300 sec: 43653.6). Total num frames: 1035468800. Throughput: 0: 11161.5. Samples: 258929664. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:35,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:02:37,045][1653645] Updated weights for policy 0, policy_version 505616 (0.0013) [2024-06-15 18:02:39,200][1653645] Updated weights for policy 0, policy_version 505698 (0.0013) [2024-06-15 18:02:40,754][1653645] Updated weights for policy 0, policy_version 505760 (0.0015) [2024-06-15 18:02:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 46421.6, 300 sec: 43765.0). Total num frames: 1035796480. Throughput: 0: 11309.7. Samples: 259002880. 
Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:02:42,718][1653645] Updated weights for policy 0, policy_version 505840 (0.0012) [2024-06-15 18:02:45,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1035993088. Throughput: 0: 11369.6. Samples: 259073536. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:02:48,166][1653645] Updated weights for policy 0, policy_version 505888 (0.0012) [2024-06-15 18:02:50,461][1653645] Updated weights for policy 0, policy_version 505952 (0.0012) [2024-06-15 18:02:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 46421.4, 300 sec: 43542.6). Total num frames: 1036222464. Throughput: 0: 11480.2. Samples: 259114496. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:50,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:02:52,759][1653645] Updated weights for policy 0, policy_version 506048 (0.0026) [2024-06-15 18:02:54,100][1653645] Updated weights for policy 0, policy_version 506100 (0.0011) [2024-06-15 18:02:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44783.3, 300 sec: 44431.2). Total num frames: 1036517376. Throughput: 0: 11366.4. Samples: 259171328. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:02:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:02:55,975][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000506112_1036517376.pth... [2024-06-15 18:02:56,025][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000500992_1026031616.pth [2024-06-15 18:02:59,842][1653645] Updated weights for policy 0, policy_version 506160 (0.0013) [2024-06-15 18:03:00,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45884.9, 300 sec: 43764.7). Total num frames: 1036648448. Throughput: 0: 11491.6. 
Samples: 259251200. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:03:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:03:01,905][1651596] Signal inference workers to stop experience collection... (26350 times) [2024-06-15 18:03:01,942][1653645] Updated weights for policy 0, policy_version 506210 (0.0014) [2024-06-15 18:03:02,009][1653645] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-06-15 18:03:02,160][1651596] Signal inference workers to resume experience collection... (26350 times) [2024-06-15 18:03:02,161][1653645] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-06-15 18:03:02,619][1653645] Updated weights for policy 0, policy_version 506241 (0.0013) [2024-06-15 18:03:04,569][1653645] Updated weights for policy 0, policy_version 506321 (0.0011) [2024-06-15 18:03:05,780][1653645] Updated weights for policy 0, policy_version 506368 (0.0012) [2024-06-15 18:03:05,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 47513.3, 300 sec: 44431.1). Total num frames: 1037041664. Throughput: 0: 11593.9. Samples: 259282944. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:03:05,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:03:10,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 1037074432. Throughput: 0: 11559.9. Samples: 259354624. Policy #0 lag: (min: 141.0, avg: 232.5, max: 397.0) [2024-06-15 18:03:10,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:03:11,615][1653645] Updated weights for policy 0, policy_version 506431 (0.0154) [2024-06-15 18:03:13,248][1653645] Updated weights for policy 0, policy_version 506495 (0.0013) [2024-06-15 18:03:15,936][1653645] Updated weights for policy 0, policy_version 506577 (0.0015) [2024-06-15 18:03:15,958][1648982] Fps is (10 sec: 42600.0, 60 sec: 48060.0, 300 sec: 44098.0). Total num frames: 1037467648. Throughput: 0: 11480.2. Samples: 259414016. 
Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:03:16,945][1653645] Updated weights for policy 0, policy_version 506622 (0.0011) [2024-06-15 18:03:20,959][1648982] Fps is (10 sec: 49152.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1037565952. Throughput: 0: 11605.4. Samples: 259451904. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:20,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:03:23,277][1653645] Updated weights for policy 0, policy_version 506672 (0.0014) [2024-06-15 18:03:24,592][1653645] Updated weights for policy 0, policy_version 506726 (0.0012) [2024-06-15 18:03:25,598][1653645] Updated weights for policy 0, policy_version 506772 (0.0011) [2024-06-15 18:03:25,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 49152.3, 300 sec: 43875.8). Total num frames: 1037893632. Throughput: 0: 11650.8. Samples: 259527168. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:03:27,798][1653645] Updated weights for policy 0, policy_version 506864 (0.0136) [2024-06-15 18:03:30,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 1038090240. Throughput: 0: 11559.8. Samples: 259593728. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:03:33,405][1653645] Updated weights for policy 0, policy_version 506896 (0.0011) [2024-06-15 18:03:35,335][1653645] Updated weights for policy 0, policy_version 506976 (0.0095) [2024-06-15 18:03:35,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 48060.0, 300 sec: 44431.2). Total num frames: 1038352384. Throughput: 0: 11616.7. Samples: 259637248. 
Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:35,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:03:36,697][1653645] Updated weights for policy 0, policy_version 507027 (0.0015) [2024-06-15 18:03:38,241][1653645] Updated weights for policy 0, policy_version 507104 (0.0015) [2024-06-15 18:03:40,958][1648982] Fps is (10 sec: 52425.2, 60 sec: 46966.8, 300 sec: 44431.1). Total num frames: 1038614528. Throughput: 0: 11605.1. Samples: 259693568. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:40,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:03:44,822][1651596] Signal inference workers to stop experience collection... (26400 times) [2024-06-15 18:03:44,877][1653645] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-06-15 18:03:44,998][1651596] Signal inference workers to resume experience collection... (26400 times) [2024-06-15 18:03:44,999][1653645] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-06-15 18:03:45,146][1653645] Updated weights for policy 0, policy_version 507169 (0.0051) [2024-06-15 18:03:45,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1038778368. Throughput: 0: 11650.8. Samples: 259775488. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:03:46,215][1653645] Updated weights for policy 0, policy_version 507232 (0.0012) [2024-06-15 18:03:47,573][1653645] Updated weights for policy 0, policy_version 507280 (0.0015) [2024-06-15 18:03:49,966][1653645] Updated weights for policy 0, policy_version 507386 (0.0013) [2024-06-15 18:03:50,958][1648982] Fps is (10 sec: 52433.1, 60 sec: 48605.8, 300 sec: 44653.3). Total num frames: 1039138816. Throughput: 0: 11616.8. Samples: 259805696. 
Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:03:55,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 1039204352. Throughput: 0: 11719.2. Samples: 259881984. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:03:55,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:03:56,240][1653645] Updated weights for policy 0, policy_version 507452 (0.0012) [2024-06-15 18:03:57,470][1653645] Updated weights for policy 0, policy_version 507489 (0.0013) [2024-06-15 18:03:59,091][1653645] Updated weights for policy 0, policy_version 507559 (0.0013) [2024-06-15 18:04:00,469][1653645] Updated weights for policy 0, policy_version 507600 (0.0022) [2024-06-15 18:04:00,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 49152.0, 300 sec: 44986.6). Total num frames: 1039597568. Throughput: 0: 11923.9. Samples: 259950592. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:04:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:04:05,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1039663104. Throughput: 0: 11889.8. Samples: 259986944. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:04:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:04:07,137][1653645] Updated weights for policy 0, policy_version 507686 (0.0013) [2024-06-15 18:04:08,467][1653645] Updated weights for policy 0, policy_version 507744 (0.0011) [2024-06-15 18:04:09,596][1653645] Updated weights for policy 0, policy_version 507779 (0.0058) [2024-06-15 18:04:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 49698.2, 300 sec: 45319.8). Total num frames: 1040056320. Throughput: 0: 11776.0. Samples: 260057088. 
Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:04:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:04:11,450][1653645] Updated weights for policy 0, policy_version 507841 (0.0017) [2024-06-15 18:04:15,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 1040187392. Throughput: 0: 11776.0. Samples: 260123648. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:04:15,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:04:17,590][1653645] Updated weights for policy 0, policy_version 507924 (0.0014) [2024-06-15 18:04:19,103][1653645] Updated weights for policy 0, policy_version 507974 (0.0050) [2024-06-15 18:04:20,741][1653645] Updated weights for policy 0, policy_version 508035 (0.0012) [2024-06-15 18:04:20,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 48605.9, 300 sec: 45430.9). Total num frames: 1040482304. Throughput: 0: 11616.7. Samples: 260160000. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:04:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:04:21,757][1653645] Updated weights for policy 0, policy_version 508091 (0.0054) [2024-06-15 18:04:23,770][1653645] Updated weights for policy 0, policy_version 508131 (0.0013) [2024-06-15 18:04:24,453][1653645] Updated weights for policy 0, policy_version 508160 (0.0014) [2024-06-15 18:04:25,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 46967.3, 300 sec: 44875.6). Total num frames: 1040711680. Throughput: 0: 11924.1. Samples: 260230144. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:04:25,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:04:27,863][1651596] Signal inference workers to stop experience collection... (26450 times) [2024-06-15 18:04:27,958][1653645] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-06-15 18:04:28,086][1651596] Signal inference workers to resume experience collection... 
(26450 times) [2024-06-15 18:04:28,087][1653645] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-06-15 18:04:28,364][1653645] Updated weights for policy 0, policy_version 508208 (0.0053) [2024-06-15 18:04:30,111][1653645] Updated weights for policy 0, policy_version 508246 (0.0013) [2024-06-15 18:04:30,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 48059.9, 300 sec: 45319.8). Total num frames: 1040973824. Throughput: 0: 11867.1. Samples: 260309504. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:04:30,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:04:31,627][1653645] Updated weights for policy 0, policy_version 508308 (0.0012) [2024-06-15 18:04:33,813][1653645] Updated weights for policy 0, policy_version 508356 (0.0013) [2024-06-15 18:04:35,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 48059.7, 300 sec: 45542.0). Total num frames: 1041235968. Throughput: 0: 12037.7. Samples: 260347392. Policy #0 lag: (min: 15.0, avg: 109.8, max: 271.0) [2024-06-15 18:04:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:04:38,404][1653645] Updated weights for policy 0, policy_version 508419 (0.0033) [2024-06-15 18:04:40,476][1653645] Updated weights for policy 0, policy_version 508482 (0.0013) [2024-06-15 18:04:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46422.0, 300 sec: 44986.6). Total num frames: 1041399808. Throughput: 0: 11855.6. Samples: 260415488. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:04:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:04:43,011][1653645] Updated weights for policy 0, policy_version 508567 (0.0017) [2024-06-15 18:04:43,845][1653645] Updated weights for policy 0, policy_version 508606 (0.0065) [2024-06-15 18:04:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 47513.7, 300 sec: 45542.0). Total num frames: 1041629184. Throughput: 0: 11867.0. Samples: 260484608. 
Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:04:45,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:04:47,209][1653645] Updated weights for policy 0, policy_version 508672 (0.0015) [2024-06-15 18:04:50,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1041858560. Throughput: 0: 11821.5. Samples: 260518912. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:04:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:04:51,052][1653645] Updated weights for policy 0, policy_version 508733 (0.0128) [2024-06-15 18:04:52,377][1653645] Updated weights for policy 0, policy_version 508794 (0.0014) [2024-06-15 18:04:55,211][1653645] Updated weights for policy 0, policy_version 508851 (0.0013) [2024-06-15 18:04:55,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 49151.7, 300 sec: 45764.1). Total num frames: 1042153472. Throughput: 0: 11753.2. Samples: 260585984. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:04:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:04:56,036][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000508864_1042153472.pth... [2024-06-15 18:04:56,085][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000503552_1031274496.pth [2024-06-15 18:04:58,425][1653645] Updated weights for policy 0, policy_version 508912 (0.0012) [2024-06-15 18:05:00,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44782.8, 300 sec: 45208.8). Total num frames: 1042284544. Throughput: 0: 11753.2. Samples: 260652544. 
Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:00,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 18:05:02,152][1653645] Updated weights for policy 0, policy_version 508965 (0.0012) [2024-06-15 18:05:04,029][1653645] Updated weights for policy 0, policy_version 509047 (0.0015) [2024-06-15 18:05:05,958][1648982] Fps is (10 sec: 42600.1, 60 sec: 48605.9, 300 sec: 45430.9). Total num frames: 1042579456. Throughput: 0: 11685.0. Samples: 260685824. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:05:06,622][1653645] Updated weights for policy 0, policy_version 509090 (0.0011) [2024-06-15 18:05:09,681][1653645] Updated weights for policy 0, policy_version 509143 (0.0011) [2024-06-15 18:05:10,552][1653645] Updated weights for policy 0, policy_version 509181 (0.0010) [2024-06-15 18:05:10,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1042808832. Throughput: 0: 11741.9. Samples: 260758528. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:10,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:05:14,593][1651596] Signal inference workers to stop experience collection... (26500 times) [2024-06-15 18:05:14,676][1653645] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-06-15 18:05:14,856][1651596] Signal inference workers to resume experience collection... (26500 times) [2024-06-15 18:05:14,856][1653645] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-06-15 18:05:15,040][1653645] Updated weights for policy 0, policy_version 509266 (0.0013) [2024-06-15 18:05:15,761][1653645] Updated weights for policy 0, policy_version 509312 (0.0082) [2024-06-15 18:05:15,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 48059.8, 300 sec: 45319.8). Total num frames: 1043070976. Throughput: 0: 11525.7. Samples: 260828160. 
Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:15,967][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:05:17,447][1653645] Updated weights for policy 0, policy_version 509370 (0.0016) [2024-06-15 18:05:20,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 46421.4, 300 sec: 45875.2). Total num frames: 1043267584. Throughput: 0: 11446.0. Samples: 260862464. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:05:21,507][1653645] Updated weights for policy 0, policy_version 509432 (0.0012) [2024-06-15 18:05:23,735][1653645] Updated weights for policy 0, policy_version 509475 (0.0018) [2024-06-15 18:05:25,633][1653645] Updated weights for policy 0, policy_version 509514 (0.0017) [2024-06-15 18:05:25,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46421.5, 300 sec: 45430.9). Total num frames: 1043496960. Throughput: 0: 11616.7. Samples: 260938240. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:05:27,501][1653645] Updated weights for policy 0, policy_version 509600 (0.0013) [2024-06-15 18:05:30,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 1043726336. Throughput: 0: 11605.3. Samples: 261006848. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:05:32,098][1653645] Updated weights for policy 0, policy_version 509648 (0.0013) [2024-06-15 18:05:33,214][1653645] Updated weights for policy 0, policy_version 509691 (0.0013) [2024-06-15 18:05:34,758][1653645] Updated weights for policy 0, policy_version 509730 (0.0013) [2024-06-15 18:05:35,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45875.2, 300 sec: 45875.2). Total num frames: 1043988480. Throughput: 0: 11662.2. Samples: 261043712. 
Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:05:37,030][1653645] Updated weights for policy 0, policy_version 509776 (0.0013) [2024-06-15 18:05:38,387][1653645] Updated weights for policy 0, policy_version 509826 (0.0014) [2024-06-15 18:05:39,528][1653645] Updated weights for policy 0, policy_version 509888 (0.0012) [2024-06-15 18:05:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 47513.4, 300 sec: 45764.1). Total num frames: 1044250624. Throughput: 0: 11639.5. Samples: 261109760. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:40,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:05:44,144][1653645] Updated weights for policy 0, policy_version 509944 (0.0014) [2024-06-15 18:05:45,858][1653645] Updated weights for policy 0, policy_version 509984 (0.0014) [2024-06-15 18:05:45,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 1044447232. Throughput: 0: 11821.5. Samples: 261184512. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:05:48,868][1653645] Updated weights for policy 0, policy_version 510039 (0.0014) [2024-06-15 18:05:50,915][1653645] Updated weights for policy 0, policy_version 510128 (0.0014) [2024-06-15 18:05:50,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 48059.7, 300 sec: 45875.2). Total num frames: 1044742144. Throughput: 0: 11787.4. Samples: 261216256. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:05:55,023][1653645] Updated weights for policy 0, policy_version 510176 (0.0014) [2024-06-15 18:05:55,957][1648982] Fps is (10 sec: 45876.5, 60 sec: 45875.6, 300 sec: 46208.5). Total num frames: 1044905984. Throughput: 0: 11741.9. Samples: 261286912. 
Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:05:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:05:57,003][1653645] Updated weights for policy 0, policy_version 510224 (0.0013) [2024-06-15 18:05:57,952][1653645] Updated weights for policy 0, policy_version 510272 (0.0013) [2024-06-15 18:06:00,671][1651596] Signal inference workers to stop experience collection... (26550 times) [2024-06-15 18:06:00,715][1653645] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-06-15 18:06:00,848][1651596] Signal inference workers to resume experience collection... (26550 times) [2024-06-15 18:06:00,862][1653645] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-06-15 18:06:00,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 47513.8, 300 sec: 46097.4). Total num frames: 1045135360. Throughput: 0: 11685.0. Samples: 261353984. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:06:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:06:02,047][1653645] Updated weights for policy 0, policy_version 510354 (0.0012) [2024-06-15 18:06:05,958][1648982] Fps is (10 sec: 39319.6, 60 sec: 45328.8, 300 sec: 45764.1). Total num frames: 1045299200. Throughput: 0: 11514.2. Samples: 261380608. Policy #0 lag: (min: 1.0, avg: 106.7, max: 257.0) [2024-06-15 18:06:05,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:06:06,411][1653645] Updated weights for policy 0, policy_version 510408 (0.0020) [2024-06-15 18:06:07,616][1653645] Updated weights for policy 0, policy_version 510460 (0.0011) [2024-06-15 18:06:09,399][1653645] Updated weights for policy 0, policy_version 510528 (0.0013) [2024-06-15 18:06:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 1045561344. Throughput: 0: 11355.0. Samples: 261449216. 
Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:06:13,705][1653645] Updated weights for policy 0, policy_version 510608 (0.0015) [2024-06-15 18:06:14,892][1653645] Updated weights for policy 0, policy_version 510656 (0.0011) [2024-06-15 18:06:15,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 45875.2, 300 sec: 45764.2). Total num frames: 1045823488. Throughput: 0: 11309.6. Samples: 261515776. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:15,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:06:18,775][1653645] Updated weights for policy 0, policy_version 510715 (0.0060) [2024-06-15 18:06:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 46967.5, 300 sec: 46652.8). Total num frames: 1046085632. Throughput: 0: 11468.8. Samples: 261559808. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:06:23,212][1653645] Updated weights for policy 0, policy_version 510787 (0.0013) [2024-06-15 18:06:24,303][1653645] Updated weights for policy 0, policy_version 510839 (0.0013) [2024-06-15 18:06:25,700][1653645] Updated weights for policy 0, policy_version 510901 (0.0014) [2024-06-15 18:06:25,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 47513.6, 300 sec: 46541.7). Total num frames: 1046347776. Throughput: 0: 11491.6. Samples: 261626880. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:06:28,983][1653645] Updated weights for policy 0, policy_version 510931 (0.0011) [2024-06-15 18:06:30,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 45875.2, 300 sec: 46208.4). Total num frames: 1046478848. Throughput: 0: 11377.8. Samples: 261696512. 
Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:30,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:06:31,053][1653645] Updated weights for policy 0, policy_version 510980 (0.0013) [2024-06-15 18:06:32,077][1653645] Updated weights for policy 0, policy_version 511040 (0.0021) [2024-06-15 18:06:35,957][1648982] Fps is (10 sec: 39322.3, 60 sec: 45875.3, 300 sec: 46541.8). Total num frames: 1046740992. Throughput: 0: 11503.0. Samples: 261733888. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:06:36,496][1653645] Updated weights for policy 0, policy_version 511122 (0.0014) [2024-06-15 18:06:37,567][1653645] Updated weights for policy 0, policy_version 511164 (0.0016) [2024-06-15 18:06:40,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.9, 300 sec: 45875.2). Total num frames: 1046904832. Throughput: 0: 11343.6. Samples: 261797376. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:06:41,650][1653645] Updated weights for policy 0, policy_version 511219 (0.0015) [2024-06-15 18:06:43,856][1653645] Updated weights for policy 0, policy_version 511280 (0.0025) [2024-06-15 18:06:45,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 44783.0, 300 sec: 46430.6). Total num frames: 1047134208. Throughput: 0: 11423.3. Samples: 261868032. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:06:46,576][1651596] Signal inference workers to stop experience collection... (26600 times) [2024-06-15 18:06:46,622][1653645] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-06-15 18:06:46,802][1651596] Signal inference workers to resume experience collection... 
(26600 times) [2024-06-15 18:06:46,803][1653645] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-06-15 18:06:46,805][1653645] Updated weights for policy 0, policy_version 511328 (0.0013) [2024-06-15 18:06:48,672][1653645] Updated weights for policy 0, policy_version 511392 (0.0012) [2024-06-15 18:06:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44236.8, 300 sec: 45986.4). Total num frames: 1047396352. Throughput: 0: 11537.1. Samples: 261899776. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:06:52,686][1653645] Updated weights for policy 0, policy_version 511456 (0.0139) [2024-06-15 18:06:55,858][1653645] Updated weights for policy 0, policy_version 511549 (0.0095) [2024-06-15 18:06:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.0, 300 sec: 46654.7). Total num frames: 1047658496. Throughput: 0: 11446.0. Samples: 261964288. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:06:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:06:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000511552_1047658496.pth... [2024-06-15 18:06:56,017][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000506112_1036517376.pth [2024-06-15 18:07:00,212][1653645] Updated weights for policy 0, policy_version 511618 (0.0190) [2024-06-15 18:07:00,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 45328.8, 300 sec: 46319.5). Total num frames: 1047855104. Throughput: 0: 11320.8. Samples: 262025216. 
Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:07:00,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:07:04,206][1653645] Updated weights for policy 0, policy_version 511682 (0.0014) [2024-06-15 18:07:05,176][1653645] Updated weights for policy 0, policy_version 511738 (0.0012) [2024-06-15 18:07:05,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.4, 300 sec: 46208.4). Total num frames: 1048051712. Throughput: 0: 11025.0. Samples: 262055936. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:07:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:07:08,054][1653645] Updated weights for policy 0, policy_version 511792 (0.0012) [2024-06-15 18:07:10,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44782.7, 300 sec: 46319.5). Total num frames: 1048248320. Throughput: 0: 11127.4. Samples: 262127616. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:07:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:07:11,989][1653645] Updated weights for policy 0, policy_version 511874 (0.0082) [2024-06-15 18:07:13,128][1653645] Updated weights for policy 0, policy_version 511930 (0.0012) [2024-06-15 18:07:15,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 45875.2). Total num frames: 1048477696. Throughput: 0: 11070.6. Samples: 262194688. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:07:15,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:07:16,660][1653645] Updated weights for policy 0, policy_version 511992 (0.0014) [2024-06-15 18:07:19,324][1653645] Updated weights for policy 0, policy_version 512048 (0.0014) [2024-06-15 18:07:20,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.6, 300 sec: 46652.8). Total num frames: 1048707072. Throughput: 0: 11025.0. Samples: 262230016. 
Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:07:20,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:07:21,496][1653645] Updated weights for policy 0, policy_version 512083 (0.0023) [2024-06-15 18:07:23,347][1653645] Updated weights for policy 0, policy_version 512135 (0.0045) [2024-06-15 18:07:24,680][1653645] Updated weights for policy 0, policy_version 512191 (0.0011) [2024-06-15 18:07:25,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 45986.3). Total num frames: 1048969216. Throughput: 0: 11093.3. Samples: 262296576. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:07:25,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:07:28,055][1653645] Updated weights for policy 0, policy_version 512256 (0.0013) [2024-06-15 18:07:30,960][1648982] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 46430.6). Total num frames: 1049165824. Throughput: 0: 11127.5. Samples: 262368768. Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:07:30,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:07:31,451][1653645] Updated weights for policy 0, policy_version 512311 (0.0014) [2024-06-15 18:07:32,349][1651596] Signal inference workers to stop experience collection... (26650 times) [2024-06-15 18:07:32,375][1653645] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-06-15 18:07:32,549][1651596] Signal inference workers to resume experience collection... (26650 times) [2024-06-15 18:07:32,550][1653645] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-06-15 18:07:33,308][1653645] Updated weights for policy 0, policy_version 512368 (0.0012) [2024-06-15 18:07:35,282][1653645] Updated weights for policy 0, policy_version 512421 (0.0112) [2024-06-15 18:07:35,958][1648982] Fps is (10 sec: 52425.0, 60 sec: 45874.6, 300 sec: 46430.5). Total num frames: 1049493504. Throughput: 0: 11138.7. Samples: 262401024. 
Policy #0 lag: (min: 2.0, avg: 117.5, max: 258.0) [2024-06-15 18:07:35,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:07:39,438][1653645] Updated weights for policy 0, policy_version 512468 (0.0012) [2024-06-15 18:07:40,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 45329.1, 300 sec: 46208.4). Total num frames: 1049624576. Throughput: 0: 11241.3. Samples: 262470144. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:07:40,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 18:07:41,988][1653645] Updated weights for policy 0, policy_version 512528 (0.0013) [2024-06-15 18:07:42,988][1653645] Updated weights for policy 0, policy_version 512569 (0.0011) [2024-06-15 18:07:43,850][1653645] Updated weights for policy 0, policy_version 512594 (0.0010) [2024-06-15 18:07:44,767][1653645] Updated weights for policy 0, policy_version 512638 (0.0071) [2024-06-15 18:07:45,958][1648982] Fps is (10 sec: 39323.3, 60 sec: 45875.0, 300 sec: 46319.5). Total num frames: 1049886720. Throughput: 0: 11389.2. Samples: 262537728. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:07:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:07:47,703][1653645] Updated weights for policy 0, policy_version 512699 (0.0110) [2024-06-15 18:07:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 45764.1). Total num frames: 1050017792. Throughput: 0: 11389.2. Samples: 262568448. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:07:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:07:52,130][1653645] Updated weights for policy 0, policy_version 512767 (0.0111) [2024-06-15 18:07:54,415][1653645] Updated weights for policy 0, policy_version 512827 (0.0013) [2024-06-15 18:07:55,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 44783.0, 300 sec: 46430.6). Total num frames: 1050345472. Throughput: 0: 11229.9. Samples: 262632960. 
Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:07:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:07:56,462][1653645] Updated weights for policy 0, policy_version 512888 (0.0029) [2024-06-15 18:07:59,748][1653645] Updated weights for policy 0, policy_version 512955 (0.0124) [2024-06-15 18:08:00,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44783.1, 300 sec: 45764.2). Total num frames: 1050542080. Throughput: 0: 11264.0. Samples: 262701568. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:08:03,647][1653645] Updated weights for policy 0, policy_version 513016 (0.0123) [2024-06-15 18:08:05,432][1653645] Updated weights for policy 0, policy_version 513058 (0.0013) [2024-06-15 18:08:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 46430.6). Total num frames: 1050771456. Throughput: 0: 11298.2. Samples: 262738432. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:08:06,978][1653645] Updated weights for policy 0, policy_version 513110 (0.0108) [2024-06-15 18:08:08,029][1653645] Updated weights for policy 0, policy_version 513152 (0.0014) [2024-06-15 18:08:10,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 45875.4, 300 sec: 45875.2). Total num frames: 1051000832. Throughput: 0: 11332.3. Samples: 262806528. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:08:11,426][1653645] Updated weights for policy 0, policy_version 513206 (0.0104) [2024-06-15 18:08:14,124][1653645] Updated weights for policy 0, policy_version 513249 (0.0012) [2024-06-15 18:08:15,811][1653645] Updated weights for policy 0, policy_version 513302 (0.0014) [2024-06-15 18:08:15,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 46319.5). Total num frames: 1051230208. 
Throughput: 0: 11446.0. Samples: 262883840. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:08:17,266][1653645] Updated weights for policy 0, policy_version 513360 (0.0016) [2024-06-15 18:08:17,843][1651596] Signal inference workers to stop experience collection... (26700 times) [2024-06-15 18:08:17,872][1653645] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-06-15 18:08:18,073][1651596] Signal inference workers to resume experience collection... (26700 times) [2024-06-15 18:08:18,074][1653645] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-06-15 18:08:18,473][1653645] Updated weights for policy 0, policy_version 513407 (0.0014) [2024-06-15 18:08:20,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.3, 300 sec: 45986.3). Total num frames: 1051459584. Throughput: 0: 11389.3. Samples: 262913536. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:08:22,005][1653645] Updated weights for policy 0, policy_version 513467 (0.0013) [2024-06-15 18:08:25,299][1653645] Updated weights for policy 0, policy_version 513520 (0.0087) [2024-06-15 18:08:25,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 45875.0, 300 sec: 46208.4). Total num frames: 1051721728. Throughput: 0: 11616.6. Samples: 262992896. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:25,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:08:27,403][1653645] Updated weights for policy 0, policy_version 513571 (0.0014) [2024-06-15 18:08:29,539][1653645] Updated weights for policy 0, policy_version 513656 (0.0014) [2024-06-15 18:08:30,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 46967.4, 300 sec: 46208.4). Total num frames: 1051983872. Throughput: 0: 11423.3. Samples: 263051776. 
Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:30,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:08:33,198][1653645] Updated weights for policy 0, policy_version 513712 (0.0014) [2024-06-15 18:08:35,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43691.1, 300 sec: 45764.2). Total num frames: 1052114944. Throughput: 0: 11593.9. Samples: 263090176. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:08:36,759][1653645] Updated weights for policy 0, policy_version 513750 (0.0014) [2024-06-15 18:08:39,534][1653645] Updated weights for policy 0, policy_version 513840 (0.0093) [2024-06-15 18:08:40,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 47513.4, 300 sec: 46430.6). Total num frames: 1052475392. Throughput: 0: 11650.8. Samples: 263157248. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:40,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:08:41,117][1653645] Updated weights for policy 0, policy_version 513915 (0.0014) [2024-06-15 18:08:45,460][1653645] Updated weights for policy 0, policy_version 513983 (0.0014) [2024-06-15 18:08:45,957][1648982] Fps is (10 sec: 52430.2, 60 sec: 45875.5, 300 sec: 45764.1). Total num frames: 1052639232. Throughput: 0: 11548.5. Samples: 263221248. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:08:49,409][1653645] Updated weights for policy 0, policy_version 514045 (0.0012) [2024-06-15 18:08:50,962][1648982] Fps is (10 sec: 32753.7, 60 sec: 46417.8, 300 sec: 46096.6). Total num frames: 1052803072. Throughput: 0: 11604.2. Samples: 263260672. 
Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:50,963][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:08:52,300][1653645] Updated weights for policy 0, policy_version 514112 (0.0013) [2024-06-15 18:08:53,649][1653645] Updated weights for policy 0, policy_version 514173 (0.0101) [2024-06-15 18:08:55,958][1648982] Fps is (10 sec: 39318.7, 60 sec: 44782.5, 300 sec: 45541.9). Total num frames: 1053032448. Throughput: 0: 11263.8. Samples: 263313408. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:08:55,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:08:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000514176_1053032448.pth... [2024-06-15 18:08:56,038][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000508864_1042153472.pth [2024-06-15 18:08:58,053][1653645] Updated weights for policy 0, policy_version 514233 (0.0013) [2024-06-15 18:09:00,958][1648982] Fps is (10 sec: 45895.9, 60 sec: 45329.1, 300 sec: 46097.3). Total num frames: 1053261824. Throughput: 0: 11229.9. Samples: 263389184. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:09:00,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:09:02,924][1653645] Updated weights for policy 0, policy_version 514320 (0.0014) [2024-06-15 18:09:04,118][1651596] Signal inference workers to stop experience collection... (26750 times) [2024-06-15 18:09:04,164][1653645] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-06-15 18:09:04,384][1651596] Signal inference workers to resume experience collection... (26750 times) [2024-06-15 18:09:04,385][1653645] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-06-15 18:09:05,114][1653645] Updated weights for policy 0, policy_version 514416 (0.0081) [2024-06-15 18:09:05,958][1648982] Fps is (10 sec: 52432.1, 60 sec: 46421.3, 300 sec: 45764.1). 
Total num frames: 1053556736. Throughput: 0: 11275.4. Samples: 263420928. Policy #0 lag: (min: 15.0, avg: 125.6, max: 271.0) [2024-06-15 18:09:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:09:09,520][1653645] Updated weights for policy 0, policy_version 514464 (0.0095) [2024-06-15 18:09:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 45764.1). Total num frames: 1053687808. Throughput: 0: 11036.5. Samples: 263489536. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:10,958][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 18:09:12,218][1653645] Updated weights for policy 0, policy_version 514512 (0.0016) [2024-06-15 18:09:13,544][1653645] Updated weights for policy 0, policy_version 514560 (0.0013) [2024-06-15 18:09:15,410][1653645] Updated weights for policy 0, policy_version 514610 (0.0096) [2024-06-15 18:09:15,969][1648982] Fps is (10 sec: 42551.3, 60 sec: 45866.8, 300 sec: 45762.4). Total num frames: 1053982720. Throughput: 0: 11113.4. Samples: 263552000. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:15,970][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:09:16,943][1653645] Updated weights for policy 0, policy_version 514679 (0.0013) [2024-06-15 18:09:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1054081024. Throughput: 0: 11036.5. Samples: 263586816. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:20,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:09:21,525][1653645] Updated weights for policy 0, policy_version 514706 (0.0011) [2024-06-15 18:09:23,348][1653645] Updated weights for policy 0, policy_version 514753 (0.0018) [2024-06-15 18:09:24,614][1653645] Updated weights for policy 0, policy_version 514809 (0.0017) [2024-06-15 18:09:25,958][1648982] Fps is (10 sec: 36083.5, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1054343168. Throughput: 0: 11138.8. 
Samples: 263658496. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:25,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:09:27,205][1653645] Updated weights for policy 0, policy_version 514880 (0.0014) [2024-06-15 18:09:28,481][1653645] Updated weights for policy 0, policy_version 514935 (0.0012) [2024-06-15 18:09:30,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1054605312. Throughput: 0: 11104.7. Samples: 263720960. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:30,958][1648982] Avg episode reward: [(0, '36.830')] [2024-06-15 18:09:33,825][1653645] Updated weights for policy 0, policy_version 514978 (0.0012) [2024-06-15 18:09:35,724][1653645] Updated weights for policy 0, policy_version 515040 (0.0115) [2024-06-15 18:09:35,958][1648982] Fps is (10 sec: 45876.9, 60 sec: 44783.1, 300 sec: 45430.9). Total num frames: 1054801920. Throughput: 0: 11026.2. Samples: 263756800. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:09:38,268][1653645] Updated weights for policy 0, policy_version 515104 (0.0014) [2024-06-15 18:09:39,821][1653645] Updated weights for policy 0, policy_version 515154 (0.0012) [2024-06-15 18:09:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.9, 300 sec: 45764.1). Total num frames: 1055129600. Throughput: 0: 11230.0. Samples: 263818752. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:09:45,907][1653645] Updated weights for policy 0, policy_version 515224 (0.0026) [2024-06-15 18:09:45,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 42052.1, 300 sec: 45097.6). Total num frames: 1055162368. Throughput: 0: 11025.0. Samples: 263885312. 
Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:45,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:09:48,369][1653645] Updated weights for policy 0, policy_version 515282 (0.0012) [2024-06-15 18:09:50,604][1653645] Updated weights for policy 0, policy_version 515348 (0.0014) [2024-06-15 18:09:50,950][1651596] Signal inference workers to stop experience collection... (26800 times) [2024-06-15 18:09:50,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 44240.1, 300 sec: 45097.7). Total num frames: 1055457280. Throughput: 0: 11013.7. Samples: 263916544. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:09:50,994][1653645] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-06-15 18:09:51,233][1651596] Signal inference workers to resume experience collection... (26800 times) [2024-06-15 18:09:51,233][1653645] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-06-15 18:09:52,815][1653645] Updated weights for policy 0, policy_version 515449 (0.0128) [2024-06-15 18:09:55,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43691.0, 300 sec: 45319.8). Total num frames: 1055653888. Throughput: 0: 10797.5. Samples: 263975424. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:09:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:09:57,816][1653645] Updated weights for policy 0, policy_version 515490 (0.0032) [2024-06-15 18:10:00,421][1653645] Updated weights for policy 0, policy_version 515537 (0.0021) [2024-06-15 18:10:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 44986.6). Total num frames: 1055850496. Throughput: 0: 11027.8. Samples: 264048128. 
Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:10:00,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:10:01,440][1653645] Updated weights for policy 0, policy_version 515582 (0.0015) [2024-06-15 18:10:03,376][1653645] Updated weights for policy 0, policy_version 515648 (0.0132) [2024-06-15 18:10:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1056178176. Throughput: 0: 10831.6. Samples: 264074240. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:10:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:10:09,544][1653645] Updated weights for policy 0, policy_version 515729 (0.0026) [2024-06-15 18:10:10,635][1653645] Updated weights for policy 0, policy_version 515774 (0.0011) [2024-06-15 18:10:10,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1056309248. Throughput: 0: 10797.6. Samples: 264144384. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:10:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:10:13,288][1653645] Updated weights for policy 0, policy_version 515824 (0.0016) [2024-06-15 18:10:14,346][1653645] Updated weights for policy 0, policy_version 515856 (0.0013) [2024-06-15 18:10:15,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43698.7, 300 sec: 45208.7). Total num frames: 1056604160. Throughput: 0: 10752.0. Samples: 264204800. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:10:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:10:16,513][1653645] Updated weights for policy 0, policy_version 515952 (0.0017) [2024-06-15 18:10:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1056702464. Throughput: 0: 10638.2. Samples: 264235520. 
Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:10:20,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:10:23,215][1653645] Updated weights for policy 0, policy_version 516026 (0.0013) [2024-06-15 18:10:25,247][1653645] Updated weights for policy 0, policy_version 516088 (0.0018) [2024-06-15 18:10:25,962][1648982] Fps is (10 sec: 36028.0, 60 sec: 43687.5, 300 sec: 44874.8). Total num frames: 1056964608. Throughput: 0: 10864.7. Samples: 264307712. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:10:25,963][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:10:26,513][1653645] Updated weights for policy 0, policy_version 516132 (0.0013) [2024-06-15 18:10:27,692][1653645] Updated weights for policy 0, policy_version 516187 (0.0012) [2024-06-15 18:10:30,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1057226752. Throughput: 0: 10752.0. Samples: 264369152. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:10:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:10:34,111][1653645] Updated weights for policy 0, policy_version 516257 (0.0015) [2024-06-15 18:10:34,747][1653645] Updated weights for policy 0, policy_version 516288 (0.0034) [2024-06-15 18:10:35,958][1648982] Fps is (10 sec: 42618.3, 60 sec: 43144.5, 300 sec: 44542.3). Total num frames: 1057390592. Throughput: 0: 11059.2. Samples: 264414208. Policy #0 lag: (min: 8.0, avg: 111.4, max: 264.0) [2024-06-15 18:10:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:10:36,600][1653645] Updated weights for policy 0, policy_version 516347 (0.0014) [2024-06-15 18:10:37,498][1651596] Signal inference workers to stop experience collection... (26850 times) [2024-06-15 18:10:37,554][1653645] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-06-15 18:10:37,711][1651596] Signal inference workers to resume experience collection... 
(26850 times) [2024-06-15 18:10:37,712][1653645] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-06-15 18:10:38,876][1653645] Updated weights for policy 0, policy_version 516416 (0.0037) [2024-06-15 18:10:40,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.5, 300 sec: 45097.6). Total num frames: 1057751040. Throughput: 0: 10979.5. Samples: 264469504. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:10:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:10:45,503][1653645] Updated weights for policy 0, policy_version 516482 (0.0014) [2024-06-15 18:10:45,959][1648982] Fps is (10 sec: 39315.7, 60 sec: 43689.7, 300 sec: 44208.8). Total num frames: 1057783808. Throughput: 0: 11024.7. Samples: 264544256. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:10:45,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:10:46,763][1653645] Updated weights for policy 0, policy_version 516541 (0.0014) [2024-06-15 18:10:49,773][1653645] Updated weights for policy 0, policy_version 516629 (0.0014) [2024-06-15 18:10:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45329.0, 300 sec: 44986.5). Total num frames: 1058177024. Throughput: 0: 11047.8. Samples: 264571392. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:10:50,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:10:51,468][1653645] Updated weights for policy 0, policy_version 516714 (0.0037) [2024-06-15 18:10:55,966][1648982] Fps is (10 sec: 49116.7, 60 sec: 43684.5, 300 sec: 44540.9). Total num frames: 1058275328. Throughput: 0: 11034.3. Samples: 264641024. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:10:55,967][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:10:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000516736_1058275328.pth... 
[2024-06-15 18:10:56,018][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000511552_1047658496.pth [2024-06-15 18:10:56,061][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000516736_1058275328.pth [2024-06-15 18:10:57,589][1653645] Updated weights for policy 0, policy_version 516752 (0.0020) [2024-06-15 18:10:59,409][1653645] Updated weights for policy 0, policy_version 516802 (0.0012) [2024-06-15 18:11:00,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 1058537472. Throughput: 0: 11127.4. Samples: 264705536. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:11:01,489][1653645] Updated weights for policy 0, policy_version 516867 (0.0061) [2024-06-15 18:11:03,033][1653645] Updated weights for policy 0, policy_version 516930 (0.0013) [2024-06-15 18:11:05,958][1648982] Fps is (10 sec: 52473.8, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1058799616. Throughput: 0: 11059.2. Samples: 264733184. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:11:09,670][1653645] Updated weights for policy 0, policy_version 517024 (0.0013) [2024-06-15 18:11:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 1058930688. Throughput: 0: 11128.5. Samples: 264808448. 
Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:11:12,877][1653645] Updated weights for policy 0, policy_version 517104 (0.0014) [2024-06-15 18:11:14,055][1653645] Updated weights for policy 0, policy_version 517155 (0.0061) [2024-06-15 18:11:15,651][1653645] Updated weights for policy 0, policy_version 517216 (0.0028) [2024-06-15 18:11:15,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1059258368. Throughput: 0: 11070.7. Samples: 264867328. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:15,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:11:20,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1059323904. Throughput: 0: 10843.0. Samples: 264902144. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:20,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:11:21,273][1653645] Updated weights for policy 0, policy_version 517270 (0.0013) [2024-06-15 18:11:22,268][1653645] Updated weights for policy 0, policy_version 517311 (0.0014) [2024-06-15 18:11:22,853][1651596] Signal inference workers to stop experience collection... (26900 times) [2024-06-15 18:11:22,916][1653645] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-06-15 18:11:23,098][1651596] Signal inference workers to resume experience collection... (26900 times) [2024-06-15 18:11:23,118][1653645] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-06-15 18:11:24,149][1653645] Updated weights for policy 0, policy_version 517376 (0.0013) [2024-06-15 18:11:25,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45878.8, 300 sec: 44875.5). Total num frames: 1059717120. Throughput: 0: 11309.6. Samples: 264978432. 
Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:25,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:11:26,245][1653645] Updated weights for policy 0, policy_version 517458 (0.0013) [2024-06-15 18:11:30,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1059848192. Throughput: 0: 11014.1. Samples: 265039872. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:11:33,424][1653645] Updated weights for policy 0, policy_version 517507 (0.0055) [2024-06-15 18:11:34,575][1653645] Updated weights for policy 0, policy_version 517559 (0.0014) [2024-06-15 18:11:35,970][1648982] Fps is (10 sec: 32727.1, 60 sec: 44227.6, 300 sec: 44540.4). Total num frames: 1060044800. Throughput: 0: 11249.5. Samples: 265077760. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:35,971][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:11:36,583][1653645] Updated weights for policy 0, policy_version 517634 (0.0013) [2024-06-15 18:11:38,451][1653645] Updated weights for policy 0, policy_version 517705 (0.0012) [2024-06-15 18:11:39,681][1653645] Updated weights for policy 0, policy_version 517760 (0.0012) [2024-06-15 18:11:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 1060372480. Throughput: 0: 10811.0. Samples: 265127424. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:11:45,958][1648982] Fps is (10 sec: 32808.2, 60 sec: 43145.4, 300 sec: 43986.8). Total num frames: 1060372480. Throughput: 0: 11138.8. Samples: 265206784. 
Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:11:47,456][1653645] Updated weights for policy 0, policy_version 517829 (0.0014) [2024-06-15 18:11:49,609][1653645] Updated weights for policy 0, policy_version 517906 (0.0013) [2024-06-15 18:11:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 1060765696. Throughput: 0: 11127.5. Samples: 265233920. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:11:51,373][1653645] Updated weights for policy 0, policy_version 517984 (0.0013) [2024-06-15 18:11:55,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43696.9, 300 sec: 44209.1). Total num frames: 1060896768. Throughput: 0: 10808.9. Samples: 265294848. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:11:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:11:58,378][1653645] Updated weights for policy 0, policy_version 518032 (0.0069) [2024-06-15 18:11:59,509][1653645] Updated weights for policy 0, policy_version 518080 (0.0013) [2024-06-15 18:12:00,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43144.7, 300 sec: 44320.1). Total num frames: 1061126144. Throughput: 0: 11025.1. Samples: 265363456. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:12:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:12:01,727][1653645] Updated weights for policy 0, policy_version 518176 (0.0159) [2024-06-15 18:12:01,814][1651596] Signal inference workers to stop experience collection... (26950 times) [2024-06-15 18:12:01,866][1653645] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-06-15 18:12:02,018][1651596] Signal inference workers to resume experience collection... 
(26950 times) [2024-06-15 18:12:02,019][1653645] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-06-15 18:12:03,809][1653645] Updated weights for policy 0, policy_version 518272 (0.0012) [2024-06-15 18:12:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.8, 300 sec: 44653.4). Total num frames: 1061421056. Throughput: 0: 10774.8. Samples: 265387008. Policy #0 lag: (min: 3.0, avg: 124.4, max: 259.0) [2024-06-15 18:12:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:12:10,958][1648982] Fps is (10 sec: 36043.5, 60 sec: 42598.4, 300 sec: 44097.9). Total num frames: 1061486592. Throughput: 0: 10797.4. Samples: 265464320. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:12:11,514][1653645] Updated weights for policy 0, policy_version 518336 (0.0025) [2024-06-15 18:12:13,988][1653645] Updated weights for policy 0, policy_version 518432 (0.0085) [2024-06-15 18:12:15,441][1653645] Updated weights for policy 0, policy_version 518500 (0.0012) [2024-06-15 18:12:15,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 44236.8, 300 sec: 44764.5). Total num frames: 1061912576. Throughput: 0: 10581.3. Samples: 265516032. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:12:20,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 1061945344. Throughput: 0: 10607.0. Samples: 265554944. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:12:22,686][1653645] Updated weights for policy 0, policy_version 518544 (0.0015) [2024-06-15 18:12:24,292][1653645] Updated weights for policy 0, policy_version 518608 (0.0013) [2024-06-15 18:12:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 1062273024. 
Throughput: 0: 11070.6. Samples: 265625600. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:12:25,959][1653645] Updated weights for policy 0, policy_version 518690 (0.0014) [2024-06-15 18:12:27,558][1653645] Updated weights for policy 0, policy_version 518752 (0.0126) [2024-06-15 18:12:30,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 43690.7, 300 sec: 43987.0). Total num frames: 1062469632. Throughput: 0: 10638.3. Samples: 265685504. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:12:34,737][1653645] Updated weights for policy 0, policy_version 518802 (0.0036) [2024-06-15 18:12:35,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 42607.3, 300 sec: 43986.9). Total num frames: 1062600704. Throughput: 0: 10945.4. Samples: 265726464. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:12:36,148][1653645] Updated weights for policy 0, policy_version 518868 (0.0012) [2024-06-15 18:12:37,702][1653645] Updated weights for policy 0, policy_version 518930 (0.0169) [2024-06-15 18:12:39,101][1653645] Updated weights for policy 0, policy_version 518992 (0.0012) [2024-06-15 18:12:40,288][1653645] Updated weights for policy 0, policy_version 519036 (0.0012) [2024-06-15 18:12:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1062993920. Throughput: 0: 10820.3. Samples: 265781760. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:12:45,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1062993920. Throughput: 0: 10922.6. Samples: 265854976. 
Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:12:47,801][1651596] Signal inference workers to stop experience collection... (27000 times) [2024-06-15 18:12:47,844][1653645] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-06-15 18:12:47,854][1653645] Updated weights for policy 0, policy_version 519091 (0.0013) [2024-06-15 18:12:48,042][1651596] Signal inference workers to resume experience collection... (27000 times) [2024-06-15 18:12:48,043][1653645] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-06-15 18:12:49,496][1653645] Updated weights for policy 0, policy_version 519168 (0.0110) [2024-06-15 18:12:50,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 1063354368. Throughput: 0: 11127.4. Samples: 265887744. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:12:51,594][1653645] Updated weights for policy 0, policy_version 519248 (0.0105) [2024-06-15 18:12:52,735][1653645] Updated weights for policy 0, policy_version 519293 (0.0013) [2024-06-15 18:12:55,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 1063518208. Throughput: 0: 10638.2. Samples: 265943040. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:12:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:12:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000519296_1063518208.pth... [2024-06-15 18:12:56,005][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000514176_1053032448.pth [2024-06-15 18:13:00,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 42052.2, 300 sec: 43653.6). Total num frames: 1063649280. Throughput: 0: 11104.7. Samples: 266015744. 
Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:13:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:13:01,251][1653645] Updated weights for policy 0, policy_version 519378 (0.0066) [2024-06-15 18:13:03,352][1653645] Updated weights for policy 0, policy_version 519458 (0.0059) [2024-06-15 18:13:05,570][1653645] Updated weights for policy 0, policy_version 519545 (0.0013) [2024-06-15 18:13:05,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1064042496. Throughput: 0: 10797.6. Samples: 266040832. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:13:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:13:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42598.6, 300 sec: 43431.5). Total num frames: 1064042496. Throughput: 0: 10592.7. Samples: 266102272. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:13:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:13:12,865][1653645] Updated weights for policy 0, policy_version 519602 (0.0026) [2024-06-15 18:13:15,073][1653645] Updated weights for policy 0, policy_version 519680 (0.0021) [2024-06-15 18:13:15,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 40413.9, 300 sec: 43653.6). Total num frames: 1064337408. Throughput: 0: 10649.6. Samples: 266164736. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:13:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:13:17,310][1653645] Updated weights for policy 0, policy_version 519765 (0.0174) [2024-06-15 18:13:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.9, 300 sec: 43542.6). Total num frames: 1064566784. Throughput: 0: 10308.3. Samples: 266190336. 
Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:13:20,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:13:23,829][1653645] Updated weights for policy 0, policy_version 519812 (0.0015) [2024-06-15 18:13:25,358][1653645] Updated weights for policy 0, policy_version 519872 (0.0176) [2024-06-15 18:13:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 43209.4). Total num frames: 1064730624. Throughput: 0: 10820.3. Samples: 266268672. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:13:25,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 18:13:27,068][1651596] Signal inference workers to stop experience collection... (27050 times) [2024-06-15 18:13:27,125][1653645] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-06-15 18:13:27,402][1651596] Signal inference workers to resume experience collection... (27050 times) [2024-06-15 18:13:27,404][1653645] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-06-15 18:13:27,629][1653645] Updated weights for policy 0, policy_version 519955 (0.0016) [2024-06-15 18:13:29,076][1653645] Updated weights for policy 0, policy_version 520022 (0.0012) [2024-06-15 18:13:30,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1065091072. Throughput: 0: 10296.9. Samples: 266318336. Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:13:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:13:35,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 41506.1, 300 sec: 42765.0). Total num frames: 1065091072. Throughput: 0: 10478.9. Samples: 266359296. 
Policy #0 lag: (min: 15.0, avg: 68.8, max: 271.0) [2024-06-15 18:13:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:13:36,309][1653645] Updated weights for policy 0, policy_version 520080 (0.0012) [2024-06-15 18:13:37,563][1653645] Updated weights for policy 0, policy_version 520128 (0.0010) [2024-06-15 18:13:40,074][1653645] Updated weights for policy 0, policy_version 520227 (0.0118) [2024-06-15 18:13:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 43653.6). Total num frames: 1065517056. Throughput: 0: 10706.6. Samples: 266424832. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:13:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:13:41,886][1653645] Updated weights for policy 0, policy_version 520320 (0.0103) [2024-06-15 18:13:45,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.8, 300 sec: 43432.2). Total num frames: 1065615360. Throughput: 0: 10535.8. Samples: 266489856. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:13:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:13:50,283][1653645] Updated weights for policy 0, policy_version 520416 (0.0083) [2024-06-15 18:13:50,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 42052.1, 300 sec: 43542.6). Total num frames: 1065877504. Throughput: 0: 10865.7. Samples: 266529792. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:13:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:13:52,237][1653645] Updated weights for policy 0, policy_version 520513 (0.0014) [2024-06-15 18:13:53,634][1653645] Updated weights for policy 0, policy_version 520569 (0.0011) [2024-06-15 18:13:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43691.0, 300 sec: 43653.7). Total num frames: 1066139648. Throughput: 0: 10638.2. Samples: 266580992. 
Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:13:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:14:00,958][1648982] Fps is (10 sec: 32769.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1066205184. Throughput: 0: 11025.1. Samples: 266660864. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:14:01,892][1653645] Updated weights for policy 0, policy_version 520665 (0.0081) [2024-06-15 18:14:03,511][1653645] Updated weights for policy 0, policy_version 520738 (0.0083) [2024-06-15 18:14:04,828][1651596] Signal inference workers to stop experience collection... (27100 times) [2024-06-15 18:14:04,875][1653645] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-06-15 18:14:05,096][1651596] Signal inference workers to resume experience collection... (27100 times) [2024-06-15 18:14:05,097][1653645] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-06-15 18:14:05,912][1653645] Updated weights for policy 0, policy_version 520828 (0.0015) [2024-06-15 18:14:05,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1066663936. Throughput: 0: 10979.6. Samples: 266684416. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:14:10,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 43690.6, 300 sec: 42988.8). Total num frames: 1066663936. Throughput: 0: 10638.2. Samples: 266747392. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:14:13,103][1653645] Updated weights for policy 0, policy_version 520867 (0.0011) [2024-06-15 18:14:14,728][1653645] Updated weights for policy 0, policy_version 520930 (0.0012) [2024-06-15 18:14:15,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 43690.7, 300 sec: 43653.7). 
Total num frames: 1066958848. Throughput: 0: 11059.2. Samples: 266816000. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:14:16,895][1653645] Updated weights for policy 0, policy_version 521024 (0.0014) [2024-06-15 18:14:20,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1067188224. Throughput: 0: 10649.6. Samples: 266838528. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:14:23,882][1653645] Updated weights for policy 0, policy_version 521090 (0.0094) [2024-06-15 18:14:25,796][1653645] Updated weights for policy 0, policy_version 521168 (0.0011) [2024-06-15 18:14:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 1067352064. Throughput: 0: 11013.7. Samples: 266920448. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:25,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 18:14:27,543][1653645] Updated weights for policy 0, policy_version 521238 (0.0013) [2024-06-15 18:14:28,903][1653645] Updated weights for policy 0, policy_version 521312 (0.0013) [2024-06-15 18:14:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1067712512. Throughput: 0: 10752.0. Samples: 266973696. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:14:35,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 1067712512. Throughput: 0: 10729.3. Samples: 267012608. 
Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:35,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 18:14:36,548][1653645] Updated weights for policy 0, policy_version 521376 (0.0012) [2024-06-15 18:14:38,876][1653645] Updated weights for policy 0, policy_version 521473 (0.0026) [2024-06-15 18:14:39,985][1653645] Updated weights for policy 0, policy_version 521521 (0.0014) [2024-06-15 18:14:40,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 44236.6, 300 sec: 44097.9). Total num frames: 1068171264. Throughput: 0: 11036.4. Samples: 267077632. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:40,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:14:41,495][1653645] Updated weights for policy 0, policy_version 521599 (0.0073) [2024-06-15 18:14:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 1068236800. Throughput: 0: 10729.2. Samples: 267143680. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:45,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:14:49,376][1653645] Updated weights for policy 0, policy_version 521662 (0.0118) [2024-06-15 18:14:49,957][1651596] Signal inference workers to stop experience collection... (27150 times) [2024-06-15 18:14:50,005][1653645] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-06-15 18:14:50,251][1651596] Signal inference workers to resume experience collection... (27150 times) [2024-06-15 18:14:50,251][1653645] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-06-15 18:14:50,958][1648982] Fps is (10 sec: 29492.0, 60 sec: 43144.7, 300 sec: 43431.5). Total num frames: 1068466176. Throughput: 0: 11047.8. Samples: 267181568. 
Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:50,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:14:51,068][1653645] Updated weights for policy 0, policy_version 521728 (0.0102) [2024-06-15 18:14:53,192][1653645] Updated weights for policy 0, policy_version 521824 (0.0079) [2024-06-15 18:14:55,958][1648982] Fps is (10 sec: 52426.4, 60 sec: 43690.3, 300 sec: 43764.7). Total num frames: 1068761088. Throughput: 0: 10672.3. Samples: 267227648. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:14:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:14:55,969][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000521856_1068761088.pth... [2024-06-15 18:14:56,023][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000516736_1058275328.pth [2024-06-15 18:15:00,958][1648982] Fps is (10 sec: 29490.4, 60 sec: 42598.1, 300 sec: 42653.9). Total num frames: 1068761088. Throughput: 0: 10888.4. Samples: 267305984. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:15:00,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:15:01,642][1653645] Updated weights for policy 0, policy_version 521904 (0.0013) [2024-06-15 18:15:03,032][1653645] Updated weights for policy 0, policy_version 521952 (0.0014) [2024-06-15 18:15:04,536][1653645] Updated weights for policy 0, policy_version 522001 (0.0024) [2024-06-15 18:15:05,901][1653645] Updated weights for policy 0, policy_version 522064 (0.0012) [2024-06-15 18:15:05,958][1648982] Fps is (10 sec: 42599.9, 60 sec: 42052.2, 300 sec: 43653.6). Total num frames: 1069187072. Throughput: 0: 11013.7. Samples: 267334144. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:15:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:15:10,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 1069285376. Throughput: 0: 10524.5. 
Samples: 267394048. Policy #0 lag: (min: 77.0, avg: 138.1, max: 316.0) [2024-06-15 18:15:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:15:13,463][1653645] Updated weights for policy 0, policy_version 522144 (0.0013) [2024-06-15 18:15:15,679][1653645] Updated weights for policy 0, policy_version 522229 (0.0023) [2024-06-15 18:15:15,961][1648982] Fps is (10 sec: 36031.3, 60 sec: 43141.8, 300 sec: 43542.0). Total num frames: 1069547520. Throughput: 0: 10762.5. Samples: 267458048. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:15,962][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:15:17,389][1653645] Updated weights for policy 0, policy_version 522304 (0.0012) [2024-06-15 18:15:18,662][1653645] Updated weights for policy 0, policy_version 522366 (0.0013) [2024-06-15 18:15:20,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 43543.2). Total num frames: 1069809664. Throughput: 0: 10456.2. Samples: 267483136. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:15:25,958][1648982] Fps is (10 sec: 32780.5, 60 sec: 42052.3, 300 sec: 42876.1). Total num frames: 1069875200. Throughput: 0: 10763.5. Samples: 267561984. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:15:26,401][1653645] Updated weights for policy 0, policy_version 522420 (0.0012) [2024-06-15 18:15:28,288][1653645] Updated weights for policy 0, policy_version 522500 (0.0076) [2024-06-15 18:15:29,369][1651596] Signal inference workers to stop experience collection... (27200 times) [2024-06-15 18:15:29,410][1653645] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-06-15 18:15:29,522][1651596] Signal inference workers to resume experience collection... 
(27200 times) [2024-06-15 18:15:29,523][1653645] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-06-15 18:15:29,957][1653645] Updated weights for policy 0, policy_version 522580 (0.0013) [2024-06-15 18:15:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1070333952. Throughput: 0: 10467.5. Samples: 267614720. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:15:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 42654.0). Total num frames: 1070333952. Throughput: 0: 10479.0. Samples: 267653120. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:15:37,272][1653645] Updated weights for policy 0, policy_version 522642 (0.0013) [2024-06-15 18:15:38,820][1653645] Updated weights for policy 0, policy_version 522707 (0.0013) [2024-06-15 18:15:40,168][1653645] Updated weights for policy 0, policy_version 522768 (0.0013) [2024-06-15 18:15:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42052.4, 300 sec: 43764.9). Total num frames: 1070694400. Throughput: 0: 11047.9. Samples: 267724800. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:15:41,204][1653645] Updated weights for policy 0, policy_version 522818 (0.0135) [2024-06-15 18:15:45,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.5, 300 sec: 42987.2). Total num frames: 1070858240. Throughput: 0: 10638.2. Samples: 267784704. 
Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:45,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:15:49,061][1653645] Updated weights for policy 0, policy_version 522882 (0.0015) [2024-06-15 18:15:50,857][1653645] Updated weights for policy 0, policy_version 522960 (0.0012) [2024-06-15 18:15:50,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 42598.4, 300 sec: 43210.6). Total num frames: 1071022080. Throughput: 0: 10911.3. Samples: 267825152. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:15:51,787][1653645] Updated weights for policy 0, policy_version 523008 (0.0014) [2024-06-15 18:15:54,062][1653645] Updated weights for policy 0, policy_version 523095 (0.0029) [2024-06-15 18:15:54,788][1653645] Updated weights for policy 0, policy_version 523129 (0.0012) [2024-06-15 18:15:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1071382528. Throughput: 0: 10729.2. Samples: 267876864. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:15:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:16:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44237.0, 300 sec: 42765.0). Total num frames: 1071415296. Throughput: 0: 11162.5. Samples: 267960320. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:16:01,514][1653645] Updated weights for policy 0, policy_version 523184 (0.0013) [2024-06-15 18:16:03,308][1653645] Updated weights for policy 0, policy_version 523260 (0.0014) [2024-06-15 18:16:05,704][1653645] Updated weights for policy 0, policy_version 523344 (0.0184) [2024-06-15 18:16:05,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 1071808512. Throughput: 0: 11150.2. Samples: 267984896. 
Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:16:10,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 1071906816. Throughput: 0: 10774.7. Samples: 268046848. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:16:12,516][1651596] Signal inference workers to stop experience collection... (27250 times) [2024-06-15 18:16:12,552][1653645] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-06-15 18:16:12,566][1653645] Updated weights for policy 0, policy_version 523429 (0.0015) [2024-06-15 18:16:12,674][1651596] Signal inference workers to resume experience collection... (27250 times) [2024-06-15 18:16:12,675][1653645] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-06-15 18:16:14,267][1653645] Updated weights for policy 0, policy_version 523488 (0.0012) [2024-06-15 18:16:15,960][1648982] Fps is (10 sec: 36044.5, 60 sec: 43693.4, 300 sec: 43542.6). Total num frames: 1072168960. Throughput: 0: 11252.6. Samples: 268121088. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:15,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:16:16,554][1653645] Updated weights for policy 0, policy_version 523552 (0.0012) [2024-06-15 18:16:18,054][1653645] Updated weights for policy 0, policy_version 523616 (0.0157) [2024-06-15 18:16:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 1072431104. Throughput: 0: 11002.3. Samples: 268148224. 
Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:16:23,316][1653645] Updated weights for policy 0, policy_version 523649 (0.0013) [2024-06-15 18:16:24,770][1653645] Updated weights for policy 0, policy_version 523709 (0.0011) [2024-06-15 18:16:25,706][1653645] Updated weights for policy 0, policy_version 523746 (0.0012) [2024-06-15 18:16:25,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 46421.3, 300 sec: 43431.5). Total num frames: 1072660480. Throughput: 0: 11025.1. Samples: 268220928. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:25,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 18:16:28,441][1653645] Updated weights for policy 0, policy_version 523808 (0.0028) [2024-06-15 18:16:30,681][1653645] Updated weights for policy 0, policy_version 523888 (0.0029) [2024-06-15 18:16:30,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43144.6, 300 sec: 43655.5). Total num frames: 1072922624. Throughput: 0: 11070.7. Samples: 268282880. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:16:35,672][1653645] Updated weights for policy 0, policy_version 523922 (0.0011) [2024-06-15 18:16:35,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44782.9, 300 sec: 42876.1). Total num frames: 1073020928. Throughput: 0: 10934.0. Samples: 268317184. Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:16:37,334][1653645] Updated weights for policy 0, policy_version 524003 (0.0133) [2024-06-15 18:16:40,862][1653645] Updated weights for policy 0, policy_version 524064 (0.0015) [2024-06-15 18:16:40,958][1648982] Fps is (10 sec: 36043.3, 60 sec: 43144.3, 300 sec: 43764.7). Total num frames: 1073283072. Throughput: 0: 11332.2. Samples: 268386816. 
Policy #0 lag: (min: 15.0, avg: 78.5, max: 271.0) [2024-06-15 18:16:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:16:42,811][1653645] Updated weights for policy 0, policy_version 524151 (0.0079) [2024-06-15 18:16:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 1073479680. Throughput: 0: 10854.4. Samples: 268448768. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:16:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:16:47,909][1653645] Updated weights for policy 0, policy_version 524198 (0.0016) [2024-06-15 18:16:48,975][1653645] Updated weights for policy 0, policy_version 524256 (0.0109) [2024-06-15 18:16:50,958][1648982] Fps is (10 sec: 45876.7, 60 sec: 45329.0, 300 sec: 43542.6). Total num frames: 1073741824. Throughput: 0: 11070.6. Samples: 268483072. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:16:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:16:52,003][1653645] Updated weights for policy 0, policy_version 524292 (0.0021) [2024-06-15 18:16:53,434][1651596] Signal inference workers to stop experience collection... (27300 times) [2024-06-15 18:16:53,531][1653645] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-06-15 18:16:53,551][1653645] Updated weights for policy 0, policy_version 524361 (0.0132) [2024-06-15 18:16:53,634][1651596] Signal inference workers to resume experience collection... (27300 times) [2024-06-15 18:16:53,635][1653645] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-06-15 18:16:55,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 1074003968. Throughput: 0: 11093.3. Samples: 268546048. 
Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:16:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:16:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000524416_1074003968.pth... [2024-06-15 18:16:56,019][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000519296_1063518208.pth [2024-06-15 18:16:58,777][1653645] Updated weights for policy 0, policy_version 524432 (0.0015) [2024-06-15 18:16:59,653][1653645] Updated weights for policy 0, policy_version 524477 (0.0017) [2024-06-15 18:17:00,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 46421.4, 300 sec: 43320.4). Total num frames: 1074200576. Throughput: 0: 11104.7. Samples: 268620800. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:17:01,229][1653645] Updated weights for policy 0, policy_version 524538 (0.0011) [2024-06-15 18:17:05,304][1653645] Updated weights for policy 0, policy_version 524609 (0.0110) [2024-06-15 18:17:05,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1074462720. Throughput: 0: 11264.0. Samples: 268655104. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:17:06,554][1653645] Updated weights for policy 0, policy_version 524672 (0.0022) [2024-06-15 18:17:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44782.9, 300 sec: 42987.2). Total num frames: 1074593792. Throughput: 0: 10990.9. Samples: 268715520. 
Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:17:11,502][1653645] Updated weights for policy 0, policy_version 524729 (0.0013) [2024-06-15 18:17:13,066][1653645] Updated weights for policy 0, policy_version 524784 (0.0016) [2024-06-15 18:17:15,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1074790400. Throughput: 0: 11195.7. Samples: 268786688. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:17:17,856][1653645] Updated weights for policy 0, policy_version 524880 (0.0015) [2024-06-15 18:17:18,822][1653645] Updated weights for policy 0, policy_version 524924 (0.0037) [2024-06-15 18:17:20,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.5, 300 sec: 43320.4). Total num frames: 1075052544. Throughput: 0: 10922.6. Samples: 268808704. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:17:23,223][1653645] Updated weights for policy 0, policy_version 524976 (0.0012) [2024-06-15 18:17:24,459][1653645] Updated weights for policy 0, policy_version 525024 (0.0024) [2024-06-15 18:17:25,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 1075314688. Throughput: 0: 10979.6. Samples: 268880896. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:17:28,187][1653645] Updated weights for policy 0, policy_version 525072 (0.0012) [2024-06-15 18:17:30,080][1653645] Updated weights for policy 0, policy_version 525152 (0.0013) [2024-06-15 18:17:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44236.6, 300 sec: 43986.8). Total num frames: 1075576832. Throughput: 0: 11002.3. Samples: 268943872. 
Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:17:33,646][1653645] Updated weights for policy 0, policy_version 525211 (0.0110) [2024-06-15 18:17:35,461][1653645] Updated weights for policy 0, policy_version 525254 (0.0014) [2024-06-15 18:17:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.0, 300 sec: 43209.3). Total num frames: 1075740672. Throughput: 0: 11173.0. Samples: 268985856. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:17:36,835][1653645] Updated weights for policy 0, policy_version 525311 (0.0013) [2024-06-15 18:17:40,183][1651596] Signal inference workers to stop experience collection... (27350 times) [2024-06-15 18:17:40,246][1653645] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-06-15 18:17:40,441][1651596] Signal inference workers to resume experience collection... (27350 times) [2024-06-15 18:17:40,441][1653645] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-06-15 18:17:40,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44783.1, 300 sec: 43986.9). Total num frames: 1075970048. Throughput: 0: 11389.2. Samples: 269058560. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:17:41,388][1653645] Updated weights for policy 0, policy_version 525401 (0.0012) [2024-06-15 18:17:42,327][1653645] Updated weights for policy 0, policy_version 525439 (0.0012) [2024-06-15 18:17:45,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45329.0, 300 sec: 43542.6). Total num frames: 1076199424. Throughput: 0: 11104.7. Samples: 269120512. 
Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:17:46,034][1653645] Updated weights for policy 0, policy_version 525503 (0.0055) [2024-06-15 18:17:48,525][1653645] Updated weights for policy 0, policy_version 525563 (0.0014) [2024-06-15 18:17:50,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1076363264. Throughput: 0: 11047.8. Samples: 269152256. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:17:52,341][1653645] Updated weights for policy 0, policy_version 525616 (0.0034) [2024-06-15 18:17:53,874][1653645] Updated weights for policy 0, policy_version 525680 (0.0013) [2024-06-15 18:17:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1076625408. Throughput: 0: 11104.7. Samples: 269215232. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:17:55,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:17:57,611][1653645] Updated weights for policy 0, policy_version 525749 (0.0014) [2024-06-15 18:17:59,550][1653645] Updated weights for policy 0, policy_version 525792 (0.0046) [2024-06-15 18:18:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 1076887552. Throughput: 0: 11013.7. Samples: 269282304. Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:18:00,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:18:04,299][1653645] Updated weights for policy 0, policy_version 525856 (0.0013) [2024-06-15 18:18:05,385][1653645] Updated weights for policy 0, policy_version 525910 (0.0082) [2024-06-15 18:18:05,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 44236.6, 300 sec: 44320.1). Total num frames: 1077116928. Throughput: 0: 11377.8. Samples: 269320704. 
Policy #0 lag: (min: 79.0, avg: 173.0, max: 335.0) [2024-06-15 18:18:05,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:18:08,597][1653645] Updated weights for policy 0, policy_version 525968 (0.0013) [2024-06-15 18:18:09,538][1653645] Updated weights for policy 0, policy_version 526007 (0.0011) [2024-06-15 18:18:10,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 45875.1, 300 sec: 44097.9). Total num frames: 1077346304. Throughput: 0: 11275.4. Samples: 269388288. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:10,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:18:11,033][1653645] Updated weights for policy 0, policy_version 526049 (0.0014) [2024-06-15 18:18:15,590][1653645] Updated weights for policy 0, policy_version 526112 (0.0013) [2024-06-15 18:18:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45328.9, 300 sec: 43875.7). Total num frames: 1077510144. Throughput: 0: 11457.4. Samples: 269459456. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:15,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:18:17,267][1653645] Updated weights for policy 0, policy_version 526196 (0.0013) [2024-06-15 18:18:20,708][1653645] Updated weights for policy 0, policy_version 526256 (0.0011) [2024-06-15 18:18:20,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 45329.2, 300 sec: 44209.0). Total num frames: 1077772288. Throughput: 0: 11150.2. Samples: 269487616. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:18:21,781][1653645] Updated weights for policy 0, policy_version 526276 (0.0011) [2024-06-15 18:18:23,115][1653645] Updated weights for policy 0, policy_version 526336 (0.0011) [2024-06-15 18:18:25,958][1648982] Fps is (10 sec: 42599.8, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1077936128. Throughput: 0: 11104.7. Samples: 269558272. 
Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:18:26,718][1651596] Signal inference workers to stop experience collection... (27400 times) [2024-06-15 18:18:26,759][1653645] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-06-15 18:18:26,885][1651596] Signal inference workers to resume experience collection... (27400 times) [2024-06-15 18:18:26,886][1653645] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-06-15 18:18:28,242][1653645] Updated weights for policy 0, policy_version 526418 (0.0013) [2024-06-15 18:18:29,258][1653645] Updated weights for policy 0, policy_version 526464 (0.0069) [2024-06-15 18:18:30,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1078198272. Throughput: 0: 11241.2. Samples: 269626368. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:30,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:18:32,023][1653645] Updated weights for policy 0, policy_version 526523 (0.0013) [2024-06-15 18:18:34,309][1653645] Updated weights for policy 0, policy_version 526581 (0.0016) [2024-06-15 18:18:35,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45329.2, 300 sec: 43875.8). Total num frames: 1078460416. Throughput: 0: 11309.5. Samples: 269661184. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:18:38,624][1653645] Updated weights for policy 0, policy_version 526614 (0.0013) [2024-06-15 18:18:40,030][1653645] Updated weights for policy 0, policy_version 526688 (0.0015) [2024-06-15 18:18:40,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1078722560. Throughput: 0: 11423.3. Samples: 269729280. 
Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:40,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 18:18:42,773][1653645] Updated weights for policy 0, policy_version 526737 (0.0012) [2024-06-15 18:18:43,857][1653645] Updated weights for policy 0, policy_version 526784 (0.0014) [2024-06-15 18:18:45,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45875.2, 300 sec: 44320.2). Total num frames: 1078951936. Throughput: 0: 11389.2. Samples: 269794816. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:18:46,108][1653645] Updated weights for policy 0, policy_version 526841 (0.0017) [2024-06-15 18:18:49,672][1653645] Updated weights for policy 0, policy_version 526868 (0.0012) [2024-06-15 18:18:50,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 45875.0, 300 sec: 43986.8). Total num frames: 1079115776. Throughput: 0: 11332.3. Samples: 269830656. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:50,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:18:51,395][1653645] Updated weights for policy 0, policy_version 526928 (0.0016) [2024-06-15 18:18:52,390][1653645] Updated weights for policy 0, policy_version 526973 (0.0016) [2024-06-15 18:18:55,245][1653645] Updated weights for policy 0, policy_version 527037 (0.0024) [2024-06-15 18:18:55,958][1648982] Fps is (10 sec: 42596.9, 60 sec: 45875.0, 300 sec: 44653.3). Total num frames: 1079377920. Throughput: 0: 11400.5. Samples: 269901312. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:18:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:18:55,971][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000527040_1079377920.pth... 
[2024-06-15 18:18:56,067][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000521856_1068761088.pth [2024-06-15 18:18:57,507][1653645] Updated weights for policy 0, policy_version 527096 (0.0012) [2024-06-15 18:19:00,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 1079508992. Throughput: 0: 11286.8. Samples: 269967360. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:19:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:19:03,216][1653645] Updated weights for policy 0, policy_version 527169 (0.0014) [2024-06-15 18:19:04,294][1653645] Updated weights for policy 0, policy_version 527231 (0.0034) [2024-06-15 18:19:05,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1079771136. Throughput: 0: 11286.7. Samples: 269995520. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:19:05,959][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 18:19:07,472][1653645] Updated weights for policy 0, policy_version 527293 (0.0014) [2024-06-15 18:19:08,649][1651596] Signal inference workers to stop experience collection... (27450 times) [2024-06-15 18:19:08,712][1653645] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-06-15 18:19:08,950][1651596] Signal inference workers to resume experience collection... (27450 times) [2024-06-15 18:19:08,951][1653645] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-06-15 18:19:09,065][1653645] Updated weights for policy 0, policy_version 527348 (0.0091) [2024-06-15 18:19:10,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 1080033280. Throughput: 0: 11127.4. Samples: 270059008. 
Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:19:10,958][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 18:19:13,851][1653645] Updated weights for policy 0, policy_version 527392 (0.0018) [2024-06-15 18:19:15,688][1653645] Updated weights for policy 0, policy_version 527472 (0.0013) [2024-06-15 18:19:15,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45875.3, 300 sec: 44320.1). Total num frames: 1080262656. Throughput: 0: 11161.6. Samples: 270128640. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:19:15,958][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 18:19:19,320][1653645] Updated weights for policy 0, policy_version 527536 (0.0153) [2024-06-15 18:19:20,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 1080524800. Throughput: 0: 11150.2. Samples: 270162944. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:19:20,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 18:19:25,958][1648982] Fps is (10 sec: 29490.9, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1080557568. Throughput: 0: 10979.5. Samples: 270223360. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:19:25,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:19:25,983][1653645] Updated weights for policy 0, policy_version 527619 (0.0015) [2024-06-15 18:19:27,549][1653645] Updated weights for policy 0, policy_version 527684 (0.0011) [2024-06-15 18:19:30,575][1653645] Updated weights for policy 0, policy_version 527749 (0.0013) [2024-06-15 18:19:30,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 1080852480. Throughput: 0: 10979.6. Samples: 270288896. 
Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:19:30,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 18:19:32,467][1653645] Updated weights for policy 0, policy_version 527824 (0.0013) [2024-06-15 18:19:35,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.6, 300 sec: 43764.8). Total num frames: 1081081856. Throughput: 0: 10809.0. Samples: 270317056. Policy #0 lag: (min: 31.0, avg: 168.5, max: 287.0) [2024-06-15 18:19:35,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:19:38,055][1653645] Updated weights for policy 0, policy_version 527889 (0.0014) [2024-06-15 18:19:40,150][1653645] Updated weights for policy 0, policy_version 527970 (0.0013) [2024-06-15 18:19:40,983][1648982] Fps is (10 sec: 49031.2, 60 sec: 43672.8, 300 sec: 44427.5). Total num frames: 1081344000. Throughput: 0: 10871.3. Samples: 270390784. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:19:40,985][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:19:42,883][1653645] Updated weights for policy 0, policy_version 528048 (0.0162) [2024-06-15 18:19:44,824][1653645] Updated weights for policy 0, policy_version 528096 (0.0013) [2024-06-15 18:19:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1081606144. Throughput: 0: 10649.6. Samples: 270446592. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:19:45,958][1648982] Avg episode reward: [(0, '36.780')] [2024-06-15 18:19:49,759][1653645] Updated weights for policy 0, policy_version 528144 (0.0030) [2024-06-15 18:19:50,958][1648982] Fps is (10 sec: 36134.0, 60 sec: 43144.8, 300 sec: 43875.9). Total num frames: 1081704448. Throughput: 0: 11036.5. Samples: 270492160. 
Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:19:50,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:19:51,581][1653645] Updated weights for policy 0, policy_version 528211 (0.0013) [2024-06-15 18:19:53,273][1653645] Updated weights for policy 0, policy_version 528280 (0.0012) [2024-06-15 18:19:53,559][1651596] Signal inference workers to stop experience collection... (27500 times) [2024-06-15 18:19:53,602][1653645] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-06-15 18:19:53,784][1651596] Signal inference workers to resume experience collection... (27500 times) [2024-06-15 18:19:53,790][1653645] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-06-15 18:19:55,702][1653645] Updated weights for policy 0, policy_version 528339 (0.0013) [2024-06-15 18:19:55,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44783.1, 300 sec: 45097.7). Total num frames: 1082064896. Throughput: 0: 11047.8. Samples: 270556160. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:19:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:20:00,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 1082130432. Throughput: 0: 11264.0. Samples: 270635520. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:00,960][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:20:01,610][1653645] Updated weights for policy 0, policy_version 528400 (0.0012) [2024-06-15 18:20:04,454][1653645] Updated weights for policy 0, policy_version 528512 (0.0228) [2024-06-15 18:20:05,639][1653645] Updated weights for policy 0, policy_version 528572 (0.0013) [2024-06-15 18:20:05,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 1082523648. Throughput: 0: 11116.1. Samples: 270663168. 
Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:05,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:20:10,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.5, 300 sec: 44431.7). Total num frames: 1082654720. Throughput: 0: 11013.7. Samples: 270718976. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:10,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 18:20:14,290][1653645] Updated weights for policy 0, policy_version 528672 (0.0116) [2024-06-15 18:20:15,678][1653645] Updated weights for policy 0, policy_version 528736 (0.0015) [2024-06-15 18:20:15,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 43144.6, 300 sec: 44209.0). Total num frames: 1082851328. Throughput: 0: 11275.4. Samples: 270796288. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:20:17,297][1653645] Updated weights for policy 0, policy_version 528801 (0.0085) [2024-06-15 18:20:18,511][1653645] Updated weights for policy 0, policy_version 528838 (0.0065) [2024-06-15 18:20:19,379][1653645] Updated weights for policy 0, policy_version 528891 (0.0017) [2024-06-15 18:20:20,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 44236.9, 300 sec: 45097.7). Total num frames: 1083179008. Throughput: 0: 11207.1. Samples: 270821376. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:20,960][1648982] Avg episode reward: [(0, '37.080')] [2024-06-15 18:20:25,677][1653645] Updated weights for policy 0, policy_version 528948 (0.0040) [2024-06-15 18:20:25,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.4, 300 sec: 43986.9). Total num frames: 1083310080. Throughput: 0: 11497.9. Samples: 270907904. 
Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:25,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:20:27,072][1653645] Updated weights for policy 0, policy_version 529010 (0.0129) [2024-06-15 18:20:28,898][1653645] Updated weights for policy 0, policy_version 529079 (0.0031) [2024-06-15 18:20:30,083][1653645] Updated weights for policy 0, policy_version 529111 (0.0019) [2024-06-15 18:20:30,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 46967.5, 300 sec: 45208.7). Total num frames: 1083670528. Throughput: 0: 11434.7. Samples: 270961152. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:30,958][1648982] Avg episode reward: [(0, '36.990')] [2024-06-15 18:20:30,993][1653645] Updated weights for policy 0, policy_version 529152 (0.0043) [2024-06-15 18:20:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1083703296. Throughput: 0: 11252.6. Samples: 270998528. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:35,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:20:36,894][1653645] Updated weights for policy 0, policy_version 529200 (0.0015) [2024-06-15 18:20:37,455][1651596] Signal inference workers to stop experience collection... (27550 times) [2024-06-15 18:20:37,655][1653645] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-06-15 18:20:37,693][1651596] Signal inference workers to resume experience collection... (27550 times) [2024-06-15 18:20:37,696][1653645] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-06-15 18:20:38,186][1653645] Updated weights for policy 0, policy_version 529248 (0.0013) [2024-06-15 18:20:39,563][1653645] Updated weights for policy 0, policy_version 529298 (0.0013) [2024-06-15 18:20:40,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 46440.4, 300 sec: 44986.6). Total num frames: 1084129280. Throughput: 0: 11559.9. Samples: 271076352. 
Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:20:41,386][1653645] Updated weights for policy 0, policy_version 529378 (0.0023) [2024-06-15 18:20:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1084227584. Throughput: 0: 11275.4. Samples: 271142912. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:20:49,400][1653645] Updated weights for policy 0, policy_version 529472 (0.0013) [2024-06-15 18:20:50,959][1648982] Fps is (10 sec: 39321.4, 60 sec: 46967.4, 300 sec: 44542.3). Total num frames: 1084522496. Throughput: 0: 11571.2. Samples: 271183872. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:50,960][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 18:20:52,094][1653645] Updated weights for policy 0, policy_version 529600 (0.0019) [2024-06-15 18:20:53,785][1653645] Updated weights for policy 0, policy_version 529662 (0.0029) [2024-06-15 18:20:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44783.0, 300 sec: 45208.7). Total num frames: 1084751872. Throughput: 0: 11298.2. Samples: 271227392. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:20:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:20:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000529664_1084751872.pth... [2024-06-15 18:20:56,088][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000524416_1074003968.pth [2024-06-15 18:21:00,497][1653645] Updated weights for policy 0, policy_version 529697 (0.0030) [2024-06-15 18:21:00,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 1084850176. Throughput: 0: 11525.7. Samples: 271314944. 
Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:21:00,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 18:21:02,263][1653645] Updated weights for policy 0, policy_version 529778 (0.0012) [2024-06-15 18:21:04,027][1653645] Updated weights for policy 0, policy_version 529842 (0.0012) [2024-06-15 18:21:05,956][1653645] Updated weights for policy 0, policy_version 529910 (0.0014) [2024-06-15 18:21:05,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 1085243392. Throughput: 0: 11491.5. Samples: 271338496. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:21:05,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 18:21:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1085276160. Throughput: 0: 11093.3. Samples: 271407104. Policy #0 lag: (min: 78.0, avg: 153.3, max: 335.0) [2024-06-15 18:21:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:21:12,404][1653645] Updated weights for policy 0, policy_version 529953 (0.0022) [2024-06-15 18:21:14,073][1653645] Updated weights for policy 0, policy_version 530032 (0.0012) [2024-06-15 18:21:14,994][1651596] Signal inference workers to stop experience collection... (27600 times) [2024-06-15 18:21:15,026][1653645] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-06-15 18:21:15,252][1651596] Signal inference workers to resume experience collection... (27600 times) [2024-06-15 18:21:15,253][1653645] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-06-15 18:21:15,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1085636608. Throughput: 0: 11366.4. Samples: 271472640. 
Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:15,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 18:21:16,114][1653645] Updated weights for policy 0, policy_version 530112 (0.0130) [2024-06-15 18:21:17,683][1653645] Updated weights for policy 0, policy_version 530176 (0.0012) [2024-06-15 18:21:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1085800448. Throughput: 0: 11047.8. Samples: 271495680. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:20,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:21:25,254][1653645] Updated weights for policy 0, policy_version 530244 (0.0012) [2024-06-15 18:21:25,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 44782.6, 300 sec: 44320.0). Total num frames: 1085997056. Throughput: 0: 10979.5. Samples: 271570432. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:25,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:21:26,891][1653645] Updated weights for policy 0, policy_version 530306 (0.0012) [2024-06-15 18:21:28,753][1653645] Updated weights for policy 0, policy_version 530372 (0.0012) [2024-06-15 18:21:30,056][1653645] Updated weights for policy 0, policy_version 530432 (0.0013) [2024-06-15 18:21:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44236.8, 300 sec: 45097.6). Total num frames: 1086324736. Throughput: 0: 10706.5. Samples: 271624704. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:30,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:21:35,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1086390272. Throughput: 0: 10683.7. Samples: 271664640. 
Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:35,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 18:21:36,596][1653645] Updated weights for policy 0, policy_version 530496 (0.0021) [2024-06-15 18:21:37,951][1653645] Updated weights for policy 0, policy_version 530546 (0.0010) [2024-06-15 18:21:39,521][1653645] Updated weights for policy 0, policy_version 530610 (0.0015) [2024-06-15 18:21:40,964][1648982] Fps is (10 sec: 42571.5, 60 sec: 43686.1, 300 sec: 44985.6). Total num frames: 1086750720. Throughput: 0: 11182.8. Samples: 271730688. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:40,969][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:21:41,299][1653645] Updated weights for policy 0, policy_version 530661 (0.0012) [2024-06-15 18:21:45,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 1086849024. Throughput: 0: 10729.2. Samples: 271797760. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:45,959][1648982] Avg episode reward: [(0, '37.100')] [2024-06-15 18:21:48,083][1653645] Updated weights for policy 0, policy_version 530709 (0.0013) [2024-06-15 18:21:49,429][1653645] Updated weights for policy 0, policy_version 530755 (0.0022) [2024-06-15 18:21:50,958][1648982] Fps is (10 sec: 36067.4, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 1087111168. Throughput: 0: 10922.7. Samples: 271830016. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:21:51,201][1653645] Updated weights for policy 0, policy_version 530832 (0.0053) [2024-06-15 18:21:52,365][1653645] Updated weights for policy 0, policy_version 530879 (0.0013) [2024-06-15 18:21:53,949][1653645] Updated weights for policy 0, policy_version 530942 (0.0012) [2024-06-15 18:21:55,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 1087373312. 
Throughput: 0: 10672.3. Samples: 271887360. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:21:55,958][1648982] Avg episode reward: [(0, '37.010')] [2024-06-15 18:22:00,655][1653645] Updated weights for policy 0, policy_version 531006 (0.0013) [2024-06-15 18:22:00,957][1648982] Fps is (10 sec: 39322.3, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1087504384. Throughput: 0: 10797.6. Samples: 271958528. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:00,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:22:02,076][1651596] Signal inference workers to stop experience collection... (27650 times) [2024-06-15 18:22:02,162][1653645] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-06-15 18:22:02,334][1651596] Signal inference workers to resume experience collection... (27650 times) [2024-06-15 18:22:02,371][1653645] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-06-15 18:22:03,412][1653645] Updated weights for policy 0, policy_version 531060 (0.0022) [2024-06-15 18:22:05,836][1653645] Updated weights for policy 0, policy_version 531168 (0.0013) [2024-06-15 18:22:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 44875.5). Total num frames: 1087832064. Throughput: 0: 10979.6. Samples: 271989760. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:22:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1087897600. Throughput: 0: 10638.3. Samples: 272049152. 
Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:22:12,207][1653645] Updated weights for policy 0, policy_version 531220 (0.0017) [2024-06-15 18:22:15,019][1653645] Updated weights for policy 0, policy_version 531282 (0.0013) [2024-06-15 18:22:15,958][1648982] Fps is (10 sec: 32767.0, 60 sec: 42052.1, 300 sec: 44431.2). Total num frames: 1088159744. Throughput: 0: 10956.7. Samples: 272117760. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:15,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:22:16,633][1653645] Updated weights for policy 0, policy_version 531352 (0.0015) [2024-06-15 18:22:18,354][1653645] Updated weights for policy 0, policy_version 531424 (0.0089) [2024-06-15 18:22:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1088421888. Throughput: 0: 10592.7. Samples: 272141312. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:22:23,747][1653645] Updated weights for policy 0, policy_version 531460 (0.0012) [2024-06-15 18:22:25,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 42598.7, 300 sec: 43986.9). Total num frames: 1088552960. Throughput: 0: 10742.1. Samples: 272214016. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:22:27,587][1653645] Updated weights for policy 0, policy_version 531552 (0.0077) [2024-06-15 18:22:29,640][1653645] Updated weights for policy 0, policy_version 531634 (0.0013) [2024-06-15 18:22:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 44542.3). Total num frames: 1088880640. Throughput: 0: 10524.5. Samples: 272271360. 
Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:22:31,390][1653645] Updated weights for policy 0, policy_version 531712 (0.0017) [2024-06-15 18:22:35,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 1088978944. Throughput: 0: 10524.4. Samples: 272303616. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:22:36,779][1653645] Updated weights for policy 0, policy_version 531769 (0.0014) [2024-06-15 18:22:39,701][1653645] Updated weights for policy 0, policy_version 531809 (0.0018) [2024-06-15 18:22:40,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 41510.5, 300 sec: 44209.0). Total num frames: 1089241088. Throughput: 0: 10979.6. Samples: 272381440. Policy #0 lag: (min: 9.0, avg: 68.3, max: 265.0) [2024-06-15 18:22:40,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 18:22:41,430][1653645] Updated weights for policy 0, policy_version 531875 (0.0016) [2024-06-15 18:22:42,165][1651596] Signal inference workers to stop experience collection... (27700 times) [2024-06-15 18:22:42,262][1653645] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-06-15 18:22:42,521][1651596] Signal inference workers to resume experience collection... (27700 times) [2024-06-15 18:22:42,523][1653645] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-06-15 18:22:43,547][1653645] Updated weights for policy 0, policy_version 531963 (0.0092) [2024-06-15 18:22:45,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1089470464. Throughput: 0: 10569.9. Samples: 272434176. 
Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:22:45,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:22:49,146][1653645] Updated weights for policy 0, policy_version 532032 (0.0021) [2024-06-15 18:22:50,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 1089601536. Throughput: 0: 10763.4. Samples: 272474112. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:22:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:22:53,315][1653645] Updated weights for policy 0, policy_version 532115 (0.0120) [2024-06-15 18:22:55,199][1653645] Updated weights for policy 0, policy_version 532193 (0.0079) [2024-06-15 18:22:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1089994752. Throughput: 0: 10672.3. Samples: 272529408. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:22:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:22:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000532224_1089994752.pth... [2024-06-15 18:22:56,016][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000527040_1079377920.pth [2024-06-15 18:23:00,393][1653645] Updated weights for policy 0, policy_version 532259 (0.0025) [2024-06-15 18:23:00,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1090125824. Throughput: 0: 10763.5. Samples: 272602112. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:00,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:23:03,992][1653645] Updated weights for policy 0, policy_version 532323 (0.0012) [2024-06-15 18:23:05,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42052.2, 300 sec: 44098.0). Total num frames: 1090355200. Throughput: 0: 11150.2. Samples: 272643072. 
Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:05,959][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 18:23:06,212][1653645] Updated weights for policy 0, policy_version 532416 (0.0016) [2024-06-15 18:23:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1090519040. Throughput: 0: 10717.9. Samples: 272696320. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:23:11,383][1653645] Updated weights for policy 0, policy_version 532496 (0.0012) [2024-06-15 18:23:15,743][1653645] Updated weights for policy 0, policy_version 532546 (0.0011) [2024-06-15 18:23:15,958][1648982] Fps is (10 sec: 32768.6, 60 sec: 42052.5, 300 sec: 43764.7). Total num frames: 1090682880. Throughput: 0: 11127.5. Samples: 272772096. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:23:18,011][1653645] Updated weights for policy 0, policy_version 532645 (0.0019) [2024-06-15 18:23:19,367][1653645] Updated weights for policy 0, policy_version 532704 (0.0026) [2024-06-15 18:23:20,059][1653645] Updated weights for policy 0, policy_version 532736 (0.0011) [2024-06-15 18:23:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1091043328. Throughput: 0: 10922.7. Samples: 272795136. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:23:25,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1091174400. Throughput: 0: 10786.2. Samples: 272866816. 
Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:23:27,893][1653645] Updated weights for policy 0, policy_version 532832 (0.0013) [2024-06-15 18:23:27,991][1651596] Signal inference workers to stop experience collection... (27750 times) [2024-06-15 18:23:28,034][1653645] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-06-15 18:23:28,232][1651596] Signal inference workers to resume experience collection... (27750 times) [2024-06-15 18:23:28,233][1653645] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-06-15 18:23:29,557][1653645] Updated weights for policy 0, policy_version 532898 (0.0013) [2024-06-15 18:23:30,968][1648982] Fps is (10 sec: 42555.2, 60 sec: 43137.2, 300 sec: 44096.4). Total num frames: 1091469312. Throughput: 0: 11125.0. Samples: 272934912. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:30,969][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 18:23:31,611][1653645] Updated weights for policy 0, policy_version 532981 (0.0043) [2024-06-15 18:23:35,282][1653645] Updated weights for policy 0, policy_version 533029 (0.0011) [2024-06-15 18:23:35,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1091698688. Throughput: 0: 10911.3. Samples: 272965120. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:35,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:23:38,613][1653645] Updated weights for policy 0, policy_version 533072 (0.0011) [2024-06-15 18:23:40,958][1648982] Fps is (10 sec: 42640.9, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 1091895296. Throughput: 0: 11423.3. Samples: 273043456. 
Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:40,959][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 18:23:41,117][1653645] Updated weights for policy 0, policy_version 533156 (0.0013) [2024-06-15 18:23:43,082][1653645] Updated weights for policy 0, policy_version 533233 (0.0016) [2024-06-15 18:23:45,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1092091904. Throughput: 0: 11059.2. Samples: 273099776. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:23:47,088][1653645] Updated weights for policy 0, policy_version 533303 (0.0080) [2024-06-15 18:23:50,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44782.8, 300 sec: 43764.7). Total num frames: 1092288512. Throughput: 0: 11002.3. Samples: 273138176. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:50,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:23:51,031][1653645] Updated weights for policy 0, policy_version 533347 (0.0014) [2024-06-15 18:23:53,639][1653645] Updated weights for policy 0, policy_version 533440 (0.0039) [2024-06-15 18:23:55,005][1653645] Updated weights for policy 0, policy_version 533504 (0.0013) [2024-06-15 18:23:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1092616192. Throughput: 0: 11059.2. Samples: 273193984. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:23:55,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 18:23:59,463][1653645] Updated weights for policy 0, policy_version 533560 (0.0021) [2024-06-15 18:24:00,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1092747264. Throughput: 0: 11025.1. Samples: 273268224. 
Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:24:00,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:24:03,514][1653645] Updated weights for policy 0, policy_version 533616 (0.0013) [2024-06-15 18:24:04,625][1653645] Updated weights for policy 0, policy_version 533652 (0.0024) [2024-06-15 18:24:05,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44783.1, 300 sec: 44098.0). Total num frames: 1093042176. Throughput: 0: 11366.4. Samples: 273306624. Policy #0 lag: (min: 60.0, avg: 150.5, max: 338.0) [2024-06-15 18:24:05,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 18:24:06,375][1651596] Signal inference workers to stop experience collection... (27800 times) [2024-06-15 18:24:06,474][1653645] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-06-15 18:24:06,574][1651596] Signal inference workers to resume experience collection... (27800 times) [2024-06-15 18:24:06,574][1653645] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-06-15 18:24:06,576][1653645] Updated weights for policy 0, policy_version 533744 (0.0013) [2024-06-15 18:24:10,585][1653645] Updated weights for policy 0, policy_version 533792 (0.0025) [2024-06-15 18:24:10,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 1093206016. Throughput: 0: 11173.0. Samples: 273369600. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:10,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 18:24:14,559][1653645] Updated weights for policy 0, policy_version 533856 (0.0035) [2024-06-15 18:24:15,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 45875.0, 300 sec: 43764.7). Total num frames: 1093435392. Throughput: 0: 11277.9. Samples: 273442304. 
Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:15,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 18:24:16,811][1653645] Updated weights for policy 0, policy_version 533952 (0.0012) [2024-06-15 18:24:18,304][1653645] Updated weights for policy 0, policy_version 534014 (0.0011) [2024-06-15 18:24:20,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1093664768. Throughput: 0: 11070.6. Samples: 273463296. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:24:22,528][1653645] Updated weights for policy 0, policy_version 534070 (0.0013) [2024-06-15 18:24:25,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 1093828608. Throughput: 0: 11093.4. Samples: 273542656. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:25,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:24:26,951][1653645] Updated weights for policy 0, policy_version 534144 (0.0141) [2024-06-15 18:24:28,336][1653645] Updated weights for policy 0, policy_version 534208 (0.0013) [2024-06-15 18:24:29,624][1653645] Updated weights for policy 0, policy_version 534272 (0.0028) [2024-06-15 18:24:30,965][1648982] Fps is (10 sec: 52392.5, 60 sec: 45331.6, 300 sec: 44430.2). Total num frames: 1094189056. Throughput: 0: 11159.9. Samples: 273602048. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:30,966][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:24:33,930][1653645] Updated weights for policy 0, policy_version 534336 (0.0094) [2024-06-15 18:24:35,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.7, 300 sec: 43990.5). Total num frames: 1094320128. Throughput: 0: 11059.3. Samples: 273635840. 
Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:35,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 18:24:38,310][1653645] Updated weights for policy 0, policy_version 534374 (0.0014) [2024-06-15 18:24:40,031][1653645] Updated weights for policy 0, policy_version 534450 (0.0013) [2024-06-15 18:24:40,959][1648982] Fps is (10 sec: 45906.4, 60 sec: 45875.3, 300 sec: 44209.0). Total num frames: 1094647808. Throughput: 0: 11468.8. Samples: 273710080. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:40,960][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:24:41,716][1653645] Updated weights for policy 0, policy_version 534528 (0.0041) [2024-06-15 18:24:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 1094844416. Throughput: 0: 11241.2. Samples: 273774080. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:24:48,620][1653645] Updated weights for policy 0, policy_version 534595 (0.0018) [2024-06-15 18:24:50,114][1651596] Signal inference workers to stop experience collection... (27850 times) [2024-06-15 18:24:50,218][1653645] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-06-15 18:24:50,425][1651596] Signal inference workers to resume experience collection... (27850 times) [2024-06-15 18:24:50,426][1653645] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-06-15 18:24:50,634][1653645] Updated weights for policy 0, policy_version 534683 (0.0190) [2024-06-15 18:24:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45875.4, 300 sec: 43986.9). Total num frames: 1095041024. Throughput: 0: 11366.4. Samples: 273818112. 
Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:24:52,491][1653645] Updated weights for policy 0, policy_version 534754 (0.0031) [2024-06-15 18:24:55,993][1648982] Fps is (10 sec: 42448.5, 60 sec: 44210.8, 300 sec: 44536.9). Total num frames: 1095270400. Throughput: 0: 11107.4. Samples: 273869824. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:24:55,994][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:24:56,342][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000534832_1095335936.pth... [2024-06-15 18:24:56,403][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000529664_1084751872.pth [2024-06-15 18:24:56,504][1653645] Updated weights for policy 0, policy_version 534836 (0.0114) [2024-06-15 18:25:00,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1095368704. Throughput: 0: 11161.6. Samples: 273944576. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:25:00,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:25:01,672][1653645] Updated weights for policy 0, policy_version 534880 (0.0187) [2024-06-15 18:25:03,642][1653645] Updated weights for policy 0, policy_version 534967 (0.0019) [2024-06-15 18:25:04,657][1653645] Updated weights for policy 0, policy_version 534994 (0.0012) [2024-06-15 18:25:05,771][1653645] Updated weights for policy 0, policy_version 535040 (0.0012) [2024-06-15 18:25:05,958][1648982] Fps is (10 sec: 49326.3, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1095761920. Throughput: 0: 11286.7. Samples: 273971200. 
Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:25:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:25:07,977][1653645] Updated weights for policy 0, policy_version 535096 (0.0014) [2024-06-15 18:25:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44782.7, 300 sec: 44209.0). Total num frames: 1095892992. Throughput: 0: 11070.5. Samples: 274040832. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:25:10,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:25:14,387][1653645] Updated weights for policy 0, policy_version 535184 (0.0035) [2024-06-15 18:25:15,370][1653645] Updated weights for policy 0, policy_version 535228 (0.0010) [2024-06-15 18:25:15,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.2, 300 sec: 43986.9). Total num frames: 1096155136. Throughput: 0: 11208.8. Samples: 274106368. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:25:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:25:16,932][1653645] Updated weights for policy 0, policy_version 535295 (0.0013) [2024-06-15 18:25:19,495][1653645] Updated weights for policy 0, policy_version 535358 (0.0013) [2024-06-15 18:25:20,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1096417280. Throughput: 0: 11264.0. Samples: 274142720. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:25:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:25:25,877][1653645] Updated weights for policy 0, policy_version 535424 (0.0013) [2024-06-15 18:25:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 43653.6). Total num frames: 1096548352. Throughput: 0: 11309.5. Samples: 274219008. 
Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:25:25,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 18:25:27,359][1653645] Updated weights for policy 0, policy_version 535479 (0.0014) [2024-06-15 18:25:28,021][1653645] Updated weights for policy 0, policy_version 535505 (0.0150) [2024-06-15 18:25:30,177][1653645] Updated weights for policy 0, policy_version 535590 (0.0011) [2024-06-15 18:25:30,960][1648982] Fps is (10 sec: 52428.6, 60 sec: 45880.4, 300 sec: 44875.5). Total num frames: 1096941568. Throughput: 0: 11047.8. Samples: 274271232. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:25:30,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:25:35,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 1096941568. Throughput: 0: 10899.9. Samples: 274308608. Policy #0 lag: (min: 16.0, avg: 141.4, max: 272.0) [2024-06-15 18:25:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:25:36,397][1651596] Signal inference workers to stop experience collection... (27900 times) [2024-06-15 18:25:36,457][1653645] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-06-15 18:25:36,663][1651596] Signal inference workers to resume experience collection... (27900 times) [2024-06-15 18:25:36,671][1653645] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-06-15 18:25:37,902][1653645] Updated weights for policy 0, policy_version 535680 (0.0092) [2024-06-15 18:25:39,107][1653645] Updated weights for policy 0, policy_version 535744 (0.0011) [2024-06-15 18:25:40,958][1648982] Fps is (10 sec: 29491.3, 60 sec: 43144.6, 300 sec: 44098.0). Total num frames: 1097236480. Throughput: 0: 11124.8. Samples: 274370048. 
Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:25:40,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:25:41,805][1653645] Updated weights for policy 0, policy_version 535808 (0.0113) [2024-06-15 18:25:45,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1097465856. Throughput: 0: 10843.1. Samples: 274432512. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:25:45,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:25:49,465][1653645] Updated weights for policy 0, policy_version 535892 (0.0012) [2024-06-15 18:25:50,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 43653.7). Total num frames: 1097629696. Throughput: 0: 11047.8. Samples: 274468352. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:25:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:25:50,978][1653645] Updated weights for policy 0, policy_version 535955 (0.0012) [2024-06-15 18:25:52,928][1653645] Updated weights for policy 0, policy_version 536006 (0.0011) [2024-06-15 18:25:54,478][1653645] Updated weights for policy 0, policy_version 536080 (0.0105) [2024-06-15 18:25:55,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45355.7, 300 sec: 44542.2). Total num frames: 1097990144. Throughput: 0: 10922.7. Samples: 274532352. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:25:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:26:00,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.9, 300 sec: 43209.3). Total num frames: 1097990144. Throughput: 0: 10979.6. Samples: 274600448. 
Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:26:02,164][1653645] Updated weights for policy 0, policy_version 536176 (0.0028) [2024-06-15 18:26:04,978][1653645] Updated weights for policy 0, policy_version 536259 (0.0012) [2024-06-15 18:26:05,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1098350592. Throughput: 0: 10797.5. Samples: 274628608. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:05,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 18:26:06,912][1653645] Updated weights for policy 0, policy_version 536338 (0.0012) [2024-06-15 18:26:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.8, 300 sec: 43653.7). Total num frames: 1098514432. Throughput: 0: 10433.4. Samples: 274688512. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:26:13,529][1653645] Updated weights for policy 0, policy_version 536400 (0.0014) [2024-06-15 18:26:15,017][1653645] Updated weights for policy 0, policy_version 536467 (0.0013) [2024-06-15 18:26:15,959][1648982] Fps is (10 sec: 42593.5, 60 sec: 43689.8, 300 sec: 43986.7). Total num frames: 1098776576. Throughput: 0: 10911.0. Samples: 274762240. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:15,960][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:26:17,303][1651596] Signal inference workers to stop experience collection... (27950 times) [2024-06-15 18:26:17,364][1653645] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-06-15 18:26:17,649][1651596] Signal inference workers to resume experience collection... 
(27950 times) [2024-06-15 18:26:17,650][1653645] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-06-15 18:26:17,810][1653645] Updated weights for policy 0, policy_version 536545 (0.0014) [2024-06-15 18:26:19,787][1653645] Updated weights for policy 0, policy_version 536624 (0.0022) [2024-06-15 18:26:20,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 44209.1). Total num frames: 1099038720. Throughput: 0: 10717.8. Samples: 274790912. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:20,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 18:26:25,958][1648982] Fps is (10 sec: 32771.7, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 1099104256. Throughput: 0: 10877.1. Samples: 274859520. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:26:27,089][1653645] Updated weights for policy 0, policy_version 536720 (0.0012) [2024-06-15 18:26:28,067][1653645] Updated weights for policy 0, policy_version 536760 (0.0013) [2024-06-15 18:26:29,612][1653645] Updated weights for policy 0, policy_version 536816 (0.0012) [2024-06-15 18:26:30,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 1099497472. Throughput: 0: 10808.9. Samples: 274918912. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:26:31,593][1653645] Updated weights for policy 0, policy_version 536890 (0.0029) [2024-06-15 18:26:35,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 43690.5, 300 sec: 43432.4). Total num frames: 1099563008. Throughput: 0: 10672.3. Samples: 274948608. 
Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:35,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:26:38,585][1653645] Updated weights for policy 0, policy_version 536954 (0.0013) [2024-06-15 18:26:40,268][1653645] Updated weights for policy 0, policy_version 537009 (0.0013) [2024-06-15 18:26:40,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1099857920. Throughput: 0: 10945.4. Samples: 275024896. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:40,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:26:41,933][1653645] Updated weights for policy 0, policy_version 537081 (0.0013) [2024-06-15 18:26:43,043][1653645] Updated weights for policy 0, policy_version 537130 (0.0040) [2024-06-15 18:26:45,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1100087296. Throughput: 0: 10729.2. Samples: 275083264. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:26:48,972][1653645] Updated weights for policy 0, policy_version 537170 (0.0028) [2024-06-15 18:26:49,766][1653645] Updated weights for policy 0, policy_version 537212 (0.0012) [2024-06-15 18:26:50,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 43690.4, 300 sec: 43653.6). Total num frames: 1100251136. Throughput: 0: 11070.5. Samples: 275126784. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:50,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:26:51,993][1653645] Updated weights for policy 0, policy_version 537282 (0.0016) [2024-06-15 18:26:53,479][1653645] Updated weights for policy 0, policy_version 537344 (0.0011) [2024-06-15 18:26:55,958][1648982] Fps is (10 sec: 52426.6, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1100611584. Throughput: 0: 11036.3. Samples: 275185152. 
Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:26:55,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 18:26:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000537408_1100611584.pth... [2024-06-15 18:26:56,004][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000532224_1089994752.pth [2024-06-15 18:27:00,739][1651596] Signal inference workers to stop experience collection... (28000 times) [2024-06-15 18:27:00,775][1653645] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-06-15 18:27:00,947][1651596] Signal inference workers to resume experience collection... (28000 times) [2024-06-15 18:27:00,947][1653645] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-06-15 18:27:00,950][1653645] Updated weights for policy 0, policy_version 537440 (0.0013) [2024-06-15 18:27:00,958][1648982] Fps is (10 sec: 42600.2, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 1100677120. Throughput: 0: 11116.4. Samples: 275262464. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:27:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:27:03,164][1653645] Updated weights for policy 0, policy_version 537504 (0.0013) [2024-06-15 18:27:04,748][1653645] Updated weights for policy 0, policy_version 537568 (0.0012) [2024-06-15 18:27:05,958][1648982] Fps is (10 sec: 39323.3, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1101004800. Throughput: 0: 11059.2. Samples: 275288576. Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:27:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:27:06,578][1653645] Updated weights for policy 0, policy_version 537632 (0.0013) [2024-06-15 18:27:10,958][1648982] Fps is (10 sec: 45873.3, 60 sec: 43690.4, 300 sec: 43986.9). Total num frames: 1101135872. Throughput: 0: 10911.2. Samples: 275350528. 
Policy #0 lag: (min: 15.0, avg: 135.9, max: 276.0) [2024-06-15 18:27:10,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:27:12,725][1653645] Updated weights for policy 0, policy_version 537698 (0.0013) [2024-06-15 18:27:15,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 42599.2, 300 sec: 43764.7). Total num frames: 1101332480. Throughput: 0: 11116.1. Samples: 275419136. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:27:16,390][1653645] Updated weights for policy 0, policy_version 537777 (0.0012) [2024-06-15 18:27:17,786][1653645] Updated weights for policy 0, policy_version 537840 (0.0024) [2024-06-15 18:27:18,684][1653645] Updated weights for policy 0, policy_version 537881 (0.0012) [2024-06-15 18:27:20,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1101660160. Throughput: 0: 11093.4. Samples: 275447808. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:20,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:27:23,808][1653645] Updated weights for policy 0, policy_version 537923 (0.0014) [2024-06-15 18:27:25,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 1101791232. Throughput: 0: 11047.8. Samples: 275522048. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:27:26,824][1653645] Updated weights for policy 0, policy_version 538002 (0.0092) [2024-06-15 18:27:28,467][1653645] Updated weights for policy 0, policy_version 538080 (0.0012) [2024-06-15 18:27:29,890][1653645] Updated weights for policy 0, policy_version 538128 (0.0015) [2024-06-15 18:27:30,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 1102151680. Throughput: 0: 11104.7. Samples: 275582976. 
Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:27:31,004][1653645] Updated weights for policy 0, policy_version 538174 (0.0011) [2024-06-15 18:27:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44237.0, 300 sec: 43986.9). Total num frames: 1102217216. Throughput: 0: 10956.9. Samples: 275619840. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:35,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:27:36,986][1653645] Updated weights for policy 0, policy_version 538232 (0.0012) [2024-06-15 18:27:39,329][1653645] Updated weights for policy 0, policy_version 538288 (0.0020) [2024-06-15 18:27:40,408][1653645] Updated weights for policy 0, policy_version 538342 (0.0011) [2024-06-15 18:27:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1102577664. Throughput: 0: 11127.6. Samples: 275685888. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:40,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:27:42,478][1651596] Signal inference workers to stop experience collection... (28050 times) [2024-06-15 18:27:42,514][1653645] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-06-15 18:27:42,652][1651596] Signal inference workers to resume experience collection... (28050 times) [2024-06-15 18:27:42,652][1653645] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-06-15 18:27:42,759][1653645] Updated weights for policy 0, policy_version 538423 (0.0012) [2024-06-15 18:27:45,958][1648982] Fps is (10 sec: 49150.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1102708736. Throughput: 0: 11081.9. Samples: 275761152. 
Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:45,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:27:47,734][1653645] Updated weights for policy 0, policy_version 538480 (0.0012) [2024-06-15 18:27:49,453][1653645] Updated weights for policy 0, policy_version 538514 (0.0014) [2024-06-15 18:27:50,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.5, 300 sec: 44098.0). Total num frames: 1103003648. Throughput: 0: 11229.9. Samples: 275793920. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:27:51,049][1653645] Updated weights for policy 0, policy_version 538592 (0.0037) [2024-06-15 18:27:53,408][1653645] Updated weights for policy 0, policy_version 538656 (0.0014) [2024-06-15 18:27:55,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1103233024. Throughput: 0: 11412.0. Samples: 275864064. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:27:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:27:57,949][1653645] Updated weights for policy 0, policy_version 538721 (0.0013) [2024-06-15 18:28:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.0, 300 sec: 44209.1). Total num frames: 1103396864. Throughput: 0: 11582.6. Samples: 275940352. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:28:01,160][1653645] Updated weights for policy 0, policy_version 538791 (0.0016) [2024-06-15 18:28:02,662][1653645] Updated weights for policy 0, policy_version 538864 (0.0172) [2024-06-15 18:28:04,589][1653645] Updated weights for policy 0, policy_version 538912 (0.0011) [2024-06-15 18:28:05,448][1653645] Updated weights for policy 0, policy_version 538944 (0.0012) [2024-06-15 18:28:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1103757312. 
Throughput: 0: 11616.8. Samples: 275970560. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:28:09,516][1653645] Updated weights for policy 0, policy_version 539008 (0.0101) [2024-06-15 18:28:10,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45875.5, 300 sec: 44764.4). Total num frames: 1103888384. Throughput: 0: 11537.1. Samples: 276041216. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:10,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:28:12,512][1653645] Updated weights for policy 0, policy_version 539068 (0.0016) [2024-06-15 18:28:13,940][1653645] Updated weights for policy 0, policy_version 539120 (0.0012) [2024-06-15 18:28:15,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 47513.6, 300 sec: 44542.3). Total num frames: 1104183296. Throughput: 0: 11719.1. Samples: 276110336. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:15,990][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:28:16,639][1653645] Updated weights for policy 0, policy_version 539188 (0.0011) [2024-06-15 18:28:20,680][1653645] Updated weights for policy 0, policy_version 539260 (0.0013) [2024-06-15 18:28:20,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 45875.2, 300 sec: 44875.4). Total num frames: 1104412672. Throughput: 0: 11719.0. Samples: 276147200. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:20,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:28:24,273][1653645] Updated weights for policy 0, policy_version 539319 (0.0013) [2024-06-15 18:28:25,933][1653645] Updated weights for policy 0, policy_version 539379 (0.0016) [2024-06-15 18:28:25,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 47513.6, 300 sec: 44654.9). Total num frames: 1104642048. Throughput: 0: 11650.9. Samples: 276210176. 
Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:28:27,789][1653645] Updated weights for policy 0, policy_version 539424 (0.0011) [2024-06-15 18:28:27,907][1651596] Signal inference workers to stop experience collection... (28100 times) [2024-06-15 18:28:27,954][1653645] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-06-15 18:28:28,170][1651596] Signal inference workers to resume experience collection... (28100 times) [2024-06-15 18:28:28,171][1653645] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-06-15 18:28:28,562][1653645] Updated weights for policy 0, policy_version 539456 (0.0011) [2024-06-15 18:28:30,959][1648982] Fps is (10 sec: 39322.5, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1104805888. Throughput: 0: 11559.9. Samples: 276281344. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:30,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:28:32,095][1653645] Updated weights for policy 0, policy_version 539508 (0.0027) [2024-06-15 18:28:35,192][1653645] Updated weights for policy 0, policy_version 539580 (0.0015) [2024-06-15 18:28:35,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 47513.6, 300 sec: 44653.4). Total num frames: 1105068032. Throughput: 0: 11639.5. Samples: 276317696. Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:28:38,068][1653645] Updated weights for policy 0, policy_version 539637 (0.0078) [2024-06-15 18:28:38,799][1653645] Updated weights for policy 0, policy_version 539664 (0.0009) [2024-06-15 18:28:39,973][1653645] Updated weights for policy 0, policy_version 539709 (0.0118) [2024-06-15 18:28:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1105330176. Throughput: 0: 11434.7. Samples: 276378624. 
Policy #0 lag: (min: 8.0, avg: 79.9, max: 264.0) [2024-06-15 18:28:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:28:43,836][1653645] Updated weights for policy 0, policy_version 539770 (0.0024) [2024-06-15 18:28:45,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 45875.5, 300 sec: 44653.4). Total num frames: 1105461248. Throughput: 0: 11241.3. Samples: 276446208. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:28:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:28:47,401][1653645] Updated weights for policy 0, policy_version 539839 (0.0012) [2024-06-15 18:28:50,582][1653645] Updated weights for policy 0, policy_version 539904 (0.0010) [2024-06-15 18:28:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1105723392. Throughput: 0: 11343.6. Samples: 276481024. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:28:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:28:52,023][1653645] Updated weights for policy 0, policy_version 539967 (0.0021) [2024-06-15 18:28:55,681][1653645] Updated weights for policy 0, policy_version 540016 (0.0097) [2024-06-15 18:28:55,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1105952768. Throughput: 0: 11218.5. Samples: 276546048. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:28:55,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:28:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000540032_1105985536.pth... [2024-06-15 18:28:56,026][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000534832_1095335936.pth [2024-06-15 18:28:58,773][1653645] Updated weights for policy 0, policy_version 540066 (0.0012) [2024-06-15 18:29:00,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1106116608. Throughput: 0: 11229.9. 
Samples: 276615680. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:29:01,281][1653645] Updated weights for policy 0, policy_version 540115 (0.0019) [2024-06-15 18:29:03,613][1653645] Updated weights for policy 0, policy_version 540208 (0.0013) [2024-06-15 18:29:05,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 1106378752. Throughput: 0: 10991.0. Samples: 276641792. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:29:07,984][1653645] Updated weights for policy 0, policy_version 540272 (0.0015) [2024-06-15 18:29:10,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45329.0, 300 sec: 44653.4). Total num frames: 1106608128. Throughput: 0: 11081.9. Samples: 276708864. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:29:11,040][1653645] Updated weights for policy 0, policy_version 540342 (0.0090) [2024-06-15 18:29:14,266][1653645] Updated weights for policy 0, policy_version 540384 (0.0013) [2024-06-15 18:29:15,195][1651596] Signal inference workers to stop experience collection... (28150 times) [2024-06-15 18:29:15,242][1653645] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-06-15 18:29:15,512][1651596] Signal inference workers to resume experience collection... (28150 times) [2024-06-15 18:29:15,513][1653645] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-06-15 18:29:15,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 44236.6, 300 sec: 44653.3). Total num frames: 1106837504. Throughput: 0: 10854.3. Samples: 276769792. 
Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:29:16,180][1653645] Updated weights for policy 0, policy_version 540468 (0.0013) [2024-06-15 18:29:19,316][1653645] Updated weights for policy 0, policy_version 540512 (0.0012) [2024-06-15 18:29:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.8, 300 sec: 44764.4). Total num frames: 1107034112. Throughput: 0: 10831.6. Samples: 276805120. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:29:22,611][1653645] Updated weights for policy 0, policy_version 540576 (0.0017) [2024-06-15 18:29:25,947][1653645] Updated weights for policy 0, policy_version 540609 (0.0012) [2024-06-15 18:29:25,958][1648982] Fps is (10 sec: 32768.9, 60 sec: 42052.2, 300 sec: 43987.9). Total num frames: 1107165184. Throughput: 0: 10831.6. Samples: 276866048. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:29:27,216][1653645] Updated weights for policy 0, policy_version 540670 (0.0013) [2024-06-15 18:29:29,088][1653645] Updated weights for policy 0, policy_version 540720 (0.0012) [2024-06-15 18:29:30,712][1653645] Updated weights for policy 0, policy_version 540752 (0.0011) [2024-06-15 18:29:30,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1107460096. Throughput: 0: 10877.1. Samples: 276935680. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:30,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:29:33,915][1653645] Updated weights for policy 0, policy_version 540819 (0.0015) [2024-06-15 18:29:35,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.4, 300 sec: 44209.0). Total num frames: 1107689472. Throughput: 0: 10786.1. Samples: 276966400. 
Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:35,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:29:38,550][1653645] Updated weights for policy 0, policy_version 540896 (0.0015) [2024-06-15 18:29:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1107886080. Throughput: 0: 10763.4. Samples: 277030400. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:29:41,471][1653645] Updated weights for policy 0, policy_version 540981 (0.0093) [2024-06-15 18:29:43,905][1653645] Updated weights for policy 0, policy_version 541040 (0.0015) [2024-06-15 18:29:45,858][1653645] Updated weights for policy 0, policy_version 541072 (0.0056) [2024-06-15 18:29:45,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44236.6, 300 sec: 44320.1). Total num frames: 1108115456. Throughput: 0: 10672.3. Samples: 277095936. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:29:46,829][1653645] Updated weights for policy 0, policy_version 541120 (0.0013) [2024-06-15 18:29:50,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42052.3, 300 sec: 43992.1). Total num frames: 1108246528. Throughput: 0: 10865.8. Samples: 277130752. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:29:51,831][1653645] Updated weights for policy 0, policy_version 541181 (0.0028) [2024-06-15 18:29:53,055][1653645] Updated weights for policy 0, policy_version 541219 (0.0012) [2024-06-15 18:29:55,092][1653645] Updated weights for policy 0, policy_version 541268 (0.0030) [2024-06-15 18:29:55,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 43690.4, 300 sec: 44764.4). Total num frames: 1108574208. Throughput: 0: 10842.9. Samples: 277196800. 
Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:29:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:29:56,892][1653645] Updated weights for policy 0, policy_version 541318 (0.0133) [2024-06-15 18:29:58,012][1653645] Updated weights for policy 0, policy_version 541369 (0.0018) [2024-06-15 18:30:00,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1108738048. Throughput: 0: 11127.5. Samples: 277270528. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:30:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:30:02,260][1653645] Updated weights for policy 0, policy_version 541411 (0.0012) [2024-06-15 18:30:03,114][1653645] Updated weights for policy 0, policy_version 541442 (0.0015) [2024-06-15 18:30:03,516][1651596] Signal inference workers to stop experience collection... (28200 times) [2024-06-15 18:30:03,563][1653645] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-06-15 18:30:03,809][1651596] Signal inference workers to resume experience collection... (28200 times) [2024-06-15 18:30:03,810][1653645] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-06-15 18:30:04,380][1653645] Updated weights for policy 0, policy_version 541498 (0.0012) [2024-06-15 18:30:05,958][1648982] Fps is (10 sec: 45877.0, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1109032960. Throughput: 0: 11138.9. Samples: 277306368. Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:30:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:30:06,421][1653645] Updated weights for policy 0, policy_version 541538 (0.0012) [2024-06-15 18:30:08,324][1653645] Updated weights for policy 0, policy_version 541600 (0.0013) [2024-06-15 18:30:10,960][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1109262336. Throughput: 0: 11150.2. Samples: 277367808. 
Policy #0 lag: (min: 15.0, avg: 117.2, max: 271.0) [2024-06-15 18:30:10,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:30:14,067][1653645] Updated weights for policy 0, policy_version 541669 (0.0014) [2024-06-15 18:30:14,912][1653645] Updated weights for policy 0, policy_version 541696 (0.0073) [2024-06-15 18:30:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.9, 300 sec: 44209.0). Total num frames: 1109458944. Throughput: 0: 11218.5. Samples: 277440512. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:30:16,108][1653645] Updated weights for policy 0, policy_version 541744 (0.0026) [2024-06-15 18:30:17,803][1653645] Updated weights for policy 0, policy_version 541779 (0.0014) [2024-06-15 18:30:19,656][1653645] Updated weights for policy 0, policy_version 541842 (0.0015) [2024-06-15 18:30:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1109786624. Throughput: 0: 11207.2. Samples: 277470720. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:30:24,966][1653645] Updated weights for policy 0, policy_version 541920 (0.0013) [2024-06-15 18:30:25,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1109917696. Throughput: 0: 11593.9. Samples: 277552128. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:30:26,919][1653645] Updated weights for policy 0, policy_version 541985 (0.0079) [2024-06-15 18:30:29,853][1653645] Updated weights for policy 0, policy_version 542035 (0.0013) [2024-06-15 18:30:30,961][1648982] Fps is (10 sec: 39308.1, 60 sec: 45326.4, 300 sec: 44875.0). Total num frames: 1110179840. Throughput: 0: 11365.6. Samples: 277607424. 
Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:30,962][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:30:31,109][1653645] Updated weights for policy 0, policy_version 542082 (0.0012) [2024-06-15 18:30:31,905][1653645] Updated weights for policy 0, policy_version 542132 (0.0013) [2024-06-15 18:30:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44237.0, 300 sec: 44431.2). Total num frames: 1110343680. Throughput: 0: 11423.3. Samples: 277644800. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:30:37,333][1653645] Updated weights for policy 0, policy_version 542209 (0.0019) [2024-06-15 18:30:38,603][1653645] Updated weights for policy 0, policy_version 542272 (0.0046) [2024-06-15 18:30:40,958][1648982] Fps is (10 sec: 39333.4, 60 sec: 44782.6, 300 sec: 44431.1). Total num frames: 1110573056. Throughput: 0: 11400.5. Samples: 277709824. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:40,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:30:41,988][1653645] Updated weights for policy 0, policy_version 542327 (0.0012) [2024-06-15 18:30:45,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1110835200. Throughput: 0: 11366.4. Samples: 277782016. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:30:47,195][1653645] Updated weights for policy 0, policy_version 542402 (0.0013) [2024-06-15 18:30:48,396][1651596] Signal inference workers to stop experience collection... (28250 times) [2024-06-15 18:30:48,476][1653645] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-06-15 18:30:48,763][1651596] Signal inference workers to resume experience collection... 
(28250 times) [2024-06-15 18:30:48,764][1653645] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-06-15 18:30:49,124][1653645] Updated weights for policy 0, policy_version 542480 (0.0078) [2024-06-15 18:30:50,958][1648982] Fps is (10 sec: 52431.6, 60 sec: 47513.6, 300 sec: 44431.2). Total num frames: 1111097344. Throughput: 0: 11400.5. Samples: 277819392. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:30:52,712][1653645] Updated weights for policy 0, policy_version 542545 (0.0025) [2024-06-15 18:30:54,268][1653645] Updated weights for policy 0, policy_version 542596 (0.0018) [2024-06-15 18:30:55,573][1653645] Updated weights for policy 0, policy_version 542656 (0.0076) [2024-06-15 18:30:55,972][1648982] Fps is (10 sec: 52368.3, 60 sec: 46412.6, 300 sec: 45318.0). Total num frames: 1111359488. Throughput: 0: 11534.1. Samples: 277886976. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:30:55,976][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:30:55,989][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000542656_1111359488.pth... [2024-06-15 18:30:56,058][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000537408_1100611584.pth [2024-06-15 18:30:56,063][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000542656_1111359488.pth [2024-06-15 18:30:59,963][1653645] Updated weights for policy 0, policy_version 542724 (0.0015) [2024-06-15 18:31:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 47513.6, 300 sec: 44875.5). Total num frames: 1111588864. Throughput: 0: 11411.9. Samples: 277954048. 
Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:31:01,098][1653645] Updated weights for policy 0, policy_version 542776 (0.0011) [2024-06-15 18:31:04,793][1653645] Updated weights for policy 0, policy_version 542832 (0.0014) [2024-06-15 18:31:05,958][1648982] Fps is (10 sec: 39366.7, 60 sec: 45328.9, 300 sec: 44875.5). Total num frames: 1111752704. Throughput: 0: 11605.3. Samples: 277992960. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:31:06,540][1653645] Updated weights for policy 0, policy_version 542880 (0.0013) [2024-06-15 18:31:10,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44236.8, 300 sec: 44542.4). Total num frames: 1111916544. Throughput: 0: 11195.7. Samples: 278055936. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:31:11,200][1653645] Updated weights for policy 0, policy_version 542946 (0.0012) [2024-06-15 18:31:12,729][1653645] Updated weights for policy 0, policy_version 543024 (0.0013) [2024-06-15 18:31:14,964][1653645] Updated weights for policy 0, policy_version 543060 (0.0014) [2024-06-15 18:31:15,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 1112276992. Throughput: 0: 11549.3. Samples: 278127104. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:15,975][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:31:17,966][1653645] Updated weights for policy 0, policy_version 543120 (0.0017) [2024-06-15 18:31:20,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.7, 300 sec: 45097.6). Total num frames: 1112408064. Throughput: 0: 11411.9. Samples: 278158336. 
Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:31:22,087][1653645] Updated weights for policy 0, policy_version 543184 (0.0015) [2024-06-15 18:31:23,666][1653645] Updated weights for policy 0, policy_version 543252 (0.0092) [2024-06-15 18:31:24,523][1653645] Updated weights for policy 0, policy_version 543296 (0.0012) [2024-06-15 18:31:25,957][1648982] Fps is (10 sec: 39322.2, 60 sec: 45875.3, 300 sec: 44653.4). Total num frames: 1112670208. Throughput: 0: 11457.6. Samples: 278225408. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:25,961][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:31:26,705][1653645] Updated weights for policy 0, policy_version 543346 (0.0012) [2024-06-15 18:31:29,804][1653645] Updated weights for policy 0, policy_version 543395 (0.0011) [2024-06-15 18:31:30,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45877.9, 300 sec: 45319.9). Total num frames: 1112932352. Throughput: 0: 11446.1. Samples: 278297088. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:30,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 18:31:33,459][1651596] Signal inference workers to stop experience collection... (28300 times) [2024-06-15 18:31:33,503][1653645] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-06-15 18:31:33,716][1651596] Signal inference workers to resume experience collection... (28300 times) [2024-06-15 18:31:33,717][1653645] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-06-15 18:31:33,719][1653645] Updated weights for policy 0, policy_version 543456 (0.0012) [2024-06-15 18:31:35,272][1653645] Updated weights for policy 0, policy_version 543522 (0.0099) [2024-06-15 18:31:35,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 47513.6, 300 sec: 45208.7). Total num frames: 1113194496. Throughput: 0: 11514.3. Samples: 278337536. 
Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:31:37,516][1653645] Updated weights for policy 0, policy_version 543568 (0.0014) [2024-06-15 18:31:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.6, 300 sec: 44875.5). Total num frames: 1113325568. Throughput: 0: 11301.1. Samples: 278395392. Policy #0 lag: (min: 7.0, avg: 89.2, max: 263.0) [2024-06-15 18:31:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:31:41,281][1653645] Updated weights for policy 0, policy_version 543632 (0.0052) [2024-06-15 18:31:42,305][1653645] Updated weights for policy 0, policy_version 543678 (0.0108) [2024-06-15 18:31:45,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 45329.2, 300 sec: 45097.7). Total num frames: 1113554944. Throughput: 0: 11480.2. Samples: 278470656. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:31:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:31:46,456][1653645] Updated weights for policy 0, policy_version 543746 (0.0012) [2024-06-15 18:31:47,618][1653645] Updated weights for policy 0, policy_version 543807 (0.0012) [2024-06-15 18:31:49,713][1653645] Updated weights for policy 0, policy_version 543872 (0.0012) [2024-06-15 18:31:50,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 45874.9, 300 sec: 44875.5). Total num frames: 1113849856. Throughput: 0: 11343.6. Samples: 278503424. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:31:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:31:53,813][1653645] Updated weights for policy 0, policy_version 543933 (0.0052) [2024-06-15 18:31:55,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43699.1, 300 sec: 45097.6). Total num frames: 1113980928. Throughput: 0: 11400.5. Samples: 278568960. 
Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:31:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:31:58,126][1653645] Updated weights for policy 0, policy_version 543988 (0.0012) [2024-06-15 18:31:59,780][1653645] Updated weights for policy 0, policy_version 544054 (0.0011) [2024-06-15 18:32:00,958][1648982] Fps is (10 sec: 45877.0, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1114308608. Throughput: 0: 11286.8. Samples: 278635008. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:32:01,027][1653645] Updated weights for policy 0, policy_version 544112 (0.0117) [2024-06-15 18:32:04,867][1653645] Updated weights for policy 0, policy_version 544176 (0.0012) [2024-06-15 18:32:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.3, 300 sec: 45319.9). Total num frames: 1114505216. Throughput: 0: 11366.4. Samples: 278669824. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:32:09,088][1653645] Updated weights for policy 0, policy_version 544210 (0.0012) [2024-06-15 18:32:10,960][1648982] Fps is (10 sec: 39312.7, 60 sec: 46419.6, 300 sec: 45319.5). Total num frames: 1114701824. Throughput: 0: 11422.7. Samples: 278739456. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:10,961][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:32:11,378][1653645] Updated weights for policy 0, policy_version 544313 (0.0012) [2024-06-15 18:32:12,920][1653645] Updated weights for policy 0, policy_version 544372 (0.0013) [2024-06-15 18:32:15,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 1114931200. Throughput: 0: 11264.0. Samples: 278803968. 
Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:32:15,980][1653645] Updated weights for policy 0, policy_version 544403 (0.0075) [2024-06-15 18:32:16,363][1651596] Signal inference workers to stop experience collection... (28350 times) [2024-06-15 18:32:16,415][1653645] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-06-15 18:32:16,623][1651596] Signal inference workers to resume experience collection... (28350 times) [2024-06-15 18:32:16,650][1653645] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-06-15 18:32:20,958][1648982] Fps is (10 sec: 32775.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1115029504. Throughput: 0: 11036.4. Samples: 278834176. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:32:21,563][1653645] Updated weights for policy 0, policy_version 544471 (0.0012) [2024-06-15 18:32:23,056][1653645] Updated weights for policy 0, policy_version 544544 (0.0138) [2024-06-15 18:32:24,361][1653645] Updated weights for policy 0, policy_version 544592 (0.0011) [2024-06-15 18:32:25,363][1653645] Updated weights for policy 0, policy_version 544638 (0.0013) [2024-06-15 18:32:25,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45875.0, 300 sec: 44986.6). Total num frames: 1115422720. Throughput: 0: 11184.3. Samples: 278898688. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:25,959][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 18:32:27,981][1653645] Updated weights for policy 0, policy_version 544698 (0.0012) [2024-06-15 18:32:30,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 1115553792. Throughput: 0: 11127.5. Samples: 278971392. 
Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:32:34,097][1653645] Updated weights for policy 0, policy_version 544756 (0.0093) [2024-06-15 18:32:35,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 43144.6, 300 sec: 44764.4). Total num frames: 1115783168. Throughput: 0: 11173.1. Samples: 279006208. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:32:36,437][1653645] Updated weights for policy 0, policy_version 544848 (0.0011) [2024-06-15 18:32:37,566][1653645] Updated weights for policy 0, policy_version 544896 (0.0013) [2024-06-15 18:32:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 45208.8). Total num frames: 1116045312. Throughput: 0: 10922.7. Samples: 279060480. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:32:41,195][1653645] Updated weights for policy 0, policy_version 544957 (0.0015) [2024-06-15 18:32:45,965][1648982] Fps is (10 sec: 32744.7, 60 sec: 42593.4, 300 sec: 44430.1). Total num frames: 1116110848. Throughput: 0: 11023.3. Samples: 279131136. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:45,969][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 18:32:46,770][1653645] Updated weights for policy 0, policy_version 545008 (0.0021) [2024-06-15 18:32:49,120][1653645] Updated weights for policy 0, policy_version 545104 (0.0014) [2024-06-15 18:32:50,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 1116471296. Throughput: 0: 10740.6. Samples: 279153152. 
Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:50,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:32:53,696][1653645] Updated weights for policy 0, policy_version 545212 (0.0093) [2024-06-15 18:32:55,958][1648982] Fps is (10 sec: 49185.7, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1116602368. Throughput: 0: 10570.4. Samples: 279215104. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:32:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:32:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000545216_1116602368.pth... [2024-06-15 18:32:56,030][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000540032_1105985536.pth [2024-06-15 18:33:00,367][1653645] Updated weights for policy 0, policy_version 545300 (0.0015) [2024-06-15 18:33:00,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42052.2, 300 sec: 44320.1). Total num frames: 1116831744. Throughput: 0: 10661.0. Samples: 279283712. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:33:00,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 18:33:00,988][1651596] Signal inference workers to stop experience collection... (28400 times) [2024-06-15 18:33:01,052][1653645] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-06-15 18:33:01,216][1651596] Signal inference workers to resume experience collection... (28400 times) [2024-06-15 18:33:01,217][1653645] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-06-15 18:33:01,718][1653645] Updated weights for policy 0, policy_version 545361 (0.0126) [2024-06-15 18:33:04,273][1653645] Updated weights for policy 0, policy_version 545414 (0.0089) [2024-06-15 18:33:05,258][1653645] Updated weights for policy 0, policy_version 545471 (0.0012) [2024-06-15 18:33:05,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1117126656. 
Throughput: 0: 10626.9. Samples: 279312384. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:33:05,963][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:33:10,935][1653645] Updated weights for policy 0, policy_version 545539 (0.0016) [2024-06-15 18:33:10,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 42600.0, 300 sec: 44320.1). Total num frames: 1117257728. Throughput: 0: 11025.1. Samples: 279394816. Policy #0 lag: (min: 11.0, avg: 90.2, max: 267.0) [2024-06-15 18:33:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 18:33:12,581][1653645] Updated weights for policy 0, policy_version 545616 (0.0018) [2024-06-15 18:33:13,420][1653645] Updated weights for policy 0, policy_version 545651 (0.0012) [2024-06-15 18:33:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1117552640. Throughput: 0: 10865.8. Samples: 279460352. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:33:16,340][1653645] Updated weights for policy 0, policy_version 545699 (0.0020) [2024-06-15 18:33:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1117650944. Throughput: 0: 10865.8. Samples: 279495168. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:33:21,485][1653645] Updated weights for policy 0, policy_version 545768 (0.0019) [2024-06-15 18:33:22,718][1653645] Updated weights for policy 0, policy_version 545825 (0.0013) [2024-06-15 18:33:24,710][1653645] Updated weights for policy 0, policy_version 545910 (0.0018) [2024-06-15 18:33:25,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1118044160. Throughput: 0: 10968.2. Samples: 279554048. 
Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:33:28,112][1653645] Updated weights for policy 0, policy_version 545976 (0.0085) [2024-06-15 18:33:30,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1118175232. Throughput: 0: 11243.0. Samples: 279636992. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:33:33,970][1653645] Updated weights for policy 0, policy_version 546050 (0.0015) [2024-06-15 18:33:35,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 1118470144. Throughput: 0: 11411.9. Samples: 279666688. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:33:36,320][1653645] Updated weights for policy 0, policy_version 546160 (0.0099) [2024-06-15 18:33:39,463][1653645] Updated weights for policy 0, policy_version 546197 (0.0013) [2024-06-15 18:33:40,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1118699520. Throughput: 0: 11286.8. Samples: 279723008. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:40,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:33:44,584][1653645] Updated weights for policy 0, policy_version 546256 (0.0041) [2024-06-15 18:33:44,721][1651596] Signal inference workers to stop experience collection... (28450 times) [2024-06-15 18:33:44,798][1653645] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-06-15 18:33:45,034][1651596] Signal inference workers to resume experience collection... (28450 times) [2024-06-15 18:33:45,035][1653645] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-06-15 18:33:45,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 45334.4, 300 sec: 44431.2). 
Total num frames: 1118830592. Throughput: 0: 11355.0. Samples: 279794688. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:33:47,070][1653645] Updated weights for policy 0, policy_version 546353 (0.0014) [2024-06-15 18:33:48,376][1653645] Updated weights for policy 0, policy_version 546418 (0.0013) [2024-06-15 18:33:50,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 1119092736. Throughput: 0: 11264.0. Samples: 279819264. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:33:51,381][1653645] Updated weights for policy 0, policy_version 546452 (0.0013) [2024-06-15 18:33:52,117][1653645] Updated weights for policy 0, policy_version 546490 (0.0012) [2024-06-15 18:33:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1119223808. Throughput: 0: 11150.2. Samples: 279896576. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:33:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:33:57,116][1653645] Updated weights for policy 0, policy_version 546545 (0.0019) [2024-06-15 18:33:58,641][1653645] Updated weights for policy 0, policy_version 546613 (0.0013) [2024-06-15 18:34:00,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 46421.4, 300 sec: 44875.5). Total num frames: 1119617024. Throughput: 0: 10854.4. Samples: 279948800. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:34:02,866][1653645] Updated weights for policy 0, policy_version 546689 (0.0013) [2024-06-15 18:34:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1119748096. Throughput: 0: 10968.2. Samples: 279988736. 
Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:34:08,010][1653645] Updated weights for policy 0, policy_version 546759 (0.0029) [2024-06-15 18:34:10,197][1653645] Updated weights for policy 0, policy_version 546836 (0.0013) [2024-06-15 18:34:10,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 44542.3). Total num frames: 1119977472. Throughput: 0: 11218.5. Samples: 280058880. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:34:12,490][1653645] Updated weights for policy 0, policy_version 546928 (0.0013) [2024-06-15 18:34:15,484][1653645] Updated weights for policy 0, policy_version 546976 (0.0012) [2024-06-15 18:34:15,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 1120239616. Throughput: 0: 10797.5. Samples: 280122880. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:34:19,819][1653645] Updated weights for policy 0, policy_version 547044 (0.0012) [2024-06-15 18:34:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1120403456. Throughput: 0: 11025.1. Samples: 280162816. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:34:22,901][1653645] Updated weights for policy 0, policy_version 547122 (0.0016) [2024-06-15 18:34:24,600][1653645] Updated weights for policy 0, policy_version 547196 (0.0104) [2024-06-15 18:34:25,966][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1120665600. Throughput: 0: 10808.8. Samples: 280209408. 
Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:25,968][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:34:26,407][1651596] Signal inference workers to stop experience collection... (28500 times) [2024-06-15 18:34:26,469][1653645] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-06-15 18:34:26,656][1651596] Signal inference workers to resume experience collection... (28500 times) [2024-06-15 18:34:26,657][1653645] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-06-15 18:34:27,574][1653645] Updated weights for policy 0, policy_version 547250 (0.0050) [2024-06-15 18:34:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1120796672. Throughput: 0: 10956.8. Samples: 280287744. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:34:32,866][1653645] Updated weights for policy 0, policy_version 547322 (0.0018) [2024-06-15 18:34:34,771][1653645] Updated weights for policy 0, policy_version 547362 (0.0012) [2024-06-15 18:34:35,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1121091584. Throughput: 0: 11093.3. Samples: 280318464. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:34:38,840][1653645] Updated weights for policy 0, policy_version 547472 (0.0014) [2024-06-15 18:34:39,931][1653645] Updated weights for policy 0, policy_version 547520 (0.0014) [2024-06-15 18:34:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1121320960. Throughput: 0: 10729.3. Samples: 280379392. 
Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:34:45,610][1653645] Updated weights for policy 0, policy_version 547600 (0.0013) [2024-06-15 18:34:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1121484800. Throughput: 0: 11184.4. Samples: 280452096. Policy #0 lag: (min: 15.0, avg: 148.9, max: 271.0) [2024-06-15 18:34:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:34:47,318][1653645] Updated weights for policy 0, policy_version 547680 (0.0014) [2024-06-15 18:34:50,671][1653645] Updated weights for policy 0, policy_version 547732 (0.0011) [2024-06-15 18:34:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 1121746944. Throughput: 0: 10888.5. Samples: 280478720. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:34:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:34:55,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1121845248. Throughput: 0: 11070.6. Samples: 280557056. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:34:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:34:55,990][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000547776_1121845248.pth... [2024-06-15 18:34:56,158][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000542656_1111359488.pth [2024-06-15 18:34:56,723][1653645] Updated weights for policy 0, policy_version 547808 (0.0126) [2024-06-15 18:34:59,409][1653645] Updated weights for policy 0, policy_version 547920 (0.0111) [2024-06-15 18:35:00,780][1653645] Updated weights for policy 0, policy_version 547968 (0.0014) [2024-06-15 18:35:00,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1122238464. Throughput: 0: 10717.9. 
Samples: 280605184. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:35:03,734][1653645] Updated weights for policy 0, policy_version 548031 (0.0013) [2024-06-15 18:35:05,957][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1122369536. Throughput: 0: 10717.9. Samples: 280645120. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:35:09,704][1653645] Updated weights for policy 0, policy_version 548096 (0.0016) [2024-06-15 18:35:10,685][1651596] Signal inference workers to stop experience collection... (28550 times) [2024-06-15 18:35:10,744][1653645] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-06-15 18:35:10,947][1651596] Signal inference workers to resume experience collection... (28550 times) [2024-06-15 18:35:10,947][1653645] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-06-15 18:35:10,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.5, 300 sec: 44542.2). Total num frames: 1122598912. Throughput: 0: 11298.1. Samples: 280717824. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:10,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:35:12,683][1653645] Updated weights for policy 0, policy_version 548208 (0.0162) [2024-06-15 18:35:15,491][1653645] Updated weights for policy 0, policy_version 548246 (0.0012) [2024-06-15 18:35:15,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 1122828288. Throughput: 0: 10808.9. Samples: 280774144. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:15,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:35:20,959][1648982] Fps is (10 sec: 29491.7, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 1122893824. Throughput: 0: 10877.2. Samples: 280807936. 
Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:20,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:35:21,013][1653645] Updated weights for policy 0, policy_version 548304 (0.0015) [2024-06-15 18:35:22,600][1653645] Updated weights for policy 0, policy_version 548357 (0.0012) [2024-06-15 18:35:24,440][1653645] Updated weights for policy 0, policy_version 548432 (0.0012) [2024-06-15 18:35:25,456][1653645] Updated weights for policy 0, policy_version 548478 (0.0011) [2024-06-15 18:35:25,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.8, 300 sec: 44431.7). Total num frames: 1123287040. Throughput: 0: 10854.4. Samples: 280867840. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:35:28,027][1653645] Updated weights for policy 0, policy_version 548537 (0.0014) [2024-06-15 18:35:30,960][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1123418112. Throughput: 0: 10729.2. Samples: 280934912. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:30,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:35:33,472][1653645] Updated weights for policy 0, policy_version 548582 (0.0056) [2024-06-15 18:35:35,434][1653645] Updated weights for policy 0, policy_version 548672 (0.0107) [2024-06-15 18:35:35,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1123713024. Throughput: 0: 11081.9. Samples: 280977408. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:35:39,369][1653645] Updated weights for policy 0, policy_version 548739 (0.0012) [2024-06-15 18:35:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1123942400. Throughput: 0: 10535.8. Samples: 281031168. 
Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:35:44,667][1653645] Updated weights for policy 0, policy_version 548816 (0.0014) [2024-06-15 18:35:45,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1124073472. Throughput: 0: 11082.0. Samples: 281103872. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:35:46,069][1653645] Updated weights for policy 0, policy_version 548866 (0.0010) [2024-06-15 18:35:47,937][1653645] Updated weights for policy 0, policy_version 548944 (0.0013) [2024-06-15 18:35:49,268][1653645] Updated weights for policy 0, policy_version 548991 (0.0012) [2024-06-15 18:35:50,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43144.3, 300 sec: 43988.6). Total num frames: 1124335616. Throughput: 0: 10740.5. Samples: 281128448. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:35:51,981][1653645] Updated weights for policy 0, policy_version 549044 (0.0013) [2024-06-15 18:35:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 1124466688. Throughput: 0: 10649.6. Samples: 281197056. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:35:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:35:56,805][1651596] Signal inference workers to stop experience collection... (28600 times) [2024-06-15 18:35:56,876][1653645] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-06-15 18:35:57,153][1651596] Signal inference workers to resume experience collection... 
(28600 times) [2024-06-15 18:35:57,170][1653645] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-06-15 18:35:57,777][1653645] Updated weights for policy 0, policy_version 549092 (0.0025) [2024-06-15 18:35:59,554][1653645] Updated weights for policy 0, policy_version 549171 (0.0095) [2024-06-15 18:36:00,941][1653645] Updated weights for policy 0, policy_version 549240 (0.0018) [2024-06-15 18:36:00,958][1648982] Fps is (10 sec: 49153.4, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 1124827136. Throughput: 0: 10717.9. Samples: 281256448. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:36:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:36:04,367][1653645] Updated weights for policy 0, policy_version 549305 (0.0013) [2024-06-15 18:36:05,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.4, 300 sec: 44320.1). Total num frames: 1124990976. Throughput: 0: 10763.3. Samples: 281292288. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:36:05,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:36:09,852][1653645] Updated weights for policy 0, policy_version 549360 (0.0040) [2024-06-15 18:36:10,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 42598.5, 300 sec: 43653.6). Total num frames: 1125154816. Throughput: 0: 11093.3. Samples: 281367040. Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:36:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:36:11,512][1653645] Updated weights for policy 0, policy_version 549410 (0.0016) [2024-06-15 18:36:13,275][1653645] Updated weights for policy 0, policy_version 549504 (0.0013) [2024-06-15 18:36:15,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 1125449728. Throughput: 0: 10854.4. Samples: 281423360. 
Policy #0 lag: (min: 63.0, avg: 138.2, max: 287.0) [2024-06-15 18:36:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:36:20,871][1653645] Updated weights for policy 0, policy_version 549569 (0.0130) [2024-06-15 18:36:20,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1125515264. Throughput: 0: 10729.2. Samples: 281460224. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:36:20,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 18:36:22,929][1653645] Updated weights for policy 0, policy_version 549664 (0.0012) [2024-06-15 18:36:24,688][1653645] Updated weights for policy 0, policy_version 549744 (0.0037) [2024-06-15 18:36:25,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 1125908480. Throughput: 0: 10820.2. Samples: 281518080. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:36:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:36:27,200][1653645] Updated weights for policy 0, policy_version 549762 (0.0033) [2024-06-15 18:36:28,147][1653645] Updated weights for policy 0, policy_version 549819 (0.0029) [2024-06-15 18:36:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1126039552. Throughput: 0: 10945.4. Samples: 281596416. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:36:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:36:33,331][1653645] Updated weights for policy 0, policy_version 549875 (0.0013) [2024-06-15 18:36:35,143][1651596] Signal inference workers to stop experience collection... (28650 times) [2024-06-15 18:36:35,216][1653645] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-06-15 18:36:35,220][1653645] Updated weights for policy 0, policy_version 549953 (0.0014) [2024-06-15 18:36:35,465][1651596] Signal inference workers to resume experience collection... 
(28650 times) [2024-06-15 18:36:35,466][1653645] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-06-15 18:36:35,960][1648982] Fps is (10 sec: 45864.2, 60 sec: 44234.8, 300 sec: 44208.6). Total num frames: 1126367232. Throughput: 0: 11172.4. Samples: 281631232. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:36:35,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:36:38,691][1653645] Updated weights for policy 0, policy_version 550017 (0.0119) [2024-06-15 18:36:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 1126563840. Throughput: 0: 10934.1. Samples: 281689088. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:36:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:36:44,643][1653645] Updated weights for policy 0, policy_version 550096 (0.0016) [2024-06-15 18:36:45,959][1648982] Fps is (10 sec: 32776.6, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1126694912. Throughput: 0: 11218.5. Samples: 281761280. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:36:45,960][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 18:36:46,737][1653645] Updated weights for policy 0, policy_version 550180 (0.0012) [2024-06-15 18:36:48,206][1653645] Updated weights for policy 0, policy_version 550244 (0.0013) [2024-06-15 18:36:50,931][1653645] Updated weights for policy 0, policy_version 550288 (0.0013) [2024-06-15 18:36:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44237.0, 300 sec: 44098.0). Total num frames: 1126989824. Throughput: 0: 10991.0. Samples: 281786880. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:36:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:36:55,692][1653645] Updated weights for policy 0, policy_version 550337 (0.0097) [2024-06-15 18:36:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.7, 300 sec: 43431.5). Total num frames: 1127120896. 
Throughput: 0: 11081.9. Samples: 281865728. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:36:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:36:56,217][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000550368_1127153664.pth... [2024-06-15 18:36:56,369][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000545216_1116602368.pth [2024-06-15 18:36:57,280][1653645] Updated weights for policy 0, policy_version 550403 (0.0012) [2024-06-15 18:36:59,025][1653645] Updated weights for policy 0, policy_version 550480 (0.0013) [2024-06-15 18:37:00,958][1648982] Fps is (10 sec: 49150.4, 60 sec: 44236.5, 300 sec: 43986.8). Total num frames: 1127481344. Throughput: 0: 11036.4. Samples: 281920000. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:00,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:37:02,416][1653645] Updated weights for policy 0, policy_version 550544 (0.0014) [2024-06-15 18:37:05,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 43690.8, 300 sec: 43765.0). Total num frames: 1127612416. Throughput: 0: 10979.6. Samples: 281954304. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:37:08,081][1653645] Updated weights for policy 0, policy_version 550624 (0.0022) [2024-06-15 18:37:09,504][1653645] Updated weights for policy 0, policy_version 550674 (0.0012) [2024-06-15 18:37:10,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 45329.0, 300 sec: 43875.8). Total num frames: 1127874560. Throughput: 0: 11366.5. Samples: 282029568. 
Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:10,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:37:11,179][1653645] Updated weights for policy 0, policy_version 550728 (0.0015) [2024-06-15 18:37:14,060][1653645] Updated weights for policy 0, policy_version 550803 (0.0013) [2024-06-15 18:37:15,106][1653645] Updated weights for policy 0, policy_version 550848 (0.0022) [2024-06-15 18:37:15,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 44783.2, 300 sec: 44431.2). Total num frames: 1128136704. Throughput: 0: 11036.5. Samples: 282093056. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:37:20,371][1651596] Signal inference workers to stop experience collection... (28700 times) [2024-06-15 18:37:20,411][1653645] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-06-15 18:37:20,671][1651596] Signal inference workers to resume experience collection... (28700 times) [2024-06-15 18:37:20,672][1653645] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-06-15 18:37:20,674][1653645] Updated weights for policy 0, policy_version 550928 (0.0014) [2024-06-15 18:37:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 46421.3, 300 sec: 43653.7). Total num frames: 1128300544. Throughput: 0: 11219.2. Samples: 282136064. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:37:22,567][1653645] Updated weights for policy 0, policy_version 550982 (0.0013) [2024-06-15 18:37:23,720][1653645] Updated weights for policy 0, policy_version 551035 (0.0012) [2024-06-15 18:37:25,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 1128595456. Throughput: 0: 11195.7. Samples: 282192896. 
Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:37:26,488][1653645] Updated weights for policy 0, policy_version 551096 (0.0013) [2024-06-15 18:37:30,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 44236.6, 300 sec: 43764.7). Total num frames: 1128693760. Throughput: 0: 11309.5. Samples: 282270208. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:37:31,669][1653645] Updated weights for policy 0, policy_version 551158 (0.0082) [2024-06-15 18:37:33,006][1653645] Updated weights for policy 0, policy_version 551203 (0.0013) [2024-06-15 18:37:34,124][1653645] Updated weights for policy 0, policy_version 551248 (0.0013) [2024-06-15 18:37:35,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44785.0, 300 sec: 44098.0). Total num frames: 1129054208. Throughput: 0: 11343.7. Samples: 282297344. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:37:37,105][1653645] Updated weights for policy 0, policy_version 551332 (0.0036) [2024-06-15 18:37:40,966][1648982] Fps is (10 sec: 49115.1, 60 sec: 43685.0, 300 sec: 44320.0). Total num frames: 1129185280. Throughput: 0: 11193.8. Samples: 282369536. Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:40,969][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:37:41,653][1653645] Updated weights for policy 0, policy_version 551376 (0.0010) [2024-06-15 18:37:43,975][1653645] Updated weights for policy 0, policy_version 551431 (0.0012) [2024-06-15 18:37:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1129447424. Throughput: 0: 11446.1. Samples: 282435072. 
Policy #0 lag: (min: 63.0, avg: 205.2, max: 319.0) [2024-06-15 18:37:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:37:46,877][1653645] Updated weights for policy 0, policy_version 551520 (0.0103) [2024-06-15 18:37:48,665][1653645] Updated weights for policy 0, policy_version 551584 (0.0013) [2024-06-15 18:37:50,958][1648982] Fps is (10 sec: 52467.6, 60 sec: 45328.8, 300 sec: 44431.2). Total num frames: 1129709568. Throughput: 0: 11275.3. Samples: 282461696. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:37:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:37:53,391][1653645] Updated weights for policy 0, policy_version 551633 (0.0013) [2024-06-15 18:37:55,217][1653645] Updated weights for policy 0, policy_version 551682 (0.0020) [2024-06-15 18:37:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 46421.4, 300 sec: 44320.1). Total num frames: 1129906176. Throughput: 0: 11298.1. Samples: 282537984. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:37:55,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:37:59,164][1653645] Updated weights for policy 0, policy_version 551776 (0.0015) [2024-06-15 18:38:00,960][1648982] Fps is (10 sec: 45876.5, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 1130168320. Throughput: 0: 11172.9. Samples: 282595840. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:00,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:38:01,166][1653645] Updated weights for policy 0, policy_version 551866 (0.0016) [2024-06-15 18:38:05,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1130332160. Throughput: 0: 11150.2. Samples: 282637824. 
Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:38:06,030][1653645] Updated weights for policy 0, policy_version 551929 (0.0074) [2024-06-15 18:38:07,027][1651596] Signal inference workers to stop experience collection... (28750 times) [2024-06-15 18:38:07,078][1653645] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-06-15 18:38:07,315][1651596] Signal inference workers to resume experience collection... (28750 times) [2024-06-15 18:38:07,316][1653645] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-06-15 18:38:10,417][1653645] Updated weights for policy 0, policy_version 552004 (0.0017) [2024-06-15 18:38:10,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 44782.7, 300 sec: 44097.9). Total num frames: 1130561536. Throughput: 0: 11309.4. Samples: 282701824. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:38:11,986][1653645] Updated weights for policy 0, policy_version 552082 (0.0051) [2024-06-15 18:38:15,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1130758144. Throughput: 0: 11252.7. Samples: 282776576. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:38:16,741][1653645] Updated weights for policy 0, policy_version 552145 (0.0014) [2024-06-15 18:38:18,665][1653645] Updated weights for policy 0, policy_version 552227 (0.0013) [2024-06-15 18:38:20,959][1648982] Fps is (10 sec: 45876.8, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1131020288. Throughput: 0: 11309.5. Samples: 282806272. 
Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:20,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:38:22,329][1653645] Updated weights for policy 0, policy_version 552272 (0.0011) [2024-06-15 18:38:24,530][1653645] Updated weights for policy 0, policy_version 552355 (0.0012) [2024-06-15 18:38:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1131282432. Throughput: 0: 11015.6. Samples: 282865152. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:38:29,032][1653645] Updated weights for policy 0, policy_version 552416 (0.0014) [2024-06-15 18:38:30,856][1653645] Updated weights for policy 0, policy_version 552482 (0.0110) [2024-06-15 18:38:30,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 46421.5, 300 sec: 44097.9). Total num frames: 1131479040. Throughput: 0: 11082.0. Samples: 282933760. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:38:35,088][1653645] Updated weights for policy 0, policy_version 552544 (0.0013) [2024-06-15 18:38:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1131675648. Throughput: 0: 11275.5. Samples: 282969088. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:38:36,972][1653645] Updated weights for policy 0, policy_version 552625 (0.0019) [2024-06-15 18:38:40,956][1653645] Updated weights for policy 0, policy_version 552677 (0.0014) [2024-06-15 18:38:40,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44788.7, 300 sec: 44209.0). Total num frames: 1131872256. Throughput: 0: 11093.4. Samples: 283037184. 
Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:38:42,858][1653645] Updated weights for policy 0, policy_version 552752 (0.0100) [2024-06-15 18:38:45,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1132068864. Throughput: 0: 11218.4. Samples: 283100672. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:45,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:38:48,297][1653645] Updated weights for policy 0, policy_version 552864 (0.0109) [2024-06-15 18:38:50,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43691.0, 300 sec: 44431.2). Total num frames: 1132331008. Throughput: 0: 10899.9. Samples: 283128320. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:38:52,139][1651596] Signal inference workers to stop experience collection... (28800 times) [2024-06-15 18:38:52,215][1653645] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-06-15 18:38:52,221][1653645] Updated weights for policy 0, policy_version 552902 (0.0022) [2024-06-15 18:38:52,426][1651596] Signal inference workers to resume experience collection... (28800 times) [2024-06-15 18:38:52,427][1653645] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-06-15 18:38:55,179][1653645] Updated weights for policy 0, policy_version 552994 (0.0101) [2024-06-15 18:38:55,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 44782.7, 300 sec: 43986.8). Total num frames: 1132593152. Throughput: 0: 11082.0. Samples: 283200512. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:38:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:38:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000553024_1132593152.pth... 
[2024-06-15 18:38:56,016][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000547776_1121845248.pth [2024-06-15 18:38:59,652][1653645] Updated weights for policy 0, policy_version 553104 (0.0014) [2024-06-15 18:39:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 1132822528. Throughput: 0: 10706.5. Samples: 283258368. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:39:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:39:05,524][1653645] Updated weights for policy 0, policy_version 553200 (0.0032) [2024-06-15 18:39:05,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1132986368. Throughput: 0: 10843.0. Samples: 283294208. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:39:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:39:08,044][1653645] Updated weights for policy 0, policy_version 553280 (0.0018) [2024-06-15 18:39:10,958][1648982] Fps is (10 sec: 36043.5, 60 sec: 43690.7, 300 sec: 43875.7). Total num frames: 1133182976. Throughput: 0: 10990.9. Samples: 283359744. Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:39:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:39:11,813][1653645] Updated weights for policy 0, policy_version 553344 (0.0094) [2024-06-15 18:39:13,543][1653645] Updated weights for policy 0, policy_version 553407 (0.0013) [2024-06-15 18:39:15,957][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1133379584. Throughput: 0: 10808.9. Samples: 283420160. 
Policy #0 lag: (min: 47.0, avg: 166.7, max: 303.0) [2024-06-15 18:39:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:39:18,210][1653645] Updated weights for policy 0, policy_version 553465 (0.0014) [2024-06-15 18:39:20,348][1653645] Updated weights for policy 0, policy_version 553536 (0.0022) [2024-06-15 18:39:20,958][1648982] Fps is (10 sec: 45877.4, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1133641728. Throughput: 0: 10888.6. Samples: 283459072. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:39:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:39:23,465][1653645] Updated weights for policy 0, policy_version 553616 (0.0015) [2024-06-15 18:39:24,632][1653645] Updated weights for policy 0, policy_version 553664 (0.0016) [2024-06-15 18:39:25,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1133903872. Throughput: 0: 10695.1. Samples: 283518464. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:39:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:39:30,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1134067712. Throughput: 0: 10934.1. Samples: 283592704. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:39:30,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:39:31,116][1653645] Updated weights for policy 0, policy_version 553748 (0.0131) [2024-06-15 18:39:33,329][1653645] Updated weights for policy 0, policy_version 553796 (0.0013) [2024-06-15 18:39:35,064][1653645] Updated weights for policy 0, policy_version 553872 (0.0142) [2024-06-15 18:39:35,095][1651596] Signal inference workers to stop experience collection... (28850 times) [2024-06-15 18:39:35,197][1653645] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-06-15 18:39:35,363][1651596] Signal inference workers to resume experience collection... 
(28850 times) [2024-06-15 18:39:35,364][1653645] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-06-15 18:39:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 1134395392. Throughput: 0: 11116.1. Samples: 283628544. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:39:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:39:40,530][1653645] Updated weights for policy 0, policy_version 553938 (0.0022) [2024-06-15 18:39:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 1134493696. Throughput: 0: 10979.6. Samples: 283694592. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:39:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:39:43,071][1653645] Updated weights for policy 0, policy_version 553987 (0.0012) [2024-06-15 18:39:45,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 1134690304. Throughput: 0: 11070.6. Samples: 283756544. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:39:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:39:46,336][1653645] Updated weights for policy 0, policy_version 554069 (0.0014) [2024-06-15 18:39:47,901][1653645] Updated weights for policy 0, policy_version 554144 (0.0016) [2024-06-15 18:39:50,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1134952448. Throughput: 0: 10865.8. Samples: 283783168. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:39:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:39:52,664][1653645] Updated weights for policy 0, policy_version 554224 (0.0013) [2024-06-15 18:39:54,993][1653645] Updated weights for policy 0, policy_version 554272 (0.0013) [2024-06-15 18:39:55,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43691.0, 300 sec: 43986.9). Total num frames: 1135214592. 
Throughput: 0: 11195.8. Samples: 283863552. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:39:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:39:57,053][1653645] Updated weights for policy 0, policy_version 554308 (0.0021) [2024-06-15 18:39:58,528][1653645] Updated weights for policy 0, policy_version 554371 (0.0013) [2024-06-15 18:39:59,950][1653645] Updated weights for policy 0, policy_version 554432 (0.0012) [2024-06-15 18:40:00,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 44236.7, 300 sec: 44431.1). Total num frames: 1135476736. Throughput: 0: 11184.3. Samples: 283923456. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:40:05,512][1653645] Updated weights for policy 0, policy_version 554495 (0.0014) [2024-06-15 18:40:05,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1135607808. Throughput: 0: 11241.2. Samples: 283964928. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:40:07,303][1653645] Updated weights for policy 0, policy_version 554558 (0.0012) [2024-06-15 18:40:09,874][1653645] Updated weights for policy 0, policy_version 554608 (0.0013) [2024-06-15 18:40:10,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.4, 300 sec: 44431.2). Total num frames: 1135935488. Throughput: 0: 11275.4. Samples: 284025856. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:40:11,398][1653645] Updated weights for policy 0, policy_version 554688 (0.0013) [2024-06-15 18:40:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 1136033792. Throughput: 0: 11309.5. Samples: 284101632. 
Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:40:17,060][1653645] Updated weights for policy 0, policy_version 554750 (0.0012) [2024-06-15 18:40:18,768][1653645] Updated weights for policy 0, policy_version 554809 (0.0015) [2024-06-15 18:40:20,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 44236.6, 300 sec: 44097.9). Total num frames: 1136295936. Throughput: 0: 11081.9. Samples: 284127232. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:40:21,623][1653645] Updated weights for policy 0, policy_version 554864 (0.0027) [2024-06-15 18:40:22,372][1651596] Signal inference workers to stop experience collection... (28900 times) [2024-06-15 18:40:22,430][1653645] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-06-15 18:40:22,666][1651596] Signal inference workers to resume experience collection... (28900 times) [2024-06-15 18:40:22,666][1653645] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-06-15 18:40:23,222][1653645] Updated weights for policy 0, policy_version 554913 (0.0012) [2024-06-15 18:40:25,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1136525312. Throughput: 0: 10899.9. Samples: 284185088. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:40:29,221][1653645] Updated weights for policy 0, policy_version 555001 (0.0013) [2024-06-15 18:40:30,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1136754688. Throughput: 0: 11104.7. Samples: 284256256. 
Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 18:40:31,125][1653645] Updated weights for policy 0, policy_version 555064 (0.0022) [2024-06-15 18:40:33,522][1653645] Updated weights for policy 0, policy_version 555120 (0.0016) [2024-06-15 18:40:35,483][1653645] Updated weights for policy 0, policy_version 555184 (0.0018) [2024-06-15 18:40:35,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1137049600. Throughput: 0: 11184.4. Samples: 284286464. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:35,974][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:40:39,996][1653645] Updated weights for policy 0, policy_version 555223 (0.0012) [2024-06-15 18:40:40,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44782.8, 300 sec: 44431.1). Total num frames: 1137180672. Throughput: 0: 11104.6. Samples: 284363264. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:40,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:40:41,928][1653645] Updated weights for policy 0, policy_version 555297 (0.0013) [2024-06-15 18:40:42,485][1653645] Updated weights for policy 0, policy_version 555324 (0.0010) [2024-06-15 18:40:44,431][1653645] Updated weights for policy 0, policy_version 555386 (0.0030) [2024-06-15 18:40:45,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 46421.4, 300 sec: 44542.3). Total num frames: 1137475584. Throughput: 0: 11229.9. Samples: 284428800. Policy #0 lag: (min: 37.0, avg: 127.5, max: 293.0) [2024-06-15 18:40:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:40:46,649][1653645] Updated weights for policy 0, policy_version 555440 (0.0013) [2024-06-15 18:40:50,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1137606656. Throughput: 0: 11104.7. Samples: 284464640. 
Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:40:50,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:40:51,749][1653645] Updated weights for policy 0, policy_version 555520 (0.0014) [2024-06-15 18:40:53,584][1653645] Updated weights for policy 0, policy_version 555579 (0.0014) [2024-06-15 18:40:55,958][1648982] Fps is (10 sec: 42596.8, 60 sec: 44782.6, 300 sec: 44320.0). Total num frames: 1137901568. Throughput: 0: 11195.6. Samples: 284529664. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:40:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:40:56,222][1653645] Updated weights for policy 0, policy_version 555632 (0.0012) [2024-06-15 18:40:56,432][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000555648_1137967104.pth... [2024-06-15 18:40:56,506][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000550368_1127153664.pth [2024-06-15 18:40:58,824][1653645] Updated weights for policy 0, policy_version 555698 (0.0012) [2024-06-15 18:41:00,964][1648982] Fps is (10 sec: 49126.9, 60 sec: 43687.0, 300 sec: 44430.4). Total num frames: 1138098176. Throughput: 0: 10944.2. Samples: 284594176. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:00,967][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:41:03,467][1653645] Updated weights for policy 0, policy_version 555744 (0.0151) [2024-06-15 18:41:04,198][1653645] Updated weights for policy 0, policy_version 555776 (0.0011) [2024-06-15 18:41:05,958][1648982] Fps is (10 sec: 39323.1, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 1138294784. Throughput: 0: 11150.3. Samples: 284628992. 
Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:41:06,192][1653645] Updated weights for policy 0, policy_version 555831 (0.0013) [2024-06-15 18:41:07,550][1653645] Updated weights for policy 0, policy_version 555894 (0.0013) [2024-06-15 18:41:10,201][1651596] Signal inference workers to stop experience collection... (28950 times) [2024-06-15 18:41:10,265][1653645] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-06-15 18:41:10,539][1651596] Signal inference workers to resume experience collection... (28950 times) [2024-06-15 18:41:10,542][1653645] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-06-15 18:41:10,715][1653645] Updated weights for policy 0, policy_version 555959 (0.0019) [2024-06-15 18:41:10,973][1648982] Fps is (10 sec: 52377.8, 60 sec: 44771.9, 300 sec: 44651.1). Total num frames: 1138622464. Throughput: 0: 11465.0. Samples: 284701184. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:10,974][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:41:14,411][1653645] Updated weights for policy 0, policy_version 556005 (0.0012) [2024-06-15 18:41:15,000][1653645] Updated weights for policy 0, policy_version 556032 (0.0012) [2024-06-15 18:41:15,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1138753536. Throughput: 0: 11355.0. Samples: 284767232. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:41:17,763][1653645] Updated weights for policy 0, policy_version 556083 (0.0013) [2024-06-15 18:41:20,495][1653645] Updated weights for policy 0, policy_version 556165 (0.0013) [2024-06-15 18:41:20,958][1648982] Fps is (10 sec: 42662.0, 60 sec: 45875.3, 300 sec: 44542.3). Total num frames: 1139048448. Throughput: 0: 11502.9. Samples: 284804096. 
Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:41:25,009][1653645] Updated weights for policy 0, policy_version 556227 (0.0013) [2024-06-15 18:41:25,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44782.7, 300 sec: 44653.3). Total num frames: 1139212288. Throughput: 0: 11320.9. Samples: 284872704. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:41:26,323][1653645] Updated weights for policy 0, policy_version 556283 (0.0011) [2024-06-15 18:41:28,994][1653645] Updated weights for policy 0, policy_version 556336 (0.0011) [2024-06-15 18:41:30,465][1653645] Updated weights for policy 0, policy_version 556409 (0.0090) [2024-06-15 18:41:30,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 46421.3, 300 sec: 44653.7). Total num frames: 1139539968. Throughput: 0: 11252.6. Samples: 284935168. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:41:33,827][1653645] Updated weights for policy 0, policy_version 556477 (0.0013) [2024-06-15 18:41:35,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1139671040. Throughput: 0: 11298.1. Samples: 284973056. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:41:37,263][1653645] Updated weights for policy 0, policy_version 556534 (0.0015) [2024-06-15 18:41:39,553][1653645] Updated weights for policy 0, policy_version 556576 (0.0040) [2024-06-15 18:41:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46421.5, 300 sec: 44986.6). Total num frames: 1139965952. Throughput: 0: 11503.0. Samples: 285047296. 
Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:41:41,421][1653645] Updated weights for policy 0, policy_version 556656 (0.0012) [2024-06-15 18:41:44,234][1653645] Updated weights for policy 0, policy_version 556688 (0.0011) [2024-06-15 18:41:45,439][1653645] Updated weights for policy 0, policy_version 556734 (0.0012) [2024-06-15 18:41:45,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1140195328. Throughput: 0: 11572.5. Samples: 285114880. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:41:48,598][1653645] Updated weights for policy 0, policy_version 556798 (0.0099) [2024-06-15 18:41:50,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 46421.2, 300 sec: 44986.6). Total num frames: 1140391936. Throughput: 0: 11446.0. Samples: 285144064. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:41:51,576][1653645] Updated weights for policy 0, policy_version 556865 (0.0087) [2024-06-15 18:41:52,610][1653645] Updated weights for policy 0, policy_version 556920 (0.0014) [2024-06-15 18:41:55,144][1651596] Signal inference workers to stop experience collection... (29000 times) [2024-06-15 18:41:55,215][1653645] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-06-15 18:41:55,427][1651596] Signal inference workers to resume experience collection... (29000 times) [2024-06-15 18:41:55,428][1653645] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-06-15 18:41:55,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.5, 300 sec: 44653.4). Total num frames: 1140654080. Throughput: 0: 11654.7. Samples: 285225472. 
Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:41:55,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:41:56,594][1653645] Updated weights for policy 0, policy_version 556992 (0.0072) [2024-06-15 18:42:00,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 45879.1, 300 sec: 44875.5). Total num frames: 1140850688. Throughput: 0: 11480.2. Samples: 285283840. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:42:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:42:02,685][1653645] Updated weights for policy 0, policy_version 557088 (0.0015) [2024-06-15 18:42:04,518][1653645] Updated weights for policy 0, policy_version 557168 (0.0085) [2024-06-15 18:42:05,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 1141112832. Throughput: 0: 11400.5. Samples: 285317120. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:42:05,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:42:07,895][1653645] Updated weights for policy 0, policy_version 557201 (0.0014) [2024-06-15 18:42:10,312][1653645] Updated weights for policy 0, policy_version 557269 (0.0017) [2024-06-15 18:42:10,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45340.3, 300 sec: 44764.4). Total num frames: 1141342208. Throughput: 0: 11503.0. Samples: 285390336. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:42:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:42:14,267][1653645] Updated weights for policy 0, policy_version 557344 (0.0015) [2024-06-15 18:42:15,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 46421.4, 300 sec: 44875.5). Total num frames: 1141538816. Throughput: 0: 11434.7. Samples: 285449728. 
Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:42:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:42:16,852][1653645] Updated weights for policy 0, policy_version 557436 (0.0105) [2024-06-15 18:42:20,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 44236.6, 300 sec: 44431.2). Total num frames: 1141702656. Throughput: 0: 11195.7. Samples: 285476864. Policy #0 lag: (min: 15.0, avg: 99.2, max: 271.0) [2024-06-15 18:42:20,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:42:21,369][1653645] Updated weights for policy 0, policy_version 557500 (0.0014) [2024-06-15 18:42:23,806][1653645] Updated weights for policy 0, policy_version 557562 (0.0013) [2024-06-15 18:42:25,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 1141899264. Throughput: 0: 11047.8. Samples: 285544448. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:42:25,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 18:42:27,254][1653645] Updated weights for policy 0, policy_version 557616 (0.0013) [2024-06-15 18:42:28,496][1653645] Updated weights for policy 0, policy_version 557664 (0.0017) [2024-06-15 18:42:30,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1142161408. Throughput: 0: 11081.9. Samples: 285613568. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:42:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:42:32,449][1653645] Updated weights for policy 0, policy_version 557712 (0.0158) [2024-06-15 18:42:33,779][1653645] Updated weights for policy 0, policy_version 557763 (0.0012) [2024-06-15 18:42:35,958][1648982] Fps is (10 sec: 52426.2, 60 sec: 45874.8, 300 sec: 44876.6). Total num frames: 1142423552. Throughput: 0: 11218.4. Samples: 285648896. 
Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:42:35,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:42:37,870][1653645] Updated weights for policy 0, policy_version 557841 (0.0018) [2024-06-15 18:42:39,520][1651596] Signal inference workers to stop experience collection... (29050 times) [2024-06-15 18:42:39,548][1653645] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-06-15 18:42:39,568][1653645] Updated weights for policy 0, policy_version 557906 (0.0012) [2024-06-15 18:42:39,804][1651596] Signal inference workers to resume experience collection... (29050 times) [2024-06-15 18:42:39,805][1653645] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-06-15 18:42:40,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 1142685696. Throughput: 0: 10843.0. Samples: 285713408. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:42:40,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 18:42:44,302][1653645] Updated weights for policy 0, policy_version 557955 (0.0025) [2024-06-15 18:42:45,570][1653645] Updated weights for policy 0, policy_version 558008 (0.0011) [2024-06-15 18:42:45,958][1648982] Fps is (10 sec: 39323.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1142816768. Throughput: 0: 11002.3. Samples: 285778944. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:42:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:42:46,943][1653645] Updated weights for policy 0, policy_version 558075 (0.0016) [2024-06-15 18:42:50,924][1653645] Updated weights for policy 0, policy_version 558131 (0.0012) [2024-06-15 18:42:50,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 1143046144. Throughput: 0: 11127.5. Samples: 285817856. 
Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:42:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:42:55,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 42598.1, 300 sec: 44209.0). Total num frames: 1143209984. Throughput: 0: 10774.7. Samples: 285875200. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:42:55,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:42:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000558208_1143209984.pth... [2024-06-15 18:42:56,008][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000553024_1132593152.pth [2024-06-15 18:42:56,389][1653645] Updated weights for policy 0, policy_version 558211 (0.0013) [2024-06-15 18:42:57,311][1653645] Updated weights for policy 0, policy_version 558259 (0.0013) [2024-06-15 18:42:58,901][1653645] Updated weights for policy 0, policy_version 558336 (0.0079) [2024-06-15 18:43:00,960][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 44542.2). Total num frames: 1143472128. Throughput: 0: 11002.3. Samples: 285944832. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:00,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:43:02,721][1653645] Updated weights for policy 0, policy_version 558388 (0.0013) [2024-06-15 18:43:04,443][1653645] Updated weights for policy 0, policy_version 558464 (0.0071) [2024-06-15 18:43:05,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1143734272. Throughput: 0: 11173.0. Samples: 285979648. 
Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:43:08,379][1653645] Updated weights for policy 0, policy_version 558513 (0.0042) [2024-06-15 18:43:10,323][1653645] Updated weights for policy 0, policy_version 558587 (0.0014) [2024-06-15 18:43:10,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1143996416. Throughput: 0: 11252.6. Samples: 286050816. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:43:14,660][1653645] Updated weights for policy 0, policy_version 558640 (0.0013) [2024-06-15 18:43:15,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1144193024. Throughput: 0: 11025.1. Samples: 286109696. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:43:16,252][1653645] Updated weights for policy 0, policy_version 558718 (0.0015) [2024-06-15 18:43:20,120][1653645] Updated weights for policy 0, policy_version 558768 (0.0049) [2024-06-15 18:43:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 1144389632. Throughput: 0: 11059.3. Samples: 286146560. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:43:21,680][1653645] Updated weights for policy 0, policy_version 558821 (0.0021) [2024-06-15 18:43:25,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1144553472. Throughput: 0: 11138.9. Samples: 286214656. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:25,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:43:26,081][1651596] Signal inference workers to stop experience collection... 
(29100 times) [2024-06-15 18:43:26,136][1653645] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-06-15 18:43:26,175][1653645] Updated weights for policy 0, policy_version 558870 (0.0012) [2024-06-15 18:43:26,302][1651596] Signal inference workers to resume experience collection... (29100 times) [2024-06-15 18:43:26,314][1653645] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-06-15 18:43:27,392][1653645] Updated weights for policy 0, policy_version 558928 (0.0012) [2024-06-15 18:43:28,448][1653645] Updated weights for policy 0, policy_version 558976 (0.0013) [2024-06-15 18:43:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1144815616. Throughput: 0: 11320.9. Samples: 286288384. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:43:32,290][1653645] Updated weights for policy 0, policy_version 559044 (0.0014) [2024-06-15 18:43:33,632][1653645] Updated weights for policy 0, policy_version 559104 (0.0012) [2024-06-15 18:43:35,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43691.1, 300 sec: 44653.3). Total num frames: 1145044992. Throughput: 0: 10968.2. Samples: 286311424. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:43:38,921][1653645] Updated weights for policy 0, policy_version 559184 (0.0081) [2024-06-15 18:43:40,088][1653645] Updated weights for policy 0, policy_version 559232 (0.0013) [2024-06-15 18:43:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1145307136. Throughput: 0: 11286.8. Samples: 286383104. 
Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:43:44,205][1653645] Updated weights for policy 0, policy_version 559312 (0.0011) [2024-06-15 18:43:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1145569280. Throughput: 0: 11104.7. Samples: 286444544. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:43:49,853][1653645] Updated weights for policy 0, policy_version 559378 (0.0013) [2024-06-15 18:43:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 1145700352. Throughput: 0: 11173.0. Samples: 286482432. Policy #0 lag: (min: 15.0, avg: 130.0, max: 271.0) [2024-06-15 18:43:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:43:51,558][1653645] Updated weights for policy 0, policy_version 559461 (0.0013) [2024-06-15 18:43:55,421][1653645] Updated weights for policy 0, policy_version 559536 (0.0091) [2024-06-15 18:43:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.5, 300 sec: 44542.3). Total num frames: 1145962496. Throughput: 0: 11241.2. Samples: 286556672. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:43:55,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:43:57,427][1653645] Updated weights for policy 0, policy_version 559606 (0.0013) [2024-06-15 18:44:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1146093568. Throughput: 0: 11173.0. Samples: 286612480. 
Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:00,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 18:44:02,191][1653645] Updated weights for policy 0, policy_version 559664 (0.0015) [2024-06-15 18:44:03,728][1653645] Updated weights for policy 0, policy_version 559738 (0.0119) [2024-06-15 18:44:05,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1146355712. Throughput: 0: 11036.5. Samples: 286643200. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:44:06,563][1651596] Signal inference workers to stop experience collection... (29150 times) [2024-06-15 18:44:06,615][1653645] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-06-15 18:44:06,832][1651596] Signal inference workers to resume experience collection... (29150 times) [2024-06-15 18:44:06,833][1653645] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-06-15 18:44:06,835][1653645] Updated weights for policy 0, policy_version 559792 (0.0015) [2024-06-15 18:44:08,768][1653645] Updated weights for policy 0, policy_version 559872 (0.0013) [2024-06-15 18:44:10,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1146617856. Throughput: 0: 10956.8. Samples: 286707712. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:44:14,146][1653645] Updated weights for policy 0, policy_version 559936 (0.0014) [2024-06-15 18:44:15,794][1653645] Updated weights for policy 0, policy_version 559995 (0.0021) [2024-06-15 18:44:15,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1146880000. Throughput: 0: 10888.5. Samples: 286778368. 
Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:15,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:44:18,381][1653645] Updated weights for policy 0, policy_version 560048 (0.0012) [2024-06-15 18:44:19,640][1653645] Updated weights for policy 0, policy_version 560096 (0.0103) [2024-06-15 18:44:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1147142144. Throughput: 0: 11229.9. Samples: 286816768. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:44:25,611][1653645] Updated weights for policy 0, policy_version 560176 (0.0107) [2024-06-15 18:44:25,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 44783.1, 300 sec: 44653.4). Total num frames: 1147240448. Throughput: 0: 11138.9. Samples: 286884352. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:25,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:44:27,032][1653645] Updated weights for policy 0, policy_version 560209 (0.0023) [2024-06-15 18:44:29,318][1653645] Updated weights for policy 0, policy_version 560272 (0.0013) [2024-06-15 18:44:30,332][1653645] Updated weights for policy 0, policy_version 560317 (0.0013) [2024-06-15 18:44:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 44653.4). Total num frames: 1147568128. Throughput: 0: 11173.0. Samples: 286947328. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:44:31,802][1653645] Updated weights for policy 0, policy_version 560377 (0.0014) [2024-06-15 18:44:35,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1147666432. Throughput: 0: 11047.9. Samples: 286979584. 
Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:44:36,918][1653645] Updated weights for policy 0, policy_version 560416 (0.0012) [2024-06-15 18:44:38,718][1653645] Updated weights for policy 0, policy_version 560464 (0.0012) [2024-06-15 18:44:40,958][1648982] Fps is (10 sec: 36043.5, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1147928576. Throughput: 0: 10899.8. Samples: 287047168. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:40,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:44:42,242][1653645] Updated weights for policy 0, policy_version 560576 (0.0082) [2024-06-15 18:44:43,482][1653645] Updated weights for policy 0, policy_version 560634 (0.0013) [2024-06-15 18:44:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1148190720. Throughput: 0: 11207.1. Samples: 287116800. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:44:48,955][1653645] Updated weights for policy 0, policy_version 560688 (0.0012) [2024-06-15 18:44:50,680][1651596] Signal inference workers to stop experience collection... (29200 times) [2024-06-15 18:44:50,716][1653645] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-06-15 18:44:50,920][1651596] Signal inference workers to resume experience collection... (29200 times) [2024-06-15 18:44:50,921][1653645] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-06-15 18:44:50,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1148420096. Throughput: 0: 11355.0. Samples: 287154176. 
Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:50,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:44:51,051][1653645] Updated weights for policy 0, policy_version 560761 (0.0013) [2024-06-15 18:44:53,284][1653645] Updated weights for policy 0, policy_version 560816 (0.0010) [2024-06-15 18:44:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 1148715008. Throughput: 0: 11309.5. Samples: 287216640. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:44:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:44:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000560896_1148715008.pth... [2024-06-15 18:44:56,088][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000555648_1137967104.pth [2024-06-15 18:44:59,388][1653645] Updated weights for policy 0, policy_version 560897 (0.0139) [2024-06-15 18:45:00,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1148846080. Throughput: 0: 11241.3. Samples: 287284224. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:45:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:45:01,925][1653645] Updated weights for policy 0, policy_version 560963 (0.0014) [2024-06-15 18:45:04,763][1653645] Updated weights for policy 0, policy_version 561044 (0.0015) [2024-06-15 18:45:05,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 46421.2, 300 sec: 44764.4). Total num frames: 1149140992. Throughput: 0: 11013.7. Samples: 287312384. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:45:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:45:06,311][1653645] Updated weights for policy 0, policy_version 561136 (0.0014) [2024-06-15 18:45:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1149239296. Throughput: 0: 11184.3. 
Samples: 287387648. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:45:10,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:45:12,195][1653645] Updated weights for policy 0, policy_version 561208 (0.0013) [2024-06-15 18:45:14,365][1653645] Updated weights for policy 0, policy_version 561252 (0.0028) [2024-06-15 18:45:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1149534208. Throughput: 0: 11332.3. Samples: 287457280. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:45:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:45:16,017][1653645] Updated weights for policy 0, policy_version 561299 (0.0014) [2024-06-15 18:45:17,388][1653645] Updated weights for policy 0, policy_version 561376 (0.0034) [2024-06-15 18:45:20,961][1648982] Fps is (10 sec: 52411.6, 60 sec: 43688.3, 300 sec: 44875.0). Total num frames: 1149763584. Throughput: 0: 11308.7. Samples: 287488512. Policy #0 lag: (min: 15.0, avg: 120.3, max: 271.0) [2024-06-15 18:45:20,962][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:45:23,506][1653645] Updated weights for policy 0, policy_version 561456 (0.0014) [2024-06-15 18:45:25,930][1653645] Updated weights for policy 0, policy_version 561493 (0.0013) [2024-06-15 18:45:25,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 44782.6, 300 sec: 44653.3). Total num frames: 1149927424. Throughput: 0: 11366.4. Samples: 287558656. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:45:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:45:27,693][1653645] Updated weights for policy 0, policy_version 561571 (0.0115) [2024-06-15 18:45:29,920][1653645] Updated weights for policy 0, policy_version 561632 (0.0013) [2024-06-15 18:45:30,958][1648982] Fps is (10 sec: 52446.2, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1150287872. Throughput: 0: 11070.6. Samples: 287614976. 
Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:45:30,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:45:34,498][1653645] Updated weights for policy 0, policy_version 561680 (0.0014) [2024-06-15 18:45:35,959][1648982] Fps is (10 sec: 49152.0, 60 sec: 45875.0, 300 sec: 44875.5). Total num frames: 1150418944. Throughput: 0: 11138.8. Samples: 287655424. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:45:35,963][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:45:37,215][1651596] Signal inference workers to stop experience collection... (29250 times) [2024-06-15 18:45:37,254][1653645] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-06-15 18:45:37,439][1651596] Signal inference workers to resume experience collection... (29250 times) [2024-06-15 18:45:37,440][1653645] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-06-15 18:45:37,882][1653645] Updated weights for policy 0, policy_version 561762 (0.0145) [2024-06-15 18:45:39,263][1653645] Updated weights for policy 0, policy_version 561824 (0.0025) [2024-06-15 18:45:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 45875.4, 300 sec: 44764.4). Total num frames: 1150681088. Throughput: 0: 11229.9. Samples: 287721984. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:45:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:45:42,322][1653645] Updated weights for policy 0, policy_version 561888 (0.0013) [2024-06-15 18:45:45,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1150844928. Throughput: 0: 11252.6. Samples: 287790592. 
Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:45:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:45:46,429][1653645] Updated weights for policy 0, policy_version 561968 (0.0014) [2024-06-15 18:45:50,231][1653645] Updated weights for policy 0, policy_version 562032 (0.0016) [2024-06-15 18:45:50,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 44236.6, 300 sec: 44653.3). Total num frames: 1151074304. Throughput: 0: 11389.0. Samples: 287824896. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:45:50,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:45:52,190][1653645] Updated weights for policy 0, policy_version 562101 (0.0012) [2024-06-15 18:45:54,577][1653645] Updated weights for policy 0, policy_version 562151 (0.0014) [2024-06-15 18:45:55,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.9, 300 sec: 44876.3). Total num frames: 1151336448. Throughput: 0: 11013.7. Samples: 287883264. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:45:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:45:58,120][1653645] Updated weights for policy 0, policy_version 562228 (0.0013) [2024-06-15 18:46:00,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 1151467520. Throughput: 0: 11093.3. Samples: 287956480. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:46:02,196][1653645] Updated weights for policy 0, policy_version 562288 (0.0048) [2024-06-15 18:46:03,306][1653645] Updated weights for policy 0, policy_version 562321 (0.0011) [2024-06-15 18:46:04,275][1653645] Updated weights for policy 0, policy_version 562364 (0.0014) [2024-06-15 18:46:05,962][1648982] Fps is (10 sec: 49151.3, 60 sec: 44782.9, 300 sec: 44766.7). Total num frames: 1151827968. Throughput: 0: 11082.8. Samples: 287987200. 
Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:05,967][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:46:05,965][1653645] Updated weights for policy 0, policy_version 562425 (0.0059) [2024-06-15 18:46:09,870][1653645] Updated weights for policy 0, policy_version 562480 (0.0013) [2024-06-15 18:46:10,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1151991808. Throughput: 0: 11082.0. Samples: 288057344. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:46:14,172][1653645] Updated weights for policy 0, policy_version 562535 (0.0015) [2024-06-15 18:46:15,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 44236.6, 300 sec: 44542.2). Total num frames: 1152188416. Throughput: 0: 11252.6. Samples: 288121344. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:46:16,054][1653645] Updated weights for policy 0, policy_version 562608 (0.0017) [2024-06-15 18:46:17,028][1653645] Updated weights for policy 0, policy_version 562640 (0.0012) [2024-06-15 18:46:20,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43693.0, 300 sec: 44653.4). Total num frames: 1152385024. Throughput: 0: 10979.6. Samples: 288149504. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:46:21,296][1653645] Updated weights for policy 0, policy_version 562704 (0.0078) [2024-06-15 18:46:21,745][1651596] Signal inference workers to stop experience collection... (29300 times) [2024-06-15 18:46:21,825][1653645] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-06-15 18:46:22,011][1651596] Signal inference workers to resume experience collection... 
(29300 times) [2024-06-15 18:46:22,011][1653645] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-06-15 18:46:24,789][1653645] Updated weights for policy 0, policy_version 562753 (0.0013) [2024-06-15 18:46:25,959][1648982] Fps is (10 sec: 39322.4, 60 sec: 44237.0, 300 sec: 44209.0). Total num frames: 1152581632. Throughput: 0: 11218.5. Samples: 288226816. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:25,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:46:27,214][1653645] Updated weights for policy 0, policy_version 562833 (0.0016) [2024-06-15 18:46:29,192][1653645] Updated weights for policy 0, policy_version 562882 (0.0023) [2024-06-15 18:46:30,251][1653645] Updated weights for policy 0, policy_version 562940 (0.0041) [2024-06-15 18:46:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1152909312. Throughput: 0: 10956.8. Samples: 288283648. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:46:34,005][1653645] Updated weights for policy 0, policy_version 563002 (0.0013) [2024-06-15 18:46:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 1153040384. Throughput: 0: 10968.3. Samples: 288318464. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:46:38,364][1653645] Updated weights for policy 0, policy_version 563060 (0.0012) [2024-06-15 18:46:39,808][1653645] Updated weights for policy 0, policy_version 563133 (0.0011) [2024-06-15 18:46:40,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44236.7, 300 sec: 44542.2). Total num frames: 1153335296. Throughput: 0: 11070.5. Samples: 288381440. 
Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:40,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:46:41,857][1653645] Updated weights for policy 0, policy_version 563195 (0.0014) [2024-06-15 18:46:45,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 1153531904. Throughput: 0: 10968.2. Samples: 288450048. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:46:46,229][1653645] Updated weights for policy 0, policy_version 563263 (0.0014) [2024-06-15 18:46:50,912][1653645] Updated weights for policy 0, policy_version 563332 (0.0015) [2024-06-15 18:46:50,958][1648982] Fps is (10 sec: 36045.8, 60 sec: 43691.0, 300 sec: 44209.0). Total num frames: 1153695744. Throughput: 0: 11025.1. Samples: 288483328. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:46:52,657][1653645] Updated weights for policy 0, policy_version 563399 (0.0012) [2024-06-15 18:46:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1153957888. Throughput: 0: 10854.4. Samples: 288545792. Policy #0 lag: (min: 15.0, avg: 113.2, max: 271.0) [2024-06-15 18:46:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:46:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000563456_1153957888.pth... [2024-06-15 18:46:56,030][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000558208_1143209984.pth [2024-06-15 18:46:57,547][1653645] Updated weights for policy 0, policy_version 563488 (0.0021) [2024-06-15 18:47:00,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1154088960. Throughput: 0: 10934.1. Samples: 288613376. 
Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:00,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:47:01,268][1653645] Updated weights for policy 0, policy_version 563524 (0.0013) [2024-06-15 18:47:02,947][1653645] Updated weights for policy 0, policy_version 563602 (0.0014) [2024-06-15 18:47:04,256][1653645] Updated weights for policy 0, policy_version 563650 (0.0013) [2024-06-15 18:47:05,253][1653645] Updated weights for policy 0, policy_version 563703 (0.0013) [2024-06-15 18:47:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 1154482176. Throughput: 0: 11059.2. Samples: 288647168. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:47:08,007][1651596] Signal inference workers to stop experience collection... (29350 times) [2024-06-15 18:47:08,041][1653645] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-06-15 18:47:08,263][1651596] Signal inference workers to resume experience collection... (29350 times) [2024-06-15 18:47:08,263][1653645] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-06-15 18:47:08,347][1653645] Updated weights for policy 0, policy_version 563744 (0.0095) [2024-06-15 18:47:10,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1154613248. Throughput: 0: 10968.2. Samples: 288720384. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:10,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 18:47:12,294][1653645] Updated weights for policy 0, policy_version 563780 (0.0012) [2024-06-15 18:47:13,612][1653645] Updated weights for policy 0, policy_version 563840 (0.0060) [2024-06-15 18:47:14,863][1653645] Updated weights for policy 0, policy_version 563891 (0.0013) [2024-06-15 18:47:15,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 45875.1, 300 sec: 44875.5). 
Total num frames: 1154940928. Throughput: 0: 11320.8. Samples: 288793088. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:47:15,983][1653645] Updated weights for policy 0, policy_version 563936 (0.0102) [2024-06-15 18:47:19,663][1653645] Updated weights for policy 0, policy_version 563987 (0.0012) [2024-06-15 18:47:20,856][1653645] Updated weights for policy 0, policy_version 564032 (0.0012) [2024-06-15 18:47:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1155137536. Throughput: 0: 11298.2. Samples: 288826880. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:47:24,916][1653645] Updated weights for policy 0, policy_version 564096 (0.0111) [2024-06-15 18:47:25,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1155366912. Throughput: 0: 11355.1. Samples: 288892416. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:47:27,676][1653645] Updated weights for policy 0, policy_version 564176 (0.0013) [2024-06-15 18:47:30,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 44431.3). Total num frames: 1155530752. Throughput: 0: 11286.7. Samples: 288957952. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:47:31,796][1653645] Updated weights for policy 0, policy_version 564256 (0.0012) [2024-06-15 18:47:35,690][1653645] Updated weights for policy 0, policy_version 564320 (0.0015) [2024-06-15 18:47:35,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 44782.7, 300 sec: 44209.0). Total num frames: 1155727360. Throughput: 0: 11275.3. Samples: 288990720. 
Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:35,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:47:37,315][1653645] Updated weights for policy 0, policy_version 564371 (0.0013) [2024-06-15 18:47:39,795][1653645] Updated weights for policy 0, policy_version 564436 (0.0013) [2024-06-15 18:47:40,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45329.3, 300 sec: 44875.5). Total num frames: 1156055040. Throughput: 0: 11377.8. Samples: 289057792. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:47:43,240][1653645] Updated weights for policy 0, policy_version 564483 (0.0034) [2024-06-15 18:47:45,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1156186112. Throughput: 0: 11377.8. Samples: 289125376. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:47:47,011][1653645] Updated weights for policy 0, policy_version 564549 (0.0014) [2024-06-15 18:47:48,557][1653645] Updated weights for policy 0, policy_version 564611 (0.0022) [2024-06-15 18:47:50,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 45875.3, 300 sec: 44875.6). Total num frames: 1156448256. Throughput: 0: 11366.4. Samples: 289158656. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:47:51,278][1653645] Updated weights for policy 0, policy_version 564688 (0.0136) [2024-06-15 18:47:55,311][1651596] Signal inference workers to stop experience collection... (29400 times) [2024-06-15 18:47:55,364][1653645] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-06-15 18:47:55,522][1651596] Signal inference workers to resume experience collection... 
(29400 times) [2024-06-15 18:47:55,522][1653645] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-06-15 18:47:55,524][1653645] Updated weights for policy 0, policy_version 564768 (0.0016) [2024-06-15 18:47:55,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1156677632. Throughput: 0: 11184.4. Samples: 289223680. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:47:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:47:59,747][1653645] Updated weights for policy 0, policy_version 564832 (0.0012) [2024-06-15 18:48:00,958][1648982] Fps is (10 sec: 42596.9, 60 sec: 46421.3, 300 sec: 44542.2). Total num frames: 1156874240. Throughput: 0: 10991.0. Samples: 289287680. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:48:00,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:48:01,159][1653645] Updated weights for policy 0, policy_version 564897 (0.0013) [2024-06-15 18:48:03,080][1653645] Updated weights for policy 0, policy_version 564944 (0.0013) [2024-06-15 18:48:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1157103616. Throughput: 0: 11093.3. Samples: 289326080. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:48:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:48:06,994][1653645] Updated weights for policy 0, policy_version 565011 (0.0013) [2024-06-15 18:48:10,875][1653645] Updated weights for policy 0, policy_version 565088 (0.0013) [2024-06-15 18:48:10,960][1648982] Fps is (10 sec: 42599.2, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1157300224. Throughput: 0: 11082.0. Samples: 289391104. 
Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:48:10,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:48:12,434][1653645] Updated weights for policy 0, policy_version 565136 (0.0021) [2024-06-15 18:48:13,711][1653645] Updated weights for policy 0, policy_version 565181 (0.0012) [2024-06-15 18:48:15,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44237.0, 300 sec: 44764.4). Total num frames: 1157595136. Throughput: 0: 11241.2. Samples: 289463808. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:48:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:48:16,059][1653645] Updated weights for policy 0, policy_version 565248 (0.0014) [2024-06-15 18:48:20,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1157758976. Throughput: 0: 11184.4. Samples: 289494016. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:48:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:48:21,852][1653645] Updated weights for policy 0, policy_version 565330 (0.0015) [2024-06-15 18:48:24,011][1653645] Updated weights for policy 0, policy_version 565392 (0.0021) [2024-06-15 18:48:25,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1158021120. Throughput: 0: 11207.1. Samples: 289562112. Policy #0 lag: (min: 11.0, avg: 119.4, max: 267.0) [2024-06-15 18:48:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:48:26,959][1653645] Updated weights for policy 0, policy_version 565442 (0.0019) [2024-06-15 18:48:27,822][1653645] Updated weights for policy 0, policy_version 565497 (0.0011) [2024-06-15 18:48:29,001][1653645] Updated weights for policy 0, policy_version 565555 (0.0013) [2024-06-15 18:48:30,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1158283264. Throughput: 0: 11366.4. Samples: 289636864. 
Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:48:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:48:32,914][1653645] Updated weights for policy 0, policy_version 565601 (0.0102) [2024-06-15 18:48:33,431][1653645] Updated weights for policy 0, policy_version 565631 (0.0013) [2024-06-15 18:48:35,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45875.4, 300 sec: 44653.4). Total num frames: 1158479872. Throughput: 0: 11377.7. Samples: 289670656. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:48:35,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:48:36,429][1653645] Updated weights for policy 0, policy_version 565692 (0.0017) [2024-06-15 18:48:39,231][1653645] Updated weights for policy 0, policy_version 565759 (0.0014) [2024-06-15 18:48:40,424][1651596] Signal inference workers to stop experience collection... (29450 times) [2024-06-15 18:48:40,470][1653645] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-06-15 18:48:40,734][1651596] Signal inference workers to resume experience collection... (29450 times) [2024-06-15 18:48:40,735][1653645] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-06-15 18:48:40,945][1653645] Updated weights for policy 0, policy_version 565817 (0.0013) [2024-06-15 18:48:40,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 45328.9, 300 sec: 44764.4). Total num frames: 1158774784. Throughput: 0: 11491.5. Samples: 289740800. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:48:40,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 18:48:44,637][1653645] Updated weights for policy 0, policy_version 565875 (0.0016) [2024-06-15 18:48:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 1158938624. Throughput: 0: 11548.5. Samples: 289807360. 
Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:48:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:48:46,741][1653645] Updated weights for policy 0, policy_version 565920 (0.0011) [2024-06-15 18:48:50,159][1653645] Updated weights for policy 0, policy_version 565984 (0.0025) [2024-06-15 18:48:50,959][1648982] Fps is (10 sec: 42592.5, 60 sec: 45873.9, 300 sec: 44875.3). Total num frames: 1159200768. Throughput: 0: 11445.6. Samples: 289841152. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:48:50,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 18:48:52,065][1653645] Updated weights for policy 0, policy_version 566050 (0.0015) [2024-06-15 18:48:55,496][1653645] Updated weights for policy 0, policy_version 566112 (0.0115) [2024-06-15 18:48:55,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1159430144. Throughput: 0: 11628.1. Samples: 289914368. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:48:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:48:56,173][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000566144_1159462912.pth... [2024-06-15 18:48:56,217][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000560896_1148715008.pth [2024-06-15 18:48:57,370][1653645] Updated weights for policy 0, policy_version 566146 (0.0012) [2024-06-15 18:48:58,536][1653645] Updated weights for policy 0, policy_version 566208 (0.0013) [2024-06-15 18:49:00,958][1648982] Fps is (10 sec: 39328.3, 60 sec: 45329.3, 300 sec: 44875.5). Total num frames: 1159593984. Throughput: 0: 11548.5. Samples: 289983488. 
Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:00,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:49:04,153][1653645] Updated weights for policy 0, policy_version 566320 (0.0016) [2024-06-15 18:49:05,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1159856128. Throughput: 0: 11525.7. Samples: 290012672. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:49:06,657][1653645] Updated weights for policy 0, policy_version 566340 (0.0012) [2024-06-15 18:49:09,193][1653645] Updated weights for policy 0, policy_version 566416 (0.0012) [2024-06-15 18:49:10,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 46967.5, 300 sec: 44875.5). Total num frames: 1160118272. Throughput: 0: 11525.7. Samples: 290080768. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:49:15,904][1653645] Updated weights for policy 0, policy_version 566544 (0.0018) [2024-06-15 18:49:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 1160282112. Throughput: 0: 11138.8. Samples: 290138112. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:49:16,873][1653645] Updated weights for policy 0, policy_version 566591 (0.0112) [2024-06-15 18:49:20,942][1653645] Updated weights for policy 0, policy_version 566657 (0.0012) [2024-06-15 18:49:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1160511488. Throughput: 0: 11184.4. Samples: 290173952. 
Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:20,958][1648982] Avg episode reward: [(0, '37.090')] [2024-06-15 18:49:21,929][1653645] Updated weights for policy 0, policy_version 566713 (0.0015) [2024-06-15 18:49:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1160642560. Throughput: 0: 11207.2. Samples: 290245120. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:49:27,164][1653645] Updated weights for policy 0, policy_version 566768 (0.0041) [2024-06-15 18:49:28,528][1651596] Signal inference workers to stop experience collection... (29500 times) [2024-06-15 18:49:28,573][1653645] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-06-15 18:49:28,764][1651596] Signal inference workers to resume experience collection... (29500 times) [2024-06-15 18:49:28,765][1653645] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-06-15 18:49:28,916][1653645] Updated weights for policy 0, policy_version 566837 (0.0021) [2024-06-15 18:49:30,569][1653645] Updated weights for policy 0, policy_version 566868 (0.0011) [2024-06-15 18:49:30,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 1160970240. Throughput: 0: 11207.1. Samples: 290311680. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:49:32,115][1653645] Updated weights for policy 0, policy_version 566920 (0.0013) [2024-06-15 18:49:35,974][1648982] Fps is (10 sec: 52341.8, 60 sec: 44770.5, 300 sec: 44873.0). Total num frames: 1161166848. Throughput: 0: 11066.9. Samples: 290339328. 
Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:35,975][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:49:38,155][1653645] Updated weights for policy 0, policy_version 567008 (0.0022) [2024-06-15 18:49:39,842][1653645] Updated weights for policy 0, policy_version 567072 (0.0032) [2024-06-15 18:49:40,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 1161428992. Throughput: 0: 11036.4. Samples: 290411008. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:40,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:49:41,346][1653645] Updated weights for policy 0, policy_version 567105 (0.0011) [2024-06-15 18:49:42,526][1653645] Updated weights for policy 0, policy_version 567163 (0.0011) [2024-06-15 18:49:45,083][1653645] Updated weights for policy 0, policy_version 567222 (0.0013) [2024-06-15 18:49:45,959][1648982] Fps is (10 sec: 52509.4, 60 sec: 45874.2, 300 sec: 44986.4). Total num frames: 1161691136. Throughput: 0: 11002.0. Samples: 290478592. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:45,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:49:50,035][1653645] Updated weights for policy 0, policy_version 567280 (0.0046) [2024-06-15 18:49:50,970][1648982] Fps is (10 sec: 42545.0, 60 sec: 44228.7, 300 sec: 44540.4). Total num frames: 1161854976. Throughput: 0: 11238.1. Samples: 290518528. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:50,971][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:49:51,621][1653645] Updated weights for policy 0, policy_version 567355 (0.0011) [2024-06-15 18:49:53,963][1653645] Updated weights for policy 0, policy_version 567418 (0.0016) [2024-06-15 18:49:55,958][1648982] Fps is (10 sec: 39327.1, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 1162084352. Throughput: 0: 11070.6. Samples: 290578944. 
Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:49:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:49:57,479][1653645] Updated weights for policy 0, policy_version 567478 (0.0014) [2024-06-15 18:50:00,958][1648982] Fps is (10 sec: 36090.7, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1162215424. Throughput: 0: 11366.4. Samples: 290649600. Policy #0 lag: (min: 28.0, avg: 146.0, max: 271.0) [2024-06-15 18:50:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:50:01,790][1653645] Updated weights for policy 0, policy_version 567525 (0.0013) [2024-06-15 18:50:02,796][1653645] Updated weights for policy 0, policy_version 567573 (0.0021) [2024-06-15 18:50:04,846][1653645] Updated weights for policy 0, policy_version 567648 (0.0013) [2024-06-15 18:50:05,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 1162608640. Throughput: 0: 11309.5. Samples: 290682880. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:50:08,316][1653645] Updated weights for policy 0, policy_version 567696 (0.0016) [2024-06-15 18:50:10,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1162739712. Throughput: 0: 11082.0. Samples: 290743808. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:50:14,632][1653645] Updated weights for policy 0, policy_version 567808 (0.0013) [2024-06-15 18:50:15,123][1651596] Signal inference workers to stop experience collection... (29550 times) [2024-06-15 18:50:15,182][1653645] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-06-15 18:50:15,362][1651596] Signal inference workers to resume experience collection... 
(29550 times) [2024-06-15 18:50:15,364][1653645] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-06-15 18:50:15,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44782.9, 300 sec: 44764.9). Total num frames: 1162969088. Throughput: 0: 11013.7. Samples: 290807296. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:50:16,970][1653645] Updated weights for policy 0, policy_version 567904 (0.0012) [2024-06-15 18:50:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44764.5). Total num frames: 1163132928. Throughput: 0: 11006.4. Samples: 290834432. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:50:21,376][1653645] Updated weights for policy 0, policy_version 567959 (0.0025) [2024-06-15 18:50:25,818][1653645] Updated weights for policy 0, policy_version 568001 (0.0014) [2024-06-15 18:50:25,962][1648982] Fps is (10 sec: 29477.4, 60 sec: 43687.3, 300 sec: 43986.2). Total num frames: 1163264000. Throughput: 0: 11069.5. Samples: 290909184. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:25,963][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:50:27,438][1653645] Updated weights for policy 0, policy_version 568068 (0.0137) [2024-06-15 18:50:28,447][1653645] Updated weights for policy 0, policy_version 568117 (0.0012) [2024-06-15 18:50:30,026][1653645] Updated weights for policy 0, policy_version 568191 (0.0012) [2024-06-15 18:50:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1163657216. Throughput: 0: 11002.6. Samples: 290973696. 
Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:50:32,903][1653645] Updated weights for policy 0, policy_version 568243 (0.0013) [2024-06-15 18:50:35,958][1648982] Fps is (10 sec: 52452.4, 60 sec: 43702.7, 300 sec: 44431.2). Total num frames: 1163788288. Throughput: 0: 10925.7. Samples: 291010048. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:35,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:50:37,375][1653645] Updated weights for policy 0, policy_version 568288 (0.0012) [2024-06-15 18:50:39,568][1653645] Updated weights for policy 0, policy_version 568384 (0.0126) [2024-06-15 18:50:40,825][1653645] Updated weights for policy 0, policy_version 568439 (0.0013) [2024-06-15 18:50:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.1, 300 sec: 45097.6). Total num frames: 1164148736. Throughput: 0: 11059.2. Samples: 291076608. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:50:43,823][1653645] Updated weights for policy 0, policy_version 568480 (0.0070) [2024-06-15 18:50:45,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43691.4, 300 sec: 44875.5). Total num frames: 1164312576. Throughput: 0: 11229.8. Samples: 291154944. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:45,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:50:47,485][1653645] Updated weights for policy 0, policy_version 568513 (0.0014) [2024-06-15 18:50:49,646][1653645] Updated weights for policy 0, policy_version 568592 (0.0087) [2024-06-15 18:50:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 45338.6, 300 sec: 44875.5). Total num frames: 1164574720. Throughput: 0: 11275.4. Samples: 291190272. 
Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 18:50:51,611][1653645] Updated weights for policy 0, policy_version 568662 (0.0013) [2024-06-15 18:50:55,030][1653645] Updated weights for policy 0, policy_version 568706 (0.0011) [2024-06-15 18:50:55,960][1648982] Fps is (10 sec: 45866.9, 60 sec: 44781.3, 300 sec: 45097.3). Total num frames: 1164771328. Throughput: 0: 11468.3. Samples: 291259904. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:50:55,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:50:56,458][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000568768_1164836864.pth... [2024-06-15 18:50:56,521][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000563456_1153957888.pth [2024-06-15 18:50:56,525][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000568768_1164836864.pth [2024-06-15 18:50:58,945][1653645] Updated weights for policy 0, policy_version 568771 (0.0014) [2024-06-15 18:50:59,193][1651596] Signal inference workers to stop experience collection... (29600 times) [2024-06-15 18:50:59,264][1653645] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-06-15 18:50:59,433][1651596] Signal inference workers to resume experience collection... (29600 times) [2024-06-15 18:50:59,434][1653645] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-06-15 18:50:59,995][1653645] Updated weights for policy 0, policy_version 568826 (0.0013) [2024-06-15 18:51:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.1, 300 sec: 44542.3). Total num frames: 1164967936. Throughput: 0: 11502.9. Samples: 291324928. 
Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:51:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:51:02,442][1653645] Updated weights for policy 0, policy_version 568896 (0.0014) [2024-06-15 18:51:05,958][1648982] Fps is (10 sec: 45884.5, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1165230080. Throughput: 0: 11389.1. Samples: 291346944. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:51:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:51:07,524][1653645] Updated weights for policy 0, policy_version 568966 (0.0015) [2024-06-15 18:51:08,967][1653645] Updated weights for policy 0, policy_version 569021 (0.0111) [2024-06-15 18:51:10,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 1165426688. Throughput: 0: 11435.8. Samples: 291423744. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:51:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:51:11,116][1653645] Updated weights for policy 0, policy_version 569057 (0.0012) [2024-06-15 18:51:13,470][1653645] Updated weights for policy 0, policy_version 569120 (0.0020) [2024-06-15 18:51:15,025][1653645] Updated weights for policy 0, policy_version 569186 (0.0036) [2024-06-15 18:51:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1165754368. Throughput: 0: 11320.9. Samples: 291483136. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:51:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:51:20,482][1653645] Updated weights for policy 0, policy_version 569279 (0.0013) [2024-06-15 18:51:20,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1165885440. Throughput: 0: 11423.3. Samples: 291524096. 
Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:51:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:51:23,526][1653645] Updated weights for policy 0, policy_version 569342 (0.0017) [2024-06-15 18:51:25,148][1653645] Updated weights for policy 0, policy_version 569382 (0.0013) [2024-06-15 18:51:25,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 48609.4, 300 sec: 44986.5). Total num frames: 1166180352. Throughput: 0: 11400.5. Samples: 291589632. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:51:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:51:26,522][1653645] Updated weights for policy 0, policy_version 569460 (0.0064) [2024-06-15 18:51:30,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1166278656. Throughput: 0: 11286.8. Samples: 291662848. Policy #0 lag: (min: 15.0, avg: 97.2, max: 271.0) [2024-06-15 18:51:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:51:32,066][1653645] Updated weights for policy 0, policy_version 569520 (0.0020) [2024-06-15 18:51:35,149][1653645] Updated weights for policy 0, policy_version 569599 (0.0075) [2024-06-15 18:51:35,958][1648982] Fps is (10 sec: 39323.3, 60 sec: 46421.6, 300 sec: 44875.6). Total num frames: 1166573568. Throughput: 0: 11173.0. Samples: 291693056. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:51:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:51:36,466][1653645] Updated weights for policy 0, policy_version 569648 (0.0013) [2024-06-15 18:51:38,114][1653645] Updated weights for policy 0, policy_version 569718 (0.0141) [2024-06-15 18:51:40,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 44236.9, 300 sec: 44986.6). Total num frames: 1166802944. Throughput: 0: 10991.5. Samples: 291754496. 
Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:51:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:51:43,593][1653645] Updated weights for policy 0, policy_version 569767 (0.0013) [2024-06-15 18:51:45,459][1651596] Signal inference workers to stop experience collection... (29650 times) [2024-06-15 18:51:45,494][1653645] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-06-15 18:51:45,676][1651596] Signal inference workers to resume experience collection... (29650 times) [2024-06-15 18:51:45,677][1653645] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-06-15 18:51:45,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44237.0, 300 sec: 44986.6). Total num frames: 1166966784. Throughput: 0: 11195.7. Samples: 291828736. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:51:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 18:51:46,184][1653645] Updated weights for policy 0, policy_version 569827 (0.0126) [2024-06-15 18:51:47,664][1653645] Updated weights for policy 0, policy_version 569888 (0.0129) [2024-06-15 18:51:48,148][1653645] Updated weights for policy 0, policy_version 569916 (0.0010) [2024-06-15 18:51:50,189][1653645] Updated weights for policy 0, policy_version 569977 (0.0014) [2024-06-15 18:51:50,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1167327232. Throughput: 0: 11411.9. Samples: 291860480. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:51:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:51:55,489][1653645] Updated weights for policy 0, policy_version 570047 (0.0017) [2024-06-15 18:51:55,974][1648982] Fps is (10 sec: 49108.0, 60 sec: 44777.8, 300 sec: 45318.5). Total num frames: 1167458304. Throughput: 0: 11330.0. Samples: 291933696. 
Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:51:55,977][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:51:58,031][1653645] Updated weights for policy 0, policy_version 570111 (0.0015) [2024-06-15 18:52:00,058][1653645] Updated weights for policy 0, policy_version 570173 (0.0013) [2024-06-15 18:52:00,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 1167720448. Throughput: 0: 11264.0. Samples: 291990016. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:52:02,225][1653645] Updated weights for policy 0, policy_version 570232 (0.0012) [2024-06-15 18:52:05,958][1648982] Fps is (10 sec: 39356.5, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1167851520. Throughput: 0: 11184.3. Samples: 292027392. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:52:07,349][1653645] Updated weights for policy 0, policy_version 570293 (0.0012) [2024-06-15 18:52:10,295][1653645] Updated weights for policy 0, policy_version 570364 (0.0053) [2024-06-15 18:52:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 1168113664. Throughput: 0: 11207.1. Samples: 292093952. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:52:12,278][1653645] Updated weights for policy 0, policy_version 570425 (0.0029) [2024-06-15 18:52:13,807][1653645] Updated weights for policy 0, policy_version 570451 (0.0034) [2024-06-15 18:52:15,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1168375808. Throughput: 0: 11002.3. Samples: 292157952. 
Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 18:52:18,947][1653645] Updated weights for policy 0, policy_version 570528 (0.0120) [2024-06-15 18:52:20,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1168506880. Throughput: 0: 11127.4. Samples: 292193792. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:20,960][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:52:22,517][1653645] Updated weights for policy 0, policy_version 570608 (0.0013) [2024-06-15 18:52:24,466][1653645] Updated weights for policy 0, policy_version 570685 (0.0017) [2024-06-15 18:52:25,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43144.8, 300 sec: 44875.5). Total num frames: 1168769024. Throughput: 0: 10843.0. Samples: 292242432. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:52:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1168900096. Throughput: 0: 10877.2. Samples: 292318208. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:52:31,076][1653645] Updated weights for policy 0, policy_version 570768 (0.0012) [2024-06-15 18:52:33,430][1651596] Signal inference workers to stop experience collection... (29700 times) [2024-06-15 18:52:33,457][1653645] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-06-15 18:52:33,488][1653645] Updated weights for policy 0, policy_version 570835 (0.0014) [2024-06-15 18:52:33,699][1651596] Signal inference workers to resume experience collection... 
(29700 times) [2024-06-15 18:52:33,700][1653645] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-06-15 18:52:35,662][1653645] Updated weights for policy 0, policy_version 570912 (0.0014) [2024-06-15 18:52:35,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 1169227776. Throughput: 0: 10899.9. Samples: 292350976. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:52:38,525][1653645] Updated weights for policy 0, policy_version 570976 (0.0021) [2024-06-15 18:52:40,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1169424384. Throughput: 0: 10720.0. Samples: 292416000. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:52:42,570][1653645] Updated weights for policy 0, policy_version 571012 (0.0018) [2024-06-15 18:52:45,283][1653645] Updated weights for policy 0, policy_version 571104 (0.0013) [2024-06-15 18:52:45,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 1169653760. Throughput: 0: 10934.1. Samples: 292482048. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:52:46,392][1653645] Updated weights for policy 0, policy_version 571152 (0.0014) [2024-06-15 18:52:49,688][1653645] Updated weights for policy 0, policy_version 571216 (0.0010) [2024-06-15 18:52:50,612][1653645] Updated weights for policy 0, policy_version 571254 (0.0030) [2024-06-15 18:52:50,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.4, 300 sec: 44986.5). Total num frames: 1169948672. Throughput: 0: 10877.1. Samples: 292516864. 
Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:52:55,047][1653645] Updated weights for policy 0, policy_version 571328 (0.0026) [2024-06-15 18:52:55,959][1648982] Fps is (10 sec: 42593.9, 60 sec: 43696.4, 300 sec: 44764.3). Total num frames: 1170079744. Throughput: 0: 11036.2. Samples: 292590592. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:52:55,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:52:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000571328_1170079744.pth... [2024-06-15 18:52:56,007][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000566144_1159462912.pth [2024-06-15 18:52:57,520][1653645] Updated weights for policy 0, policy_version 571391 (0.0014) [2024-06-15 18:53:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1170341888. Throughput: 0: 10990.9. Samples: 292652544. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:53:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:53:01,712][1653645] Updated weights for policy 0, policy_version 571489 (0.0014) [2024-06-15 18:53:05,958][1648982] Fps is (10 sec: 42603.0, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 1170505728. Throughput: 0: 10877.2. Samples: 292683264. Policy #0 lag: (min: 15.0, avg: 105.8, max: 271.0) [2024-06-15 18:53:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:53:06,798][1653645] Updated weights for policy 0, policy_version 571578 (0.0014) [2024-06-15 18:53:09,351][1653645] Updated weights for policy 0, policy_version 571642 (0.0026) [2024-06-15 18:53:10,960][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 1170735104. Throughput: 0: 11275.4. Samples: 292749824. 
Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:10,961][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:53:12,143][1653645] Updated weights for policy 0, policy_version 571703 (0.0013) [2024-06-15 18:53:13,933][1653645] Updated weights for policy 0, policy_version 571776 (0.0012) [2024-06-15 18:53:15,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1170997248. Throughput: 0: 11127.5. Samples: 292818944. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 18:53:18,948][1653645] Updated weights for policy 0, policy_version 571840 (0.0012) [2024-06-15 18:53:19,473][1651596] Signal inference workers to stop experience collection... (29750 times) [2024-06-15 18:53:19,531][1653645] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-06-15 18:53:19,721][1651596] Signal inference workers to resume experience collection... (29750 times) [2024-06-15 18:53:19,727][1653645] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-06-15 18:53:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1171259392. Throughput: 0: 11343.6. Samples: 292861440. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:53:22,573][1653645] Updated weights for policy 0, policy_version 571923 (0.0016) [2024-06-15 18:53:24,502][1653645] Updated weights for policy 0, policy_version 572001 (0.0041) [2024-06-15 18:53:25,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 1171521536. Throughput: 0: 11161.6. Samples: 292918272. 
Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:53:29,356][1653645] Updated weights for policy 0, policy_version 572048 (0.0011) [2024-06-15 18:53:30,888][1653645] Updated weights for policy 0, policy_version 572100 (0.0013) [2024-06-15 18:53:30,957][1648982] Fps is (10 sec: 39322.5, 60 sec: 45875.3, 300 sec: 44653.4). Total num frames: 1171652608. Throughput: 0: 11389.2. Samples: 292994560. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:53:33,871][1653645] Updated weights for policy 0, policy_version 572192 (0.0119) [2024-06-15 18:53:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 1171980288. Throughput: 0: 11343.7. Samples: 293027328. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:53:36,132][1653645] Updated weights for policy 0, policy_version 572272 (0.0013) [2024-06-15 18:53:40,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1172045824. Throughput: 0: 11139.1. Samples: 293091840. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:53:41,795][1653645] Updated weights for policy 0, policy_version 572320 (0.0012) [2024-06-15 18:53:43,970][1653645] Updated weights for policy 0, policy_version 572407 (0.0015) [2024-06-15 18:53:45,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 44236.8, 300 sec: 44431.4). Total num frames: 1172307968. Throughput: 0: 11286.8. Samples: 293160448. 
Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:53:46,716][1653645] Updated weights for policy 0, policy_version 572451 (0.0013) [2024-06-15 18:53:47,856][1653645] Updated weights for policy 0, policy_version 572496 (0.0013) [2024-06-15 18:53:48,900][1653645] Updated weights for policy 0, policy_version 572542 (0.0014) [2024-06-15 18:53:50,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.9, 300 sec: 44542.3). Total num frames: 1172570112. Throughput: 0: 11207.1. Samples: 293187584. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:53:54,139][1653645] Updated weights for policy 0, policy_version 572598 (0.0013) [2024-06-15 18:53:55,343][1653645] Updated weights for policy 0, policy_version 572656 (0.0029) [2024-06-15 18:53:55,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45876.1, 300 sec: 44875.5). Total num frames: 1172832256. Throughput: 0: 11332.3. Samples: 293259776. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:53:55,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:53:57,844][1653645] Updated weights for policy 0, policy_version 572708 (0.0020) [2024-06-15 18:54:00,033][1653645] Updated weights for policy 0, policy_version 572771 (0.0130) [2024-06-15 18:54:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 1173094400. Throughput: 0: 11116.1. Samples: 293319168. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:54:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:54:05,629][1653645] Updated weights for policy 0, policy_version 572848 (0.0153) [2024-06-15 18:54:05,740][1651596] Signal inference workers to stop experience collection... 
(29800 times) [2024-06-15 18:54:05,775][1653645] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-06-15 18:54:05,940][1651596] Signal inference workers to resume experience collection... (29800 times) [2024-06-15 18:54:05,950][1653645] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-06-15 18:54:05,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1173225472. Throughput: 0: 11184.3. Samples: 293364736. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:54:05,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:54:07,162][1653645] Updated weights for policy 0, policy_version 572913 (0.0014) [2024-06-15 18:54:09,787][1653645] Updated weights for policy 0, policy_version 572976 (0.0045) [2024-06-15 18:54:10,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 1173520384. Throughput: 0: 11298.1. Samples: 293426688. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:54:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:54:10,979][1653645] Updated weights for policy 0, policy_version 573014 (0.0131) [2024-06-15 18:54:11,666][1653645] Updated weights for policy 0, policy_version 573056 (0.0019) [2024-06-15 18:54:15,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1173651456. Throughput: 0: 11309.5. Samples: 293503488. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:54:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:54:17,557][1653645] Updated weights for policy 0, policy_version 573136 (0.0123) [2024-06-15 18:54:20,268][1653645] Updated weights for policy 0, policy_version 573201 (0.0013) [2024-06-15 18:54:20,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 1173946368. Throughput: 0: 11173.0. Samples: 293530112. 
Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:54:20,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:54:22,364][1653645] Updated weights for policy 0, policy_version 573251 (0.0032) [2024-06-15 18:54:25,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 1174142976. Throughput: 0: 11173.0. Samples: 293594624. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:54:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 18:54:27,973][1653645] Updated weights for policy 0, policy_version 573328 (0.0116) [2024-06-15 18:54:29,042][1653645] Updated weights for policy 0, policy_version 573376 (0.0012) [2024-06-15 18:54:30,966][1648982] Fps is (10 sec: 42598.4, 60 sec: 45328.9, 300 sec: 44767.0). Total num frames: 1174372352. Throughput: 0: 11195.7. Samples: 293664256. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:54:30,966][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:54:31,099][1653645] Updated weights for policy 0, policy_version 573436 (0.0012) [2024-06-15 18:54:32,899][1653645] Updated weights for policy 0, policy_version 573496 (0.0012) [2024-06-15 18:54:34,286][1653645] Updated weights for policy 0, policy_version 573525 (0.0012) [2024-06-15 18:54:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 1174667264. Throughput: 0: 11241.3. Samples: 293693440. Policy #0 lag: (min: 31.0, avg: 122.2, max: 287.0) [2024-06-15 18:54:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:54:39,558][1653645] Updated weights for policy 0, policy_version 573570 (0.0020) [2024-06-15 18:54:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 44320.3). Total num frames: 1174765568. Throughput: 0: 11309.5. Samples: 293768704. 
Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:54:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:54:41,749][1653645] Updated weights for policy 0, policy_version 573636 (0.0044) [2024-06-15 18:54:42,996][1653645] Updated weights for policy 0, policy_version 573689 (0.0113) [2024-06-15 18:54:44,480][1653645] Updated weights for policy 0, policy_version 573744 (0.0016) [2024-06-15 18:54:45,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 46421.3, 300 sec: 44877.4). Total num frames: 1175093248. Throughput: 0: 11320.9. Samples: 293828608. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:54:45,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:54:46,613][1653645] Updated weights for policy 0, policy_version 573808 (0.0014) [2024-06-15 18:54:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1175191552. Throughput: 0: 11127.5. Samples: 293865472. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:54:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:54:51,854][1653645] Updated weights for policy 0, policy_version 573842 (0.0013) [2024-06-15 18:54:52,801][1653645] Updated weights for policy 0, policy_version 573888 (0.0012) [2024-06-15 18:54:53,427][1651596] Signal inference workers to stop experience collection... (29850 times) [2024-06-15 18:54:53,492][1653645] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-06-15 18:54:53,788][1651596] Signal inference workers to resume experience collection... (29850 times) [2024-06-15 18:54:53,789][1653645] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-06-15 18:54:54,457][1653645] Updated weights for policy 0, policy_version 573943 (0.0115) [2024-06-15 18:54:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 1175486464. Throughput: 0: 11184.3. Samples: 293929984. 
Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:54:55,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 18:54:56,377][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000574000_1175552000.pth... [2024-06-15 18:54:56,425][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000568768_1164836864.pth [2024-06-15 18:54:56,609][1653645] Updated weights for policy 0, policy_version 574011 (0.0090) [2024-06-15 18:54:57,756][1653645] Updated weights for policy 0, policy_version 574065 (0.0012) [2024-06-15 18:55:00,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1175715840. Throughput: 0: 11036.4. Samples: 294000128. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:55:04,007][1653645] Updated weights for policy 0, policy_version 574116 (0.0014) [2024-06-15 18:55:05,021][1653645] Updated weights for policy 0, policy_version 574147 (0.0011) [2024-06-15 18:55:05,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44782.8, 300 sec: 44653.3). Total num frames: 1175912448. Throughput: 0: 11252.6. Samples: 294036480. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:05,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 18:55:06,303][1653645] Updated weights for policy 0, policy_version 574207 (0.0011) [2024-06-15 18:55:08,172][1653645] Updated weights for policy 0, policy_version 574269 (0.0024) [2024-06-15 18:55:09,583][1653645] Updated weights for policy 0, policy_version 574336 (0.0013) [2024-06-15 18:55:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1176240128. Throughput: 0: 11161.6. Samples: 294096896. 
Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:10,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 18:55:15,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44782.8, 300 sec: 44764.4). Total num frames: 1176338432. Throughput: 0: 11264.0. Samples: 294171136. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:15,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:55:16,143][1653645] Updated weights for policy 0, policy_version 574394 (0.0040) [2024-06-15 18:55:17,703][1653645] Updated weights for policy 0, policy_version 574459 (0.0127) [2024-06-15 18:55:19,619][1653645] Updated weights for policy 0, policy_version 574517 (0.0011) [2024-06-15 18:55:20,762][1653645] Updated weights for policy 0, policy_version 574582 (0.0112) [2024-06-15 18:55:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 46967.5, 300 sec: 45764.9). Total num frames: 1176764416. Throughput: 0: 11343.6. Samples: 294203904. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:55:25,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1176764416. Throughput: 0: 11195.7. Samples: 294272512. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:55:27,262][1653645] Updated weights for policy 0, policy_version 574626 (0.0103) [2024-06-15 18:55:29,575][1653645] Updated weights for policy 0, policy_version 574708 (0.0106) [2024-06-15 18:55:30,655][1653645] Updated weights for policy 0, policy_version 574752 (0.0015) [2024-06-15 18:55:30,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 45329.0, 300 sec: 45097.7). Total num frames: 1177092096. Throughput: 0: 11320.9. Samples: 294338048. 
Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:30,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:55:31,983][1653645] Updated weights for policy 0, policy_version 574816 (0.0013) [2024-06-15 18:55:35,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1177288704. Throughput: 0: 11161.6. Samples: 294367744. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:55:38,248][1653645] Updated weights for policy 0, policy_version 574852 (0.0015) [2024-06-15 18:55:38,963][1651596] Signal inference workers to stop experience collection... (29900 times) [2024-06-15 18:55:39,051][1653645] InferenceWorker_p0-w0: stopping experience collection (29900 times) [2024-06-15 18:55:39,194][1651596] Signal inference workers to resume experience collection... (29900 times) [2024-06-15 18:55:39,195][1653645] InferenceWorker_p0-w0: resuming experience collection (29900 times) [2024-06-15 18:55:40,784][1653645] Updated weights for policy 0, policy_version 574930 (0.0016) [2024-06-15 18:55:40,962][1648982] Fps is (10 sec: 39304.1, 60 sec: 45325.6, 300 sec: 44652.7). Total num frames: 1177485312. Throughput: 0: 11274.3. Samples: 294437376. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:40,963][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:55:41,749][1653645] Updated weights for policy 0, policy_version 574978 (0.0078) [2024-06-15 18:55:42,477][1653645] Updated weights for policy 0, policy_version 575024 (0.0014) [2024-06-15 18:55:44,636][1653645] Updated weights for policy 0, policy_version 575100 (0.0013) [2024-06-15 18:55:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1177812992. Throughput: 0: 11195.7. Samples: 294503936. 
Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:55:50,896][1653645] Updated weights for policy 0, policy_version 575160 (0.0012) [2024-06-15 18:55:50,958][1648982] Fps is (10 sec: 42617.8, 60 sec: 45329.1, 300 sec: 44542.6). Total num frames: 1177911296. Throughput: 0: 11241.3. Samples: 294542336. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:50,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:55:52,124][1653645] Updated weights for policy 0, policy_version 575203 (0.0112) [2024-06-15 18:55:53,833][1653645] Updated weights for policy 0, policy_version 575266 (0.0012) [2024-06-15 18:55:55,151][1653645] Updated weights for policy 0, policy_version 575313 (0.0012) [2024-06-15 18:55:55,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 46967.5, 300 sec: 45208.7). Total num frames: 1178304512. Throughput: 0: 11400.5. Samples: 294609920. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:55:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:56:00,960][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1178337280. Throughput: 0: 11309.5. Samples: 294680064. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:56:00,961][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 18:56:02,126][1653645] Updated weights for policy 0, policy_version 575376 (0.0013) [2024-06-15 18:56:03,532][1653645] Updated weights for policy 0, policy_version 575440 (0.0231) [2024-06-15 18:56:04,671][1653645] Updated weights for policy 0, policy_version 575488 (0.0015) [2024-06-15 18:56:05,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 46421.5, 300 sec: 44986.6). Total num frames: 1178697728. Throughput: 0: 11332.3. Samples: 294713856. 
Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:56:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:56:07,012][1653645] Updated weights for policy 0, policy_version 575568 (0.0011) [2024-06-15 18:56:10,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1178861568. Throughput: 0: 11127.5. Samples: 294773248. Policy #0 lag: (min: 15.0, avg: 89.9, max: 271.0) [2024-06-15 18:56:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:56:14,119][1653645] Updated weights for policy 0, policy_version 575634 (0.0012) [2024-06-15 18:56:15,213][1653645] Updated weights for policy 0, policy_version 575696 (0.0011) [2024-06-15 18:56:15,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 1179058176. Throughput: 0: 11275.4. Samples: 294845440. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:56:16,425][1653645] Updated weights for policy 0, policy_version 575741 (0.0033) [2024-06-15 18:56:17,709][1653645] Updated weights for policy 0, policy_version 575800 (0.0012) [2024-06-15 18:56:19,182][1651596] Signal inference workers to stop experience collection... (29950 times) [2024-06-15 18:56:19,211][1653645] InferenceWorker_p0-w0: stopping experience collection (29950 times) [2024-06-15 18:56:19,486][1651596] Signal inference workers to resume experience collection... (29950 times) [2024-06-15 18:56:19,487][1653645] InferenceWorker_p0-w0: resuming experience collection (29950 times) [2024-06-15 18:56:19,916][1653645] Updated weights for policy 0, policy_version 575856 (0.0012) [2024-06-15 18:56:20,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44764.5). Total num frames: 1179385856. Throughput: 0: 11343.6. Samples: 294878208. 
Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:56:25,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 1179451392. Throughput: 0: 11333.4. Samples: 294947328. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:56:26,078][1653645] Updated weights for policy 0, policy_version 575920 (0.0146) [2024-06-15 18:56:27,432][1653645] Updated weights for policy 0, policy_version 575984 (0.0128) [2024-06-15 18:56:28,413][1653645] Updated weights for policy 0, policy_version 576018 (0.0013) [2024-06-15 18:56:30,557][1653645] Updated weights for policy 0, policy_version 576065 (0.0012) [2024-06-15 18:56:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1179811840. Throughput: 0: 11264.0. Samples: 295010816. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:30,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:56:32,160][1653645] Updated weights for policy 0, policy_version 576128 (0.0023) [2024-06-15 18:56:35,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1179910144. Throughput: 0: 11059.2. Samples: 295040000. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:56:38,652][1653645] Updated weights for policy 0, policy_version 576209 (0.0013) [2024-06-15 18:56:39,767][1653645] Updated weights for policy 0, policy_version 576249 (0.0021) [2024-06-15 18:56:40,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 45878.5, 300 sec: 44986.5). Total num frames: 1180237824. Throughput: 0: 11081.9. Samples: 295108608. 
Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:40,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:56:41,388][1653645] Updated weights for policy 0, policy_version 576304 (0.0011) [2024-06-15 18:56:42,966][1653645] Updated weights for policy 0, policy_version 576336 (0.0013) [2024-06-15 18:56:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1180434432. Throughput: 0: 10888.5. Samples: 295170048. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:56:49,436][1653645] Updated weights for policy 0, policy_version 576403 (0.0014) [2024-06-15 18:56:50,958][1648982] Fps is (10 sec: 32769.0, 60 sec: 44236.8, 300 sec: 44432.5). Total num frames: 1180565504. Throughput: 0: 11070.6. Samples: 295212032. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:50,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:56:51,637][1653645] Updated weights for policy 0, policy_version 576486 (0.0015) [2024-06-15 18:56:53,051][1653645] Updated weights for policy 0, policy_version 576544 (0.0012) [2024-06-15 18:56:55,416][1653645] Updated weights for policy 0, policy_version 576583 (0.0011) [2024-06-15 18:56:55,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 44653.3). Total num frames: 1180893184. Throughput: 0: 10968.1. Samples: 295266816. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:56:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:56:56,289][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000576624_1180925952.pth... 
[2024-06-15 18:56:56,345][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000571328_1170079744.pth [2024-06-15 18:56:56,546][1653645] Updated weights for policy 0, policy_version 576633 (0.0012) [2024-06-15 18:57:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1180958720. Throughput: 0: 11138.9. Samples: 295346688. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:57:01,990][1653645] Updated weights for policy 0, policy_version 576688 (0.0012) [2024-06-15 18:57:04,322][1651596] Signal inference workers to stop experience collection... (30000 times) [2024-06-15 18:57:04,347][1653645] Updated weights for policy 0, policy_version 576771 (0.0021) [2024-06-15 18:57:04,395][1653645] InferenceWorker_p0-w0: stopping experience collection (30000 times) [2024-06-15 18:57:04,583][1651596] Signal inference workers to resume experience collection... (30000 times) [2024-06-15 18:57:04,584][1653645] InferenceWorker_p0-w0: resuming experience collection (30000 times) [2024-06-15 18:57:05,814][1653645] Updated weights for policy 0, policy_version 576832 (0.0012) [2024-06-15 18:57:05,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44236.6, 300 sec: 44875.5). Total num frames: 1181351936. Throughput: 0: 10979.5. Samples: 295372288. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:57:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1181483008. Throughput: 0: 10740.6. Samples: 295430656. 
Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:57:13,495][1653645] Updated weights for policy 0, policy_version 576928 (0.0013) [2024-06-15 18:57:15,958][1648982] Fps is (10 sec: 36045.8, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 1181712384. Throughput: 0: 10854.4. Samples: 295499264. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:57:16,578][1653645] Updated weights for policy 0, policy_version 577040 (0.0199) [2024-06-15 18:57:19,999][1653645] Updated weights for policy 0, policy_version 577089 (0.0036) [2024-06-15 18:57:20,939][1653645] Updated weights for policy 0, policy_version 577140 (0.0012) [2024-06-15 18:57:20,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43144.5, 300 sec: 44764.4). Total num frames: 1181974528. Throughput: 0: 10774.7. Samples: 295524864. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:57:25,812][1653645] Updated weights for policy 0, policy_version 577216 (0.0014) [2024-06-15 18:57:25,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 44782.7, 300 sec: 44875.4). Total num frames: 1182138368. Throughput: 0: 11138.8. Samples: 295609856. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:25,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:57:28,481][1653645] Updated weights for policy 0, policy_version 577288 (0.0014) [2024-06-15 18:57:29,513][1653645] Updated weights for policy 0, policy_version 577332 (0.0012) [2024-06-15 18:57:30,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 44653.3). Total num frames: 1182400512. Throughput: 0: 11025.1. Samples: 295666176. 
Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:57:31,721][1653645] Updated weights for policy 0, policy_version 577376 (0.0045) [2024-06-15 18:57:35,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1182531584. Throughput: 0: 10763.4. Samples: 295696384. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 18:57:38,180][1653645] Updated weights for policy 0, policy_version 577472 (0.0029) [2024-06-15 18:57:40,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 42598.5, 300 sec: 44542.2). Total num frames: 1182793728. Throughput: 0: 10865.8. Samples: 295755776. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:57:41,686][1653645] Updated weights for policy 0, policy_version 577552 (0.0014) [2024-06-15 18:57:42,665][1653645] Updated weights for policy 0, policy_version 577600 (0.0012) [2024-06-15 18:57:44,892][1653645] Updated weights for policy 0, policy_version 577648 (0.0013) [2024-06-15 18:57:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1183055872. Throughput: 0: 10706.5. Samples: 295828480. Policy #0 lag: (min: 3.0, avg: 60.7, max: 237.0) [2024-06-15 18:57:45,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 18:57:47,558][1653645] Updated weights for policy 0, policy_version 577665 (0.0011) [2024-06-15 18:57:50,278][1651596] Signal inference workers to stop experience collection... (30050 times) [2024-06-15 18:57:50,308][1653645] Updated weights for policy 0, policy_version 577745 (0.0087) [2024-06-15 18:57:50,347][1653645] InferenceWorker_p0-w0: stopping experience collection (30050 times) [2024-06-15 18:57:50,597][1651596] Signal inference workers to resume experience collection... 
(30050 times) [2024-06-15 18:57:50,598][1653645] InferenceWorker_p0-w0: resuming experience collection (30050 times) [2024-06-15 18:57:50,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 44782.9, 300 sec: 44653.5). Total num frames: 1183252480. Throughput: 0: 10922.7. Samples: 295863808. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:57:50,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 18:57:53,332][1653645] Updated weights for policy 0, policy_version 577799 (0.0012) [2024-06-15 18:57:54,581][1653645] Updated weights for policy 0, policy_version 577856 (0.0020) [2024-06-15 18:57:55,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 42598.3, 300 sec: 44431.2). Total num frames: 1183449088. Throughput: 0: 11036.4. Samples: 295927296. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:57:55,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:57:56,943][1653645] Updated weights for policy 0, policy_version 577910 (0.0010) [2024-06-15 18:58:00,074][1653645] Updated weights for policy 0, policy_version 577952 (0.0012) [2024-06-15 18:58:00,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1183711232. Throughput: 0: 11047.8. Samples: 295996416. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:00,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 18:58:02,706][1653645] Updated weights for policy 0, policy_version 578017 (0.0034) [2024-06-15 18:58:04,432][1653645] Updated weights for policy 0, policy_version 578064 (0.0038) [2024-06-15 18:58:05,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 1183973376. Throughput: 0: 11229.9. Samples: 296030208. 
Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:05,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:58:06,916][1653645] Updated weights for policy 0, policy_version 578115 (0.0013) [2024-06-15 18:58:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1184104448. Throughput: 0: 10752.1. Samples: 296093696. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 18:58:11,924][1653645] Updated weights for policy 0, policy_version 578192 (0.0013) [2024-06-15 18:58:14,584][1653645] Updated weights for policy 0, policy_version 578256 (0.0031) [2024-06-15 18:58:15,564][1653645] Updated weights for policy 0, policy_version 578303 (0.0013) [2024-06-15 18:58:15,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1184366592. Throughput: 0: 10968.2. Samples: 296159744. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:58:16,935][1653645] Updated weights for policy 0, policy_version 578359 (0.0014) [2024-06-15 18:58:19,350][1653645] Updated weights for policy 0, policy_version 578403 (0.0014) [2024-06-15 18:58:20,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1184628736. Throughput: 0: 11116.1. Samples: 296196608. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:20,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 18:58:23,488][1653645] Updated weights for policy 0, policy_version 578464 (0.0014) [2024-06-15 18:58:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1184759808. Throughput: 0: 11286.8. Samples: 296263680. 
Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 18:58:26,657][1653645] Updated weights for policy 0, policy_version 578528 (0.0013) [2024-06-15 18:58:28,036][1653645] Updated weights for policy 0, policy_version 578593 (0.0021) [2024-06-15 18:58:29,854][1653645] Updated weights for policy 0, policy_version 578627 (0.0013) [2024-06-15 18:58:30,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 44542.3). Total num frames: 1185120256. Throughput: 0: 11275.4. Samples: 296335872. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:30,960][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 18:58:31,114][1653645] Updated weights for policy 0, policy_version 578681 (0.0020) [2024-06-15 18:58:35,597][1653645] Updated weights for policy 0, policy_version 578740 (0.0021) [2024-06-15 18:58:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1185284096. Throughput: 0: 11241.2. Samples: 296369664. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 18:58:38,091][1653645] Updated weights for policy 0, policy_version 578770 (0.0012) [2024-06-15 18:58:38,396][1651596] Signal inference workers to stop experience collection... (30100 times) [2024-06-15 18:58:38,445][1653645] InferenceWorker_p0-w0: stopping experience collection (30100 times) [2024-06-15 18:58:38,636][1651596] Signal inference workers to resume experience collection... (30100 times) [2024-06-15 18:58:38,637][1653645] InferenceWorker_p0-w0: resuming experience collection (30100 times) [2024-06-15 18:58:40,100][1653645] Updated weights for policy 0, policy_version 578850 (0.0014) [2024-06-15 18:58:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1185546240. Throughput: 0: 11264.1. Samples: 296434176. 
Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:58:43,232][1653645] Updated weights for policy 0, policy_version 578936 (0.0111) [2024-06-15 18:58:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1185677312. Throughput: 0: 11161.6. Samples: 296498688. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:45,962][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:58:47,703][1653645] Updated weights for policy 0, policy_version 578994 (0.0013) [2024-06-15 18:58:50,776][1653645] Updated weights for policy 0, policy_version 579025 (0.0018) [2024-06-15 18:58:50,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 1185841152. Throughput: 0: 11093.3. Samples: 296529408. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 18:58:52,100][1653645] Updated weights for policy 0, policy_version 579079 (0.0013) [2024-06-15 18:58:54,607][1653645] Updated weights for policy 0, policy_version 579137 (0.0014) [2024-06-15 18:58:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.4, 300 sec: 44431.2). Total num frames: 1186201600. Throughput: 0: 11150.2. Samples: 296595456. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:58:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 18:58:56,017][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000579200_1186201600.pth... 
[2024-06-15 18:58:56,060][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000574000_1175552000.pth [2024-06-15 18:58:58,137][1653645] Updated weights for policy 0, policy_version 579201 (0.0013) [2024-06-15 18:58:59,260][1653645] Updated weights for policy 0, policy_version 579256 (0.0012) [2024-06-15 18:59:00,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1186332672. Throughput: 0: 11263.9. Samples: 296666624. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:59:00,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 18:59:02,135][1653645] Updated weights for policy 0, policy_version 579313 (0.0012) [2024-06-15 18:59:03,446][1653645] Updated weights for policy 0, policy_version 579380 (0.0012) [2024-06-15 18:59:05,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 44236.6, 300 sec: 44431.2). Total num frames: 1186627584. Throughput: 0: 11241.2. Samples: 296702464. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:59:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:59:06,202][1653645] Updated weights for policy 0, policy_version 579424 (0.0075) [2024-06-15 18:59:06,803][1653645] Updated weights for policy 0, policy_version 579450 (0.0013) [2024-06-15 18:59:10,508][1653645] Updated weights for policy 0, policy_version 579490 (0.0016) [2024-06-15 18:59:10,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 1186824192. Throughput: 0: 11411.9. Samples: 296777216. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:59:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 18:59:12,473][1653645] Updated weights for policy 0, policy_version 579539 (0.0014) [2024-06-15 18:59:14,402][1653645] Updated weights for policy 0, policy_version 579632 (0.0013) [2024-06-15 18:59:15,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 1187119104. 
Throughput: 0: 11161.6. Samples: 296838144. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:59:15,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 18:59:18,054][1653645] Updated weights for policy 0, policy_version 579696 (0.0013) [2024-06-15 18:59:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1187250176. Throughput: 0: 11161.6. Samples: 296871936. Policy #0 lag: (min: 3.0, avg: 91.8, max: 259.0) [2024-06-15 18:59:20,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 18:59:22,159][1653645] Updated weights for policy 0, policy_version 579760 (0.0014) [2024-06-15 18:59:23,311][1651596] Signal inference workers to stop experience collection... (30150 times) [2024-06-15 18:59:23,364][1653645] InferenceWorker_p0-w0: stopping experience collection (30150 times) [2024-06-15 18:59:23,567][1651596] Signal inference workers to resume experience collection... (30150 times) [2024-06-15 18:59:23,576][1653645] InferenceWorker_p0-w0: resuming experience collection (30150 times) [2024-06-15 18:59:24,070][1653645] Updated weights for policy 0, policy_version 579811 (0.0012) [2024-06-15 18:59:25,645][1653645] Updated weights for policy 0, policy_version 579875 (0.0012) [2024-06-15 18:59:25,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 47513.6, 300 sec: 44875.5). Total num frames: 1187610624. Throughput: 0: 11218.5. Samples: 296939008. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 18:59:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 18:59:29,211][1653645] Updated weights for policy 0, policy_version 579922 (0.0017) [2024-06-15 18:59:30,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 44236.6, 300 sec: 44431.1). Total num frames: 1187774464. Throughput: 0: 11366.3. Samples: 297010176. 
Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 18:59:30,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 18:59:32,916][1653645] Updated weights for policy 0, policy_version 579969 (0.0011) [2024-06-15 18:59:33,854][1653645] Updated weights for policy 0, policy_version 580027 (0.0092) [2024-06-15 18:59:35,862][1653645] Updated weights for policy 0, policy_version 580069 (0.0014) [2024-06-15 18:59:35,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 1187971072. Throughput: 0: 11446.1. Samples: 297044480. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 18:59:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:59:37,705][1653645] Updated weights for policy 0, policy_version 580150 (0.0055) [2024-06-15 18:59:40,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.4, 300 sec: 44320.1). Total num frames: 1188167680. Throughput: 0: 11446.0. Samples: 297110528. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 18:59:40,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 18:59:41,263][1653645] Updated weights for policy 0, policy_version 580176 (0.0103) [2024-06-15 18:59:45,261][1653645] Updated weights for policy 0, policy_version 580256 (0.0035) [2024-06-15 18:59:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1188429824. Throughput: 0: 11400.6. Samples: 297179648. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 18:59:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:59:47,213][1653645] Updated weights for policy 0, policy_version 580323 (0.0012) [2024-06-15 18:59:48,115][1653645] Updated weights for policy 0, policy_version 580353 (0.0013) [2024-06-15 18:59:49,541][1653645] Updated weights for policy 0, policy_version 580416 (0.0015) [2024-06-15 18:59:50,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 47513.6, 300 sec: 44764.4). Total num frames: 1188691968. 
Throughput: 0: 11275.4. Samples: 297209856. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 18:59:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 18:59:55,709][1653645] Updated weights for policy 0, policy_version 580496 (0.0044) [2024-06-15 18:59:55,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1188855808. Throughput: 0: 11252.6. Samples: 297283584. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 18:59:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 18:59:58,391][1653645] Updated weights for policy 0, policy_version 580561 (0.0030) [2024-06-15 19:00:00,258][1653645] Updated weights for policy 0, policy_version 580643 (0.0012) [2024-06-15 19:00:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 48059.9, 300 sec: 45097.7). Total num frames: 1189216256. Throughput: 0: 11286.8. Samples: 297346048. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:00:04,683][1653645] Updated weights for policy 0, policy_version 580691 (0.0013) [2024-06-15 19:00:05,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 45329.0, 300 sec: 44431.1). Total num frames: 1189347328. Throughput: 0: 11468.7. Samples: 297388032. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:05,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:00:07,139][1653645] Updated weights for policy 0, policy_version 580738 (0.0026) [2024-06-15 19:00:08,060][1653645] Updated weights for policy 0, policy_version 580785 (0.0106) [2024-06-15 19:00:09,044][1651596] Signal inference workers to stop experience collection... (30200 times) [2024-06-15 19:00:09,138][1653645] InferenceWorker_p0-w0: stopping experience collection (30200 times) [2024-06-15 19:00:09,307][1651596] Signal inference workers to resume experience collection... 
(30200 times) [2024-06-15 19:00:09,309][1653645] InferenceWorker_p0-w0: resuming experience collection (30200 times) [2024-06-15 19:00:09,528][1653645] Updated weights for policy 0, policy_version 580822 (0.0032) [2024-06-15 19:00:10,622][1653645] Updated weights for policy 0, policy_version 580866 (0.0012) [2024-06-15 19:00:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 46967.4, 300 sec: 45097.7). Total num frames: 1189642240. Throughput: 0: 11480.2. Samples: 297455616. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:10,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 19:00:15,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1189740544. Throughput: 0: 11400.5. Samples: 297523200. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:15,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:00:16,170][1653645] Updated weights for policy 0, policy_version 580944 (0.0034) [2024-06-15 19:00:19,456][1653645] Updated weights for policy 0, policy_version 581010 (0.0013) [2024-06-15 19:00:20,757][1653645] Updated weights for policy 0, policy_version 581075 (0.0035) [2024-06-15 19:00:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 46967.4, 300 sec: 45097.7). Total num frames: 1190068224. Throughput: 0: 11400.5. Samples: 297557504. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:00:22,280][1653645] Updated weights for policy 0, policy_version 581121 (0.0013) [2024-06-15 19:00:23,567][1653645] Updated weights for policy 0, policy_version 581177 (0.0010) [2024-06-15 19:00:25,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 1190264832. Throughput: 0: 11434.7. Samples: 297625088. 
Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:00:27,419][1653645] Updated weights for policy 0, policy_version 581216 (0.0051) [2024-06-15 19:00:30,854][1653645] Updated weights for policy 0, policy_version 581280 (0.0012) [2024-06-15 19:00:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44783.1, 300 sec: 44653.3). Total num frames: 1190461440. Throughput: 0: 11582.6. Samples: 297700864. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:00:32,834][1653645] Updated weights for policy 0, policy_version 581360 (0.0012) [2024-06-15 19:00:35,284][1653645] Updated weights for policy 0, policy_version 581408 (0.0012) [2024-06-15 19:00:35,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 46967.5, 300 sec: 45098.4). Total num frames: 1190789120. Throughput: 0: 11446.1. Samples: 297724928. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:00:38,793][1653645] Updated weights for policy 0, policy_version 581472 (0.0025) [2024-06-15 19:00:40,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 45875.2, 300 sec: 44431.1). Total num frames: 1190920192. Throughput: 0: 11354.9. Samples: 297794560. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:40,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:00:42,435][1653645] Updated weights for policy 0, policy_version 581520 (0.0014) [2024-06-15 19:00:43,753][1653645] Updated weights for policy 0, policy_version 581584 (0.0021) [2024-06-15 19:00:45,962][1648982] Fps is (10 sec: 39303.7, 60 sec: 45871.7, 300 sec: 44985.9). Total num frames: 1191182336. Throughput: 0: 11410.8. Samples: 297859584. 
Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:45,963][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:00:46,765][1653645] Updated weights for policy 0, policy_version 581634 (0.0059) [2024-06-15 19:00:47,909][1653645] Updated weights for policy 0, policy_version 581691 (0.0014) [2024-06-15 19:00:50,882][1653645] Updated weights for policy 0, policy_version 581735 (0.0012) [2024-06-15 19:00:50,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1191378944. Throughput: 0: 11229.9. Samples: 297893376. Policy #0 lag: (min: 15.0, avg: 105.4, max: 271.0) [2024-06-15 19:00:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:00:54,231][1651596] Signal inference workers to stop experience collection... (30250 times) [2024-06-15 19:00:54,278][1653645] InferenceWorker_p0-w0: stopping experience collection (30250 times) [2024-06-15 19:00:54,427][1651596] Signal inference workers to resume experience collection... (30250 times) [2024-06-15 19:00:54,428][1653645] InferenceWorker_p0-w0: resuming experience collection (30250 times) [2024-06-15 19:00:54,631][1653645] Updated weights for policy 0, policy_version 581793 (0.0012) [2024-06-15 19:00:55,958][1648982] Fps is (10 sec: 42616.6, 60 sec: 45875.0, 300 sec: 44986.5). Total num frames: 1191608320. Throughput: 0: 11309.5. Samples: 297964544. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:00:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:00:56,439][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000581872_1191673856.pth... 
[2024-06-15 19:00:56,440][1653645] Updated weights for policy 0, policy_version 581872 (0.0013) [2024-06-15 19:00:56,489][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000576624_1180925952.pth [2024-06-15 19:00:59,255][1653645] Updated weights for policy 0, policy_version 581904 (0.0011) [2024-06-15 19:01:00,416][1653645] Updated weights for policy 0, policy_version 581950 (0.0015) [2024-06-15 19:01:00,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.5, 300 sec: 44542.2). Total num frames: 1191837696. Throughput: 0: 11161.6. Samples: 298025472. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:01:03,084][1653645] Updated weights for policy 0, policy_version 582008 (0.0012) [2024-06-15 19:01:05,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 44237.0, 300 sec: 44542.3). Total num frames: 1192001536. Throughput: 0: 11173.0. Samples: 298060288. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:01:06,872][1653645] Updated weights for policy 0, policy_version 582064 (0.0014) [2024-06-15 19:01:08,330][1653645] Updated weights for policy 0, policy_version 582128 (0.0013) [2024-06-15 19:01:10,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 1192296448. Throughput: 0: 11173.0. Samples: 298127872. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:01:11,537][1653645] Updated weights for policy 0, policy_version 582207 (0.0099) [2024-06-15 19:01:15,187][1653645] Updated weights for policy 0, policy_version 582266 (0.0012) [2024-06-15 19:01:15,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45875.4, 300 sec: 44431.2). Total num frames: 1192493056. Throughput: 0: 11025.1. Samples: 298196992. 
Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:01:18,442][1653645] Updated weights for policy 0, policy_version 582336 (0.0033) [2024-06-15 19:01:19,781][1653645] Updated weights for policy 0, policy_version 582400 (0.0017) [2024-06-15 19:01:20,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 1192755200. Throughput: 0: 11082.0. Samples: 298223616. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:01:23,701][1653645] Updated weights for policy 0, policy_version 582464 (0.0013) [2024-06-15 19:01:25,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 1192886272. Throughput: 0: 11025.2. Samples: 298290688. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:01:28,091][1653645] Updated weights for policy 0, policy_version 582519 (0.0013) [2024-06-15 19:01:30,084][1653645] Updated weights for policy 0, policy_version 582576 (0.0012) [2024-06-15 19:01:30,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1193181184. Throughput: 0: 11094.4. Samples: 298358784. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:01:31,292][1653645] Updated weights for policy 0, policy_version 582628 (0.0013) [2024-06-15 19:01:33,593][1653645] Updated weights for policy 0, policy_version 582672 (0.0013) [2024-06-15 19:01:34,595][1653645] Updated weights for policy 0, policy_version 582717 (0.0012) [2024-06-15 19:01:35,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1193410560. Throughput: 0: 11138.9. Samples: 298394624. 
Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:01:39,295][1651596] Signal inference workers to stop experience collection... (30300 times) [2024-06-15 19:01:39,387][1653645] InferenceWorker_p0-w0: stopping experience collection (30300 times) [2024-06-15 19:01:39,622][1651596] Signal inference workers to resume experience collection... (30300 times) [2024-06-15 19:01:39,623][1653645] InferenceWorker_p0-w0: resuming experience collection (30300 times) [2024-06-15 19:01:39,856][1653645] Updated weights for policy 0, policy_version 582778 (0.0013) [2024-06-15 19:01:40,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 44236.9, 300 sec: 44542.2). Total num frames: 1193574400. Throughput: 0: 11161.6. Samples: 298466816. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:01:42,417][1653645] Updated weights for policy 0, policy_version 582865 (0.0013) [2024-06-15 19:01:43,724][1653645] Updated weights for policy 0, policy_version 582912 (0.0013) [2024-06-15 19:01:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43693.9, 300 sec: 44875.5). Total num frames: 1193803776. Throughput: 0: 10968.2. Samples: 298519040. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:01:47,273][1653645] Updated weights for policy 0, policy_version 582974 (0.0016) [2024-06-15 19:01:50,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1193934848. Throughput: 0: 11127.4. Samples: 298561024. 
Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:01:51,521][1653645] Updated weights for policy 0, policy_version 583010 (0.0009) [2024-06-15 19:01:53,665][1653645] Updated weights for policy 0, policy_version 583104 (0.0092) [2024-06-15 19:01:55,191][1653645] Updated weights for policy 0, policy_version 583168 (0.0013) [2024-06-15 19:01:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 1194328064. Throughput: 0: 10990.9. Samples: 298622464. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:01:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:01:59,386][1653645] Updated weights for policy 0, policy_version 583229 (0.0129) [2024-06-15 19:02:00,978][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1194459136. Throughput: 0: 11047.8. Samples: 298694144. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:02:00,979][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:02:03,750][1653645] Updated weights for policy 0, policy_version 583280 (0.0010) [2024-06-15 19:02:05,680][1653645] Updated weights for policy 0, policy_version 583360 (0.0010) [2024-06-15 19:02:05,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1194721280. Throughput: 0: 11252.6. Samples: 298729984. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:02:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:02:07,173][1653645] Updated weights for policy 0, policy_version 583424 (0.0025) [2024-06-15 19:02:10,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 44653.3). Total num frames: 1194885120. Throughput: 0: 11036.4. Samples: 298787328. 
Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:02:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:02:12,017][1653645] Updated weights for policy 0, policy_version 583479 (0.0012) [2024-06-15 19:02:15,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 42598.4, 300 sec: 44320.1). Total num frames: 1195048960. Throughput: 0: 11093.3. Samples: 298857984. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:02:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:02:16,500][1653645] Updated weights for policy 0, policy_version 583552 (0.0013) [2024-06-15 19:02:17,871][1653645] Updated weights for policy 0, policy_version 583619 (0.0015) [2024-06-15 19:02:18,956][1653645] Updated weights for policy 0, policy_version 583680 (0.0014) [2024-06-15 19:02:20,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1195376640. Throughput: 0: 10854.4. Samples: 298883072. Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:02:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:02:22,568][1651596] Signal inference workers to stop experience collection... (30350 times) [2024-06-15 19:02:22,599][1653645] InferenceWorker_p0-w0: stopping experience collection (30350 times) [2024-06-15 19:02:22,865][1651596] Signal inference workers to resume experience collection... (30350 times) [2024-06-15 19:02:22,866][1653645] InferenceWorker_p0-w0: resuming experience collection (30350 times) [2024-06-15 19:02:23,837][1653645] Updated weights for policy 0, policy_version 583733 (0.0010) [2024-06-15 19:02:25,970][1648982] Fps is (10 sec: 45817.9, 60 sec: 43681.5, 300 sec: 44429.3). Total num frames: 1195507712. Throughput: 0: 10851.4. Samples: 298955264. 
Policy #0 lag: (min: 0.0, avg: 87.2, max: 256.0) [2024-06-15 19:02:25,971][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:02:27,994][1653645] Updated weights for policy 0, policy_version 583798 (0.0012) [2024-06-15 19:02:29,760][1653645] Updated weights for policy 0, policy_version 583879 (0.0013) [2024-06-15 19:02:30,790][1653645] Updated weights for policy 0, policy_version 583936 (0.0013) [2024-06-15 19:02:30,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1195900928. Throughput: 0: 11059.2. Samples: 299016704. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:02:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:02:35,628][1653645] Updated weights for policy 0, policy_version 583997 (0.0107) [2024-06-15 19:02:35,958][1648982] Fps is (10 sec: 52494.3, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1196032000. Throughput: 0: 11002.3. Samples: 299056128. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:02:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:02:39,498][1653645] Updated weights for policy 0, policy_version 584052 (0.0015) [2024-06-15 19:02:40,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 1196261376. Throughput: 0: 11184.3. Samples: 299125760. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:02:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:02:41,178][1653645] Updated weights for policy 0, policy_version 584128 (0.0121) [2024-06-15 19:02:42,289][1653645] Updated weights for policy 0, policy_version 584191 (0.0109) [2024-06-15 19:02:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 1196425216. Throughput: 0: 11093.3. Samples: 299193344. 
Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:02:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:02:47,266][1653645] Updated weights for policy 0, policy_version 584245 (0.0012) [2024-06-15 19:02:50,207][1653645] Updated weights for policy 0, policy_version 584288 (0.0012) [2024-06-15 19:02:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1196687360. Throughput: 0: 11047.8. Samples: 299227136. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:02:50,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:02:51,682][1653645] Updated weights for policy 0, policy_version 584356 (0.0129) [2024-06-15 19:02:53,235][1653645] Updated weights for policy 0, policy_version 584432 (0.0022) [2024-06-15 19:02:55,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1196949504. Throughput: 0: 11207.1. Samples: 299291648. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:02:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:02:55,984][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000584448_1196949504.pth... [2024-06-15 19:02:56,042][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000579200_1186201600.pth [2024-06-15 19:02:59,028][1653645] Updated weights for policy 0, policy_version 584506 (0.0015) [2024-06-15 19:03:00,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1197080576. Throughput: 0: 11241.2. Samples: 299363840. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:03:02,384][1653645] Updated weights for policy 0, policy_version 584548 (0.0017) [2024-06-15 19:03:03,282][1651596] Signal inference workers to stop experience collection... 
(30400 times) [2024-06-15 19:03:03,327][1653645] InferenceWorker_p0-w0: stopping experience collection (30400 times) [2024-06-15 19:03:03,523][1651596] Signal inference workers to resume experience collection... (30400 times) [2024-06-15 19:03:03,534][1653645] InferenceWorker_p0-w0: resuming experience collection (30400 times) [2024-06-15 19:03:04,374][1653645] Updated weights for policy 0, policy_version 584641 (0.0012) [2024-06-15 19:03:05,507][1653645] Updated weights for policy 0, policy_version 584702 (0.0013) [2024-06-15 19:03:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45875.0, 300 sec: 45319.8). Total num frames: 1197473792. Throughput: 0: 11434.7. Samples: 299397632. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:03:10,533][1653645] Updated weights for policy 0, policy_version 584763 (0.0013) [2024-06-15 19:03:10,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 45328.9, 300 sec: 44875.5). Total num frames: 1197604864. Throughput: 0: 11449.2. Samples: 299470336. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:03:14,404][1653645] Updated weights for policy 0, policy_version 584818 (0.0116) [2024-06-15 19:03:15,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1197834240. Throughput: 0: 11286.7. Samples: 299524608. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:03:16,275][1653645] Updated weights for policy 0, policy_version 584912 (0.0013) [2024-06-15 19:03:20,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1197998080. Throughput: 0: 11241.2. Samples: 299561984. 
Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:20,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 19:03:21,614][1653645] Updated weights for policy 0, policy_version 584976 (0.0027) [2024-06-15 19:03:25,006][1653645] Updated weights for policy 0, policy_version 585025 (0.0013) [2024-06-15 19:03:25,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44792.2, 300 sec: 44320.1). Total num frames: 1198194688. Throughput: 0: 11320.9. Samples: 299635200. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:03:26,822][1653645] Updated weights for policy 0, policy_version 585110 (0.0012) [2024-06-15 19:03:28,384][1653645] Updated weights for policy 0, policy_version 585184 (0.0104) [2024-06-15 19:03:30,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1198522368. Throughput: 0: 11093.3. Samples: 299692544. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:30,959][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 19:03:33,448][1653645] Updated weights for policy 0, policy_version 585223 (0.0016) [2024-06-15 19:03:35,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1198653440. Throughput: 0: 11195.7. Samples: 299730944. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:03:37,420][1653645] Updated weights for policy 0, policy_version 585312 (0.0014) [2024-06-15 19:03:38,458][1653645] Updated weights for policy 0, policy_version 585376 (0.0013) [2024-06-15 19:03:40,308][1653645] Updated weights for policy 0, policy_version 585441 (0.0012) [2024-06-15 19:03:40,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1199046656. Throughput: 0: 11161.7. Samples: 299793920. 
Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:03:44,824][1653645] Updated weights for policy 0, policy_version 585473 (0.0020) [2024-06-15 19:03:45,502][1651596] Signal inference workers to stop experience collection... (30450 times) [2024-06-15 19:03:45,638][1653645] InferenceWorker_p0-w0: stopping experience collection (30450 times) [2024-06-15 19:03:45,708][1651596] Signal inference workers to resume experience collection... (30450 times) [2024-06-15 19:03:45,709][1653645] InferenceWorker_p0-w0: resuming experience collection (30450 times) [2024-06-15 19:03:45,982][1648982] Fps is (10 sec: 52301.2, 60 sec: 45856.5, 300 sec: 45205.0). Total num frames: 1199177728. Throughput: 0: 11075.9. Samples: 299862528. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:45,983][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:03:49,032][1653645] Updated weights for policy 0, policy_version 585554 (0.0143) [2024-06-15 19:03:50,957][1648982] Fps is (10 sec: 32768.5, 60 sec: 44783.1, 300 sec: 44653.4). Total num frames: 1199374336. Throughput: 0: 11116.2. Samples: 299897856. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:03:51,247][1653645] Updated weights for policy 0, policy_version 585652 (0.0014) [2024-06-15 19:03:52,659][1653645] Updated weights for policy 0, policy_version 585712 (0.0011) [2024-06-15 19:03:55,958][1648982] Fps is (10 sec: 39417.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1199570944. Throughput: 0: 10774.8. Samples: 299955200. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:03:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:03:57,831][1653645] Updated weights for policy 0, policy_version 585744 (0.0013) [2024-06-15 19:04:00,957][1648982] Fps is (10 sec: 32767.9, 60 sec: 43690.7, 300 sec: 44320.2). 
Total num frames: 1199702016. Throughput: 0: 11173.0. Samples: 300027392. Policy #0 lag: (min: 15.0, avg: 90.7, max: 271.0) [2024-06-15 19:04:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:04:02,320][1653645] Updated weights for policy 0, policy_version 585856 (0.0013) [2024-06-15 19:04:04,619][1653645] Updated weights for policy 0, policy_version 585942 (0.0014) [2024-06-15 19:04:05,959][1648982] Fps is (10 sec: 52424.7, 60 sec: 43690.0, 300 sec: 44986.4). Total num frames: 1200095232. Throughput: 0: 10865.6. Samples: 300050944. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:04:10,264][1653645] Updated weights for policy 0, policy_version 585985 (0.0011) [2024-06-15 19:04:10,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 42598.6, 300 sec: 44209.0). Total num frames: 1200160768. Throughput: 0: 10672.4. Samples: 300115456. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:04:11,441][1653645] Updated weights for policy 0, policy_version 586046 (0.0011) [2024-06-15 19:04:14,089][1653645] Updated weights for policy 0, policy_version 586098 (0.0012) [2024-06-15 19:04:15,604][1653645] Updated weights for policy 0, policy_version 586162 (0.0011) [2024-06-15 19:04:15,958][1648982] Fps is (10 sec: 39324.6, 60 sec: 44236.6, 300 sec: 44875.5). Total num frames: 1200488448. Throughput: 0: 10843.0. Samples: 300180480. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:15,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:04:17,194][1653645] Updated weights for policy 0, policy_version 586236 (0.0011) [2024-06-15 19:04:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1200619520. Throughput: 0: 10592.7. Samples: 300207616. 
Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:04:23,716][1653645] Updated weights for policy 0, policy_version 586288 (0.0022) [2024-06-15 19:04:24,670][1653645] Updated weights for policy 0, policy_version 586320 (0.0010) [2024-06-15 19:04:25,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1200881664. Throughput: 0: 10865.8. Samples: 300282880. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:04:26,971][1653645] Updated weights for policy 0, policy_version 586402 (0.0014) [2024-06-15 19:04:27,531][1653645] Updated weights for policy 0, policy_version 586430 (0.0010) [2024-06-15 19:04:28,157][1651596] Signal inference workers to stop experience collection... (30500 times) [2024-06-15 19:04:28,243][1653645] InferenceWorker_p0-w0: stopping experience collection (30500 times) [2024-06-15 19:04:28,458][1651596] Signal inference workers to resume experience collection... (30500 times) [2024-06-15 19:04:28,459][1653645] InferenceWorker_p0-w0: resuming experience collection (30500 times) [2024-06-15 19:04:29,311][1653645] Updated weights for policy 0, policy_version 586492 (0.0011) [2024-06-15 19:04:30,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 1201143808. Throughput: 0: 10643.9. Samples: 300341248. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:04:35,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 42598.4, 300 sec: 44209.1). Total num frames: 1201209344. Throughput: 0: 10683.7. Samples: 300378624. 
Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:04:36,213][1653645] Updated weights for policy 0, policy_version 586549 (0.0011) [2024-06-15 19:04:37,484][1653645] Updated weights for policy 0, policy_version 586592 (0.0013) [2024-06-15 19:04:39,195][1653645] Updated weights for policy 0, policy_version 586657 (0.0010) [2024-06-15 19:04:40,828][1653645] Updated weights for policy 0, policy_version 586704 (0.0012) [2024-06-15 19:04:40,960][1648982] Fps is (10 sec: 42599.5, 60 sec: 42052.2, 300 sec: 44542.3). Total num frames: 1201569792. Throughput: 0: 10717.9. Samples: 300437504. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:40,968][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:04:45,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 41523.1, 300 sec: 43986.9). Total num frames: 1201668096. Throughput: 0: 10661.0. Samples: 300507136. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:04:47,189][1653645] Updated weights for policy 0, policy_version 586758 (0.0014) [2024-06-15 19:04:49,246][1653645] Updated weights for policy 0, policy_version 586833 (0.0012) [2024-06-15 19:04:50,605][1653645] Updated weights for policy 0, policy_version 586882 (0.0014) [2024-06-15 19:04:50,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43144.2, 300 sec: 44431.1). Total num frames: 1201963008. Throughput: 0: 10854.6. Samples: 300539392. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:50,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:04:51,897][1653645] Updated weights for policy 0, policy_version 586944 (0.0015) [2024-06-15 19:04:54,308][1653645] Updated weights for policy 0, policy_version 587004 (0.0014) [2024-06-15 19:04:55,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1202192384. 
Throughput: 0: 10706.5. Samples: 300597248. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:04:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:04:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000587008_1202192384.pth... [2024-06-15 19:04:56,031][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000581872_1191673856.pth [2024-06-15 19:05:00,814][1653645] Updated weights for policy 0, policy_version 587059 (0.0054) [2024-06-15 19:05:00,958][1648982] Fps is (10 sec: 32768.8, 60 sec: 43144.4, 300 sec: 43875.8). Total num frames: 1202290688. Throughput: 0: 10877.2. Samples: 300669952. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:05:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:05:02,323][1653645] Updated weights for policy 0, policy_version 587120 (0.0011) [2024-06-15 19:05:05,497][1653645] Updated weights for policy 0, policy_version 587201 (0.0015) [2024-06-15 19:05:05,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 42052.8, 300 sec: 43986.9). Total num frames: 1202618368. Throughput: 0: 10831.6. Samples: 300695040. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:05:05,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:05:06,710][1653645] Updated weights for policy 0, policy_version 587257 (0.0112) [2024-06-15 19:05:10,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 1202716672. Throughput: 0: 10683.7. Samples: 300763648. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:05:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:05:12,602][1653645] Updated weights for policy 0, policy_version 587312 (0.0012) [2024-06-15 19:05:13,857][1651596] Signal inference workers to stop experience collection... 
(30550 times) [2024-06-15 19:05:13,908][1653645] InferenceWorker_p0-w0: stopping experience collection (30550 times) [2024-06-15 19:05:14,110][1651596] Signal inference workers to resume experience collection... (30550 times) [2024-06-15 19:05:14,111][1653645] InferenceWorker_p0-w0: resuming experience collection (30550 times) [2024-06-15 19:05:14,551][1653645] Updated weights for policy 0, policy_version 587393 (0.0012) [2024-06-15 19:05:15,776][1653645] Updated weights for policy 0, policy_version 587456 (0.0015) [2024-06-15 19:05:15,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1203109888. Throughput: 0: 10752.0. Samples: 300825088. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:05:15,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:05:18,051][1653645] Updated weights for policy 0, policy_version 587511 (0.0014) [2024-06-15 19:05:20,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.4, 300 sec: 43986.9). Total num frames: 1203240960. Throughput: 0: 10683.7. Samples: 300859392. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:05:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:05:23,792][1653645] Updated weights for policy 0, policy_version 587576 (0.0044) [2024-06-15 19:05:25,456][1653645] Updated weights for policy 0, policy_version 587632 (0.0012) [2024-06-15 19:05:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 1203503104. Throughput: 0: 11093.3. Samples: 300936704. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:05:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:05:26,833][1653645] Updated weights for policy 0, policy_version 587709 (0.0014) [2024-06-15 19:05:29,687][1653645] Updated weights for policy 0, policy_version 587770 (0.0014) [2024-06-15 19:05:30,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1203765248. 
Throughput: 0: 10774.8. Samples: 300992000. Policy #0 lag: (min: 2.0, avg: 85.2, max: 258.0) [2024-06-15 19:05:30,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 19:05:35,645][1653645] Updated weights for policy 0, policy_version 587831 (0.0012) [2024-06-15 19:05:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 1203896320. Throughput: 0: 11013.7. Samples: 301035008. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:05:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:05:38,804][1653645] Updated weights for policy 0, policy_version 587952 (0.0128) [2024-06-15 19:05:40,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 43987.5). Total num frames: 1204158464. Throughput: 0: 10945.4. Samples: 301089792. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:05:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:05:41,045][1653645] Updated weights for policy 0, policy_version 587984 (0.0046) [2024-06-15 19:05:45,962][1648982] Fps is (10 sec: 39304.0, 60 sec: 43687.4, 300 sec: 43764.1). Total num frames: 1204289536. Throughput: 0: 11103.6. Samples: 301169664. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:05:45,963][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 19:05:46,485][1653645] Updated weights for policy 0, policy_version 588048 (0.0013) [2024-06-15 19:05:47,311][1653645] Updated weights for policy 0, policy_version 588090 (0.0013) [2024-06-15 19:05:48,545][1653645] Updated weights for policy 0, policy_version 588144 (0.0012) [2024-06-15 19:05:49,678][1653645] Updated weights for policy 0, policy_version 588195 (0.0013) [2024-06-15 19:05:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.2, 300 sec: 44320.1). Total num frames: 1204682752. Throughput: 0: 11264.0. Samples: 301201920. 
Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:05:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:05:53,170][1653645] Updated weights for policy 0, policy_version 588278 (0.0013) [2024-06-15 19:05:55,958][1648982] Fps is (10 sec: 52452.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1204813824. Throughput: 0: 11298.1. Samples: 301272064. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:05:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:05:58,315][1653645] Updated weights for policy 0, policy_version 588322 (0.0013) [2024-06-15 19:05:58,600][1651596] Signal inference workers to stop experience collection... (30600 times) [2024-06-15 19:05:58,680][1653645] InferenceWorker_p0-w0: stopping experience collection (30600 times) [2024-06-15 19:05:58,790][1651596] Signal inference workers to resume experience collection... (30600 times) [2024-06-15 19:05:58,791][1653645] InferenceWorker_p0-w0: resuming experience collection (30600 times) [2024-06-15 19:05:59,120][1653645] Updated weights for policy 0, policy_version 588368 (0.0013) [2024-06-15 19:06:00,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 46967.5, 300 sec: 44431.2). Total num frames: 1205108736. Throughput: 0: 11457.5. Samples: 301340672. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:06:01,064][1653645] Updated weights for policy 0, policy_version 588448 (0.0098) [2024-06-15 19:06:01,797][1653645] Updated weights for policy 0, policy_version 588480 (0.0025) [2024-06-15 19:06:04,449][1653645] Updated weights for policy 0, policy_version 588533 (0.0013) [2024-06-15 19:06:05,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.2, 300 sec: 44209.0). Total num frames: 1205338112. Throughput: 0: 11434.7. Samples: 301373952. 
Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:06:09,338][1653645] Updated weights for policy 0, policy_version 588565 (0.0012) [2024-06-15 19:06:10,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 46421.3, 300 sec: 44097.9). Total num frames: 1205501952. Throughput: 0: 11446.0. Samples: 301451776. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:10,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:06:11,117][1653645] Updated weights for policy 0, policy_version 588642 (0.0013) [2024-06-15 19:06:12,732][1653645] Updated weights for policy 0, policy_version 588707 (0.0011) [2024-06-15 19:06:14,862][1653645] Updated weights for policy 0, policy_version 588754 (0.0013) [2024-06-15 19:06:15,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45329.2, 300 sec: 44320.1). Total num frames: 1205829632. Throughput: 0: 11468.8. Samples: 301508096. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:06:20,268][1653645] Updated weights for policy 0, policy_version 588816 (0.0011) [2024-06-15 19:06:20,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45329.3, 300 sec: 44320.1). Total num frames: 1205960704. Throughput: 0: 11446.0. Samples: 301550080. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:06:21,808][1653645] Updated weights for policy 0, policy_version 588880 (0.0017) [2024-06-15 19:06:24,128][1653645] Updated weights for policy 0, policy_version 588947 (0.0135) [2024-06-15 19:06:25,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 1206255616. Throughput: 0: 11639.5. Samples: 301613568. 
Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:06:26,364][1653645] Updated weights for policy 0, policy_version 589011 (0.0012) [2024-06-15 19:06:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1206386688. Throughput: 0: 11572.4. Samples: 301690368. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:06:31,577][1653645] Updated weights for policy 0, policy_version 589072 (0.0013) [2024-06-15 19:06:32,848][1653645] Updated weights for policy 0, policy_version 589124 (0.0014) [2024-06-15 19:06:34,074][1653645] Updated weights for policy 0, policy_version 589182 (0.0014) [2024-06-15 19:06:35,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 47513.5, 300 sec: 44653.3). Total num frames: 1206747136. Throughput: 0: 11571.2. Samples: 301722624. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:35,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:06:37,643][1653645] Updated weights for policy 0, policy_version 589267 (0.0111) [2024-06-15 19:06:38,612][1653645] Updated weights for policy 0, policy_version 589312 (0.0010) [2024-06-15 19:06:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1206910976. Throughput: 0: 11468.8. Samples: 301788160. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:06:43,239][1651596] Signal inference workers to stop experience collection... (30650 times) [2024-06-15 19:06:43,272][1653645] InferenceWorker_p0-w0: stopping experience collection (30650 times) [2024-06-15 19:06:43,538][1651596] Signal inference workers to resume experience collection... 
(30650 times) [2024-06-15 19:06:43,541][1653645] InferenceWorker_p0-w0: resuming experience collection (30650 times) [2024-06-15 19:06:45,180][1653645] Updated weights for policy 0, policy_version 589408 (0.0151) [2024-06-15 19:06:45,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 48063.3, 300 sec: 44875.5). Total num frames: 1207173120. Throughput: 0: 11502.9. Samples: 301858304. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:06:46,963][1653645] Updated weights for policy 0, policy_version 589488 (0.0016) [2024-06-15 19:06:49,731][1653645] Updated weights for policy 0, policy_version 589538 (0.0041) [2024-06-15 19:06:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1207435264. Throughput: 0: 11355.0. Samples: 301884928. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:06:54,747][1653645] Updated weights for policy 0, policy_version 589570 (0.0031) [2024-06-15 19:06:55,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1207566336. Throughput: 0: 11537.1. Samples: 301970944. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:06:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:06:56,229][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000589648_1207599104.pth... 
[2024-06-15 19:06:56,438][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000584448_1196949504.pth [2024-06-15 19:06:57,048][1653645] Updated weights for policy 0, policy_version 589668 (0.0014) [2024-06-15 19:06:59,148][1653645] Updated weights for policy 0, policy_version 589749 (0.0129) [2024-06-15 19:07:00,904][1653645] Updated weights for policy 0, policy_version 589778 (0.0012) [2024-06-15 19:07:00,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45875.2, 300 sec: 44542.2). Total num frames: 1207861248. Throughput: 0: 11468.8. Samples: 302024192. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:07:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:07:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1207959552. Throughput: 0: 11389.1. Samples: 302062592. Policy #0 lag: (min: 10.0, avg: 72.9, max: 266.0) [2024-06-15 19:07:05,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:07:06,672][1653645] Updated weights for policy 0, policy_version 589859 (0.0013) [2024-06-15 19:07:08,790][1653645] Updated weights for policy 0, policy_version 589944 (0.0098) [2024-06-15 19:07:10,748][1653645] Updated weights for policy 0, policy_version 590009 (0.0172) [2024-06-15 19:07:10,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 47513.7, 300 sec: 45097.6). Total num frames: 1208352768. Throughput: 0: 11434.7. Samples: 302128128. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:07:13,277][1653645] Updated weights for policy 0, policy_version 590064 (0.0077) [2024-06-15 19:07:15,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1208483840. Throughput: 0: 11309.5. Samples: 302199296. 
Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:07:18,433][1653645] Updated weights for policy 0, policy_version 590112 (0.0013) [2024-06-15 19:07:19,981][1653645] Updated weights for policy 0, policy_version 590165 (0.0012) [2024-06-15 19:07:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 46421.4, 300 sec: 44877.4). Total num frames: 1208745984. Throughput: 0: 11412.0. Samples: 302236160. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:07:21,887][1653645] Updated weights for policy 0, policy_version 590240 (0.0013) [2024-06-15 19:07:21,996][1651596] Signal inference workers to stop experience collection... (30700 times) [2024-06-15 19:07:22,050][1653645] InferenceWorker_p0-w0: stopping experience collection (30700 times) [2024-06-15 19:07:22,225][1651596] Signal inference workers to resume experience collection... (30700 times) [2024-06-15 19:07:22,230][1653645] InferenceWorker_p0-w0: resuming experience collection (30700 times) [2024-06-15 19:07:24,362][1653645] Updated weights for policy 0, policy_version 590304 (0.0118) [2024-06-15 19:07:25,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1209008128. Throughput: 0: 11218.5. Samples: 302292992. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:07:30,177][1653645] Updated weights for policy 0, policy_version 590368 (0.0012) [2024-06-15 19:07:30,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 1209106432. Throughput: 0: 11320.9. Samples: 302367744. 
Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:07:31,522][1653645] Updated weights for policy 0, policy_version 590416 (0.0013) [2024-06-15 19:07:32,632][1653645] Updated weights for policy 0, policy_version 590464 (0.0042) [2024-06-15 19:07:35,409][1653645] Updated weights for policy 0, policy_version 590544 (0.0142) [2024-06-15 19:07:35,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1209466880. Throughput: 0: 11423.2. Samples: 302398976. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:35,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:07:36,376][1653645] Updated weights for policy 0, policy_version 590592 (0.0011) [2024-06-15 19:07:40,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1209532416. Throughput: 0: 11082.0. Samples: 302469632. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:40,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:07:42,796][1653645] Updated weights for policy 0, policy_version 590672 (0.0018) [2024-06-15 19:07:45,196][1653645] Updated weights for policy 0, policy_version 590726 (0.0106) [2024-06-15 19:07:45,974][1648982] Fps is (10 sec: 39257.6, 60 sec: 44770.6, 300 sec: 44650.9). Total num frames: 1209860096. Throughput: 0: 11362.2. Samples: 302535680. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:45,975][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:07:46,340][1653645] Updated weights for policy 0, policy_version 590784 (0.0014) [2024-06-15 19:07:47,848][1653645] Updated weights for policy 0, policy_version 590848 (0.0016) [2024-06-15 19:07:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1210056704. Throughput: 0: 11218.5. Samples: 302567424. 
Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:07:53,988][1653645] Updated weights for policy 0, policy_version 590914 (0.0102) [2024-06-15 19:07:55,027][1653645] Updated weights for policy 0, policy_version 590966 (0.0110) [2024-06-15 19:07:55,958][1648982] Fps is (10 sec: 45950.8, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1210318848. Throughput: 0: 11377.8. Samples: 302640128. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:07:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:07:56,592][1653645] Updated weights for policy 0, policy_version 590992 (0.0024) [2024-06-15 19:07:58,307][1653645] Updated weights for policy 0, policy_version 591059 (0.0026) [2024-06-15 19:08:00,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1210580992. Throughput: 0: 11320.9. Samples: 302708736. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:08:03,831][1653645] Updated weights for policy 0, policy_version 591120 (0.0013) [2024-06-15 19:08:05,738][1651596] Signal inference workers to stop experience collection... (30750 times) [2024-06-15 19:08:05,810][1653645] InferenceWorker_p0-w0: stopping experience collection (30750 times) [2024-06-15 19:08:05,812][1653645] Updated weights for policy 0, policy_version 591207 (0.0083) [2024-06-15 19:08:05,914][1651596] Signal inference workers to resume experience collection... (30750 times) [2024-06-15 19:08:05,916][1653645] InferenceWorker_p0-w0: resuming experience collection (30750 times) [2024-06-15 19:08:05,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 47513.8, 300 sec: 44764.5). Total num frames: 1210810368. Throughput: 0: 11468.8. Samples: 302752256. 
Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:08:07,766][1653645] Updated weights for policy 0, policy_version 591238 (0.0031) [2024-06-15 19:08:10,066][1653645] Updated weights for policy 0, policy_version 591332 (0.0019) [2024-06-15 19:08:10,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1211105280. Throughput: 0: 11594.0. Samples: 302814720. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:08:14,726][1653645] Updated weights for policy 0, policy_version 591376 (0.0012) [2024-06-15 19:08:15,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1211236352. Throughput: 0: 11571.2. Samples: 302888448. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:08:16,779][1653645] Updated weights for policy 0, policy_version 591472 (0.0025) [2024-06-15 19:08:19,341][1653645] Updated weights for policy 0, policy_version 591508 (0.0013) [2024-06-15 19:08:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46421.3, 300 sec: 45208.7). Total num frames: 1211531264. Throughput: 0: 11605.4. Samples: 302921216. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:08:21,234][1653645] Updated weights for policy 0, policy_version 591584 (0.0098) [2024-06-15 19:08:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1211629568. Throughput: 0: 11525.7. Samples: 302988288. 
Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:08:26,989][1653645] Updated weights for policy 0, policy_version 591664 (0.0112) [2024-06-15 19:08:28,539][1653645] Updated weights for policy 0, policy_version 591728 (0.0012) [2024-06-15 19:08:30,957][1648982] Fps is (10 sec: 36045.4, 60 sec: 46421.5, 300 sec: 44875.5). Total num frames: 1211891712. Throughput: 0: 11632.4. Samples: 303058944. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:30,957][1653645] Updated weights for policy 0, policy_version 591745 (0.0014) [2024-06-15 19:08:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:08:32,187][1653645] Updated weights for policy 0, policy_version 591800 (0.0012) [2024-06-15 19:08:33,818][1653645] Updated weights for policy 0, policy_version 591864 (0.0015) [2024-06-15 19:08:35,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 1212153856. Throughput: 0: 11571.2. Samples: 303088128. Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:08:38,086][1653645] Updated weights for policy 0, policy_version 591910 (0.0012) [2024-06-15 19:08:38,888][1653645] Updated weights for policy 0, policy_version 591940 (0.0013) [2024-06-15 19:08:40,178][1653645] Updated weights for policy 0, policy_version 592000 (0.0011) [2024-06-15 19:08:40,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 48059.8, 300 sec: 44879.2). Total num frames: 1212416000. Throughput: 0: 11400.5. Samples: 303153152. 
Policy #0 lag: (min: 65.0, avg: 138.1, max: 337.0) [2024-06-15 19:08:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:08:44,375][1653645] Updated weights for policy 0, policy_version 592065 (0.0015) [2024-06-15 19:08:45,612][1653645] Updated weights for policy 0, policy_version 592121 (0.0013) [2024-06-15 19:08:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 46980.3, 300 sec: 45097.6). Total num frames: 1212678144. Throughput: 0: 11423.3. Samples: 303222784. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:08:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:08:49,595][1651596] Signal inference workers to stop experience collection... (30800 times) [2024-06-15 19:08:49,662][1653645] InferenceWorker_p0-w0: stopping experience collection (30800 times) [2024-06-15 19:08:49,818][1651596] Signal inference workers to resume experience collection... (30800 times) [2024-06-15 19:08:49,820][1653645] InferenceWorker_p0-w0: resuming experience collection (30800 times) [2024-06-15 19:08:49,822][1653645] Updated weights for policy 0, policy_version 592176 (0.0025) [2024-06-15 19:08:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 46421.5, 300 sec: 44986.6). Total num frames: 1212841984. Throughput: 0: 11229.9. Samples: 303257600. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:08:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:08:51,743][1653645] Updated weights for policy 0, policy_version 592248 (0.0101) [2024-06-15 19:08:54,515][1653645] Updated weights for policy 0, policy_version 592278 (0.0017) [2024-06-15 19:08:55,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 1213071360. Throughput: 0: 11355.0. Samples: 303325696. 
Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:08:55,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:08:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000592320_1213071360.pth... [2024-06-15 19:08:56,106][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000587008_1202192384.pth [2024-06-15 19:08:56,863][1653645] Updated weights for policy 0, policy_version 592352 (0.0011) [2024-06-15 19:09:00,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.8, 300 sec: 44542.4). Total num frames: 1213235200. Throughput: 0: 11229.8. Samples: 303393792. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:09:01,116][1653645] Updated weights for policy 0, policy_version 592416 (0.0011) [2024-06-15 19:09:02,152][1653645] Updated weights for policy 0, policy_version 592451 (0.0012) [2024-06-15 19:09:03,516][1653645] Updated weights for policy 0, policy_version 592512 (0.0011) [2024-06-15 19:09:05,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44236.7, 300 sec: 45097.6). Total num frames: 1213464576. Throughput: 0: 11127.5. Samples: 303421952. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:09:07,138][1653645] Updated weights for policy 0, policy_version 592567 (0.0015) [2024-06-15 19:09:08,574][1653645] Updated weights for policy 0, policy_version 592624 (0.0012) [2024-06-15 19:09:10,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1213726720. Throughput: 0: 11218.5. Samples: 303493120. 
Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:10,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:09:13,744][1653645] Updated weights for policy 0, policy_version 592692 (0.0011) [2024-06-15 19:09:15,384][1653645] Updated weights for policy 0, policy_version 592758 (0.0014) [2024-06-15 19:09:15,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1213988864. Throughput: 0: 10990.9. Samples: 303553536. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:09:18,572][1653645] Updated weights for policy 0, policy_version 592800 (0.0011) [2024-06-15 19:09:19,724][1653645] Updated weights for policy 0, policy_version 592836 (0.0012) [2024-06-15 19:09:20,822][1653645] Updated weights for policy 0, policy_version 592894 (0.0030) [2024-06-15 19:09:20,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1214251008. Throughput: 0: 11332.2. Samples: 303598080. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:09:25,295][1653645] Updated weights for policy 0, policy_version 592944 (0.0011) [2024-06-15 19:09:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45875.3, 300 sec: 44875.6). Total num frames: 1214382080. Throughput: 0: 11195.8. Samples: 303656960. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:09:26,685][1653645] Updated weights for policy 0, policy_version 592995 (0.0022) [2024-06-15 19:09:30,469][1653645] Updated weights for policy 0, policy_version 593040 (0.0013) [2024-06-15 19:09:30,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44782.8, 300 sec: 45319.8). Total num frames: 1214578688. Throughput: 0: 11229.9. Samples: 303728128. 
Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:09:31,620][1653645] Updated weights for policy 0, policy_version 593088 (0.0011) [2024-06-15 19:09:31,760][1651596] Signal inference workers to stop experience collection... (30850 times) [2024-06-15 19:09:31,812][1653645] InferenceWorker_p0-w0: stopping experience collection (30850 times) [2024-06-15 19:09:31,929][1651596] Signal inference workers to resume experience collection... (30850 times) [2024-06-15 19:09:31,930][1653645] InferenceWorker_p0-w0: resuming experience collection (30850 times) [2024-06-15 19:09:32,723][1653645] Updated weights for policy 0, policy_version 593137 (0.0019) [2024-06-15 19:09:35,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1214808064. Throughput: 0: 11138.8. Samples: 303758848. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:09:36,595][1653645] Updated weights for policy 0, policy_version 593200 (0.0014) [2024-06-15 19:09:37,831][1653645] Updated weights for policy 0, policy_version 593236 (0.0012) [2024-06-15 19:09:40,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1215037440. Throughput: 0: 11036.5. Samples: 303822336. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:40,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:09:41,913][1653645] Updated weights for policy 0, policy_version 593281 (0.0017) [2024-06-15 19:09:44,891][1653645] Updated weights for policy 0, policy_version 593386 (0.0014) [2024-06-15 19:09:45,457][1653645] Updated weights for policy 0, policy_version 593408 (0.0012) [2024-06-15 19:09:45,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 45208.8). Total num frames: 1215299584. Throughput: 0: 10991.0. Samples: 303888384. 
Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:09:48,246][1653645] Updated weights for policy 0, policy_version 593465 (0.0012) [2024-06-15 19:09:50,217][1653645] Updated weights for policy 0, policy_version 593534 (0.0097) [2024-06-15 19:09:50,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.0, 300 sec: 45319.8). Total num frames: 1215561728. Throughput: 0: 11127.5. Samples: 303922688. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:09:55,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 44236.8, 300 sec: 45541.9). Total num frames: 1215725568. Throughput: 0: 11093.3. Samples: 303992320. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:09:55,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 19:09:56,303][1653645] Updated weights for policy 0, policy_version 593632 (0.0013) [2024-06-15 19:09:59,692][1653645] Updated weights for policy 0, policy_version 593701 (0.0013) [2024-06-15 19:10:00,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45329.1, 300 sec: 45208.8). Total num frames: 1215954944. Throughput: 0: 11195.7. Samples: 304057344. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:10:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:10:01,542][1653645] Updated weights for policy 0, policy_version 593746 (0.0014) [2024-06-15 19:10:04,875][1653645] Updated weights for policy 0, policy_version 593794 (0.0018) [2024-06-15 19:10:05,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 45875.3, 300 sec: 45764.1). Total num frames: 1216217088. Throughput: 0: 10945.4. Samples: 304090624. 
Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:10:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:10:07,729][1653645] Updated weights for policy 0, policy_version 593888 (0.0012) [2024-06-15 19:10:08,494][1653645] Updated weights for policy 0, policy_version 593920 (0.0029) [2024-06-15 19:10:10,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 44783.1, 300 sec: 45097.7). Total num frames: 1216413696. Throughput: 0: 11229.9. Samples: 304162304. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:10:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:10:13,556][1653645] Updated weights for policy 0, policy_version 594001 (0.0014) [2024-06-15 19:10:15,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.5, 300 sec: 45319.8). Total num frames: 1216610304. Throughput: 0: 11093.3. Samples: 304227328. Policy #0 lag: (min: 4.0, avg: 97.6, max: 260.0) [2024-06-15 19:10:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:10:17,095][1653645] Updated weights for policy 0, policy_version 594080 (0.0108) [2024-06-15 19:10:19,757][1651596] Signal inference workers to stop experience collection... (30900 times) [2024-06-15 19:10:19,810][1653645] InferenceWorker_p0-w0: stopping experience collection (30900 times) [2024-06-15 19:10:20,021][1651596] Signal inference workers to resume experience collection... (30900 times) [2024-06-15 19:10:20,032][1653645] InferenceWorker_p0-w0: resuming experience collection (30900 times) [2024-06-15 19:10:20,037][1653645] Updated weights for policy 0, policy_version 594160 (0.0016) [2024-06-15 19:10:20,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1216872448. Throughput: 0: 11104.7. Samples: 304258560. 
Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:10:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:10:23,313][1653645] Updated weights for policy 0, policy_version 594224 (0.0041) [2024-06-15 19:10:25,982][1648982] Fps is (10 sec: 45772.6, 60 sec: 44766.0, 300 sec: 45094.2). Total num frames: 1217069056. Throughput: 0: 11133.3. Samples: 304323584. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:10:25,986][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:10:26,067][1653645] Updated weights for policy 0, policy_version 594288 (0.0013) [2024-06-15 19:10:29,624][1653645] Updated weights for policy 0, policy_version 594340 (0.0121) [2024-06-15 19:10:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1217265664. Throughput: 0: 11173.0. Samples: 304391168. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:10:30,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:10:32,233][1653645] Updated weights for policy 0, policy_version 594400 (0.0096) [2024-06-15 19:10:34,959][1653645] Updated weights for policy 0, policy_version 594492 (0.0014) [2024-06-15 19:10:35,958][1648982] Fps is (10 sec: 45977.7, 60 sec: 45328.8, 300 sec: 45319.8). Total num frames: 1217527808. Throughput: 0: 11150.2. Samples: 304424448. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:10:35,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:10:36,940][1653645] Updated weights for policy 0, policy_version 594544 (0.0021) [2024-06-15 19:10:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 45431.6). Total num frames: 1217691648. Throughput: 0: 11161.6. Samples: 304494592. 
Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:10:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:10:41,448][1653645] Updated weights for policy 0, policy_version 594593 (0.0018) [2024-06-15 19:10:42,102][1653645] Updated weights for policy 0, policy_version 594624 (0.0010) [2024-06-15 19:10:43,971][1653645] Updated weights for policy 0, policy_version 594688 (0.0017) [2024-06-15 19:10:45,966][1648982] Fps is (10 sec: 45846.1, 60 sec: 44777.9, 300 sec: 45096.7). Total num frames: 1217986560. Throughput: 0: 11228.2. Samples: 304562688. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:10:45,971][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:10:46,545][1653645] Updated weights for policy 0, policy_version 594752 (0.0014) [2024-06-15 19:10:48,720][1653645] Updated weights for policy 0, policy_version 594814 (0.0013) [2024-06-15 19:10:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1218183168. Throughput: 0: 11161.6. Samples: 304592896. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:10:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:10:53,446][1653645] Updated weights for policy 0, policy_version 594875 (0.0012) [2024-06-15 19:10:55,728][1653645] Updated weights for policy 0, policy_version 594944 (0.0095) [2024-06-15 19:10:55,958][1648982] Fps is (10 sec: 45904.3, 60 sec: 45329.0, 300 sec: 45208.7). Total num frames: 1218445312. Throughput: 0: 11195.7. Samples: 304666112. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:10:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:10:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000594944_1218445312.pth... 
[2024-06-15 19:10:56,073][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000589648_1207599104.pth [2024-06-15 19:10:56,080][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000594944_1218445312.pth [2024-06-15 19:10:57,686][1653645] Updated weights for policy 0, policy_version 595007 (0.0013) [2024-06-15 19:11:00,297][1653645] Updated weights for policy 0, policy_version 595058 (0.0132) [2024-06-15 19:11:00,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1218707456. Throughput: 0: 11173.0. Samples: 304730112. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:00,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:11:04,054][1653645] Updated weights for policy 0, policy_version 595108 (0.0116) [2024-06-15 19:11:05,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 43690.7, 300 sec: 45208.8). Total num frames: 1218838528. Throughput: 0: 11400.6. Samples: 304771584. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:11:06,271][1653645] Updated weights for policy 0, policy_version 595168 (0.0013) [2024-06-15 19:11:07,967][1651596] Signal inference workers to stop experience collection... (30950 times) [2024-06-15 19:11:08,009][1653645] InferenceWorker_p0-w0: stopping experience collection (30950 times) [2024-06-15 19:11:08,314][1651596] Signal inference workers to resume experience collection... (30950 times) [2024-06-15 19:11:08,316][1653645] InferenceWorker_p0-w0: resuming experience collection (30950 times) [2024-06-15 19:11:08,532][1653645] Updated weights for policy 0, policy_version 595234 (0.0013) [2024-06-15 19:11:10,959][1648982] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 1219100672. Throughput: 0: 11383.5. Samples: 304835584. 
Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:10,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:11:11,297][1653645] Updated weights for policy 0, policy_version 595285 (0.0013) [2024-06-15 19:11:14,367][1653645] Updated weights for policy 0, policy_version 595329 (0.0012) [2024-06-15 19:11:15,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45875.3, 300 sec: 45430.9). Total num frames: 1219362816. Throughput: 0: 11355.0. Samples: 304902144. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:11:17,442][1653645] Updated weights for policy 0, policy_version 595393 (0.0020) [2024-06-15 19:11:19,708][1653645] Updated weights for policy 0, policy_version 595460 (0.0043) [2024-06-15 19:11:20,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1219592192. Throughput: 0: 11400.6. Samples: 304937472. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:11:21,107][1653645] Updated weights for policy 0, policy_version 595520 (0.0012) [2024-06-15 19:11:24,115][1653645] Updated weights for policy 0, policy_version 595584 (0.0112) [2024-06-15 19:11:25,961][1648982] Fps is (10 sec: 39307.0, 60 sec: 44797.0, 300 sec: 45319.2). Total num frames: 1219756032. Throughput: 0: 11217.6. Samples: 304999424. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:25,962][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:11:27,867][1653645] Updated weights for policy 0, policy_version 595648 (0.0012) [2024-06-15 19:11:30,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 1219952640. Throughput: 0: 11220.2. Samples: 305067520. 
Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:30,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 19:11:31,303][1653645] Updated weights for policy 0, policy_version 595710 (0.0114) [2024-06-15 19:11:33,056][1653645] Updated weights for policy 0, policy_version 595767 (0.0010) [2024-06-15 19:11:35,457][1653645] Updated weights for policy 0, policy_version 595812 (0.0013) [2024-06-15 19:11:35,966][1648982] Fps is (10 sec: 49128.4, 60 sec: 45322.9, 300 sec: 45207.4). Total num frames: 1220247552. Throughput: 0: 11284.6. Samples: 305100800. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:35,967][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:11:39,716][1653645] Updated weights for policy 0, policy_version 595876 (0.0013) [2024-06-15 19:11:40,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 1220411392. Throughput: 0: 11093.4. Samples: 305165312. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:11:42,648][1653645] Updated weights for policy 0, policy_version 595922 (0.0012) [2024-06-15 19:11:43,740][1653645] Updated weights for policy 0, policy_version 595969 (0.0029) [2024-06-15 19:11:45,958][1648982] Fps is (10 sec: 42634.5, 60 sec: 44787.9, 300 sec: 44875.5). Total num frames: 1220673536. Throughput: 0: 11002.3. Samples: 305225216. Policy #0 lag: (min: 15.0, avg: 118.8, max: 271.0) [2024-06-15 19:11:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:11:47,548][1653645] Updated weights for policy 0, policy_version 596038 (0.0022) [2024-06-15 19:11:50,905][1653645] Updated weights for policy 0, policy_version 596112 (0.0039) [2024-06-15 19:11:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1220837376. Throughput: 0: 10854.4. Samples: 305260032. 
Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:11:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:11:53,962][1653645] Updated weights for policy 0, policy_version 596162 (0.0013) [2024-06-15 19:11:55,085][1653645] Updated weights for policy 0, policy_version 596224 (0.0014) [2024-06-15 19:11:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.1, 300 sec: 44986.6). Total num frames: 1221132288. Throughput: 0: 10956.8. Samples: 305328640. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:11:55,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:11:56,032][1651596] Signal inference workers to stop experience collection... (31000 times) [2024-06-15 19:11:56,083][1653645] InferenceWorker_p0-w0: stopping experience collection (31000 times) [2024-06-15 19:11:56,282][1651596] Signal inference workers to resume experience collection... (31000 times) [2024-06-15 19:11:56,283][1653645] InferenceWorker_p0-w0: resuming experience collection (31000 times) [2024-06-15 19:11:56,493][1653645] Updated weights for policy 0, policy_version 596283 (0.0014) [2024-06-15 19:12:00,828][1653645] Updated weights for policy 0, policy_version 596352 (0.0017) [2024-06-15 19:12:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1221328896. Throughput: 0: 11025.1. Samples: 305398272. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:12:03,184][1653645] Updated weights for policy 0, policy_version 596414 (0.0013) [2024-06-15 19:12:05,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1221459968. Throughput: 0: 10877.2. Samples: 305426944. 
Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:12:07,672][1653645] Updated weights for policy 0, policy_version 596496 (0.0015) [2024-06-15 19:12:08,941][1653645] Updated weights for policy 0, policy_version 596544 (0.0013) [2024-06-15 19:12:10,960][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1221722112. Throughput: 0: 10934.9. Samples: 305491456. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:10,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:12:13,851][1653645] Updated weights for policy 0, policy_version 596609 (0.0016) [2024-06-15 19:12:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1221984256. Throughput: 0: 10854.4. Samples: 305555968. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:12:17,862][1653645] Updated weights for policy 0, policy_version 596673 (0.0127) [2024-06-15 19:12:19,615][1653645] Updated weights for policy 0, policy_version 596752 (0.0014) [2024-06-15 19:12:20,994][1648982] Fps is (10 sec: 52237.6, 60 sec: 44209.8, 300 sec: 44869.9). Total num frames: 1222246400. Throughput: 0: 10984.1. Samples: 305595392. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:20,995][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:12:24,267][1653645] Updated weights for policy 0, policy_version 596816 (0.0012) [2024-06-15 19:12:25,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44239.4, 300 sec: 45097.6). Total num frames: 1222410240. Throughput: 0: 10990.9. Samples: 305659904. 
Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:25,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:12:26,080][1653645] Updated weights for policy 0, policy_version 596896 (0.0084) [2024-06-15 19:12:29,617][1653645] Updated weights for policy 0, policy_version 596931 (0.0012) [2024-06-15 19:12:30,958][1648982] Fps is (10 sec: 36177.6, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1222606848. Throughput: 0: 11229.9. Samples: 305730560. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:12:32,008][1653645] Updated weights for policy 0, policy_version 597024 (0.0116) [2024-06-15 19:12:35,958][1648982] Fps is (10 sec: 36045.8, 60 sec: 42058.3, 300 sec: 44875.5). Total num frames: 1222770688. Throughput: 0: 10990.9. Samples: 305754624. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:12:36,233][1653645] Updated weights for policy 0, policy_version 597073 (0.0042) [2024-06-15 19:12:37,283][1653645] Updated weights for policy 0, policy_version 597121 (0.0013) [2024-06-15 19:12:38,616][1653645] Updated weights for policy 0, policy_version 597179 (0.0013) [2024-06-15 19:12:40,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 44655.8). Total num frames: 1223032832. Throughput: 0: 11082.0. Samples: 305827328. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:12:42,064][1651596] Signal inference workers to stop experience collection... (31050 times) [2024-06-15 19:12:42,119][1653645] InferenceWorker_p0-w0: stopping experience collection (31050 times) [2024-06-15 19:12:42,235][1651596] Signal inference workers to resume experience collection... 
(31050 times) [2024-06-15 19:12:42,236][1653645] InferenceWorker_p0-w0: resuming experience collection (31050 times) [2024-06-15 19:12:42,490][1653645] Updated weights for policy 0, policy_version 597236 (0.0011) [2024-06-15 19:12:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1223294976. Throughput: 0: 10899.9. Samples: 305888768. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:12:48,341][1653645] Updated weights for policy 0, policy_version 597314 (0.0015) [2024-06-15 19:12:50,091][1653645] Updated weights for policy 0, policy_version 597392 (0.0077) [2024-06-15 19:12:50,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1223557120. Throughput: 0: 11229.9. Samples: 305932288. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:50,958][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 19:12:51,956][1653645] Updated weights for policy 0, policy_version 597441 (0.0012) [2024-06-15 19:12:54,062][1653645] Updated weights for policy 0, policy_version 597520 (0.0013) [2024-06-15 19:12:55,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 44782.8, 300 sec: 44875.5). Total num frames: 1223819264. Throughput: 0: 11320.8. Samples: 306000896. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:12:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:12:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000597568_1223819264.pth... [2024-06-15 19:12:56,023][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000592320_1213071360.pth [2024-06-15 19:12:59,836][1653645] Updated weights for policy 0, policy_version 597600 (0.0013) [2024-06-15 19:13:00,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.9, 300 sec: 44653.3). Total num frames: 1223983104. Throughput: 0: 11355.0. 
Samples: 306066944. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:13:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:13:01,385][1653645] Updated weights for policy 0, policy_version 597668 (0.0016) [2024-06-15 19:13:04,201][1653645] Updated weights for policy 0, policy_version 597728 (0.0013) [2024-06-15 19:13:05,763][1653645] Updated weights for policy 0, policy_version 597776 (0.0013) [2024-06-15 19:13:05,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 46421.3, 300 sec: 44542.3). Total num frames: 1224245248. Throughput: 0: 11239.0. Samples: 306100736. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:13:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:13:10,925][1653645] Updated weights for policy 0, policy_version 597827 (0.0013) [2024-06-15 19:13:10,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1224343552. Throughput: 0: 11264.1. Samples: 306166784. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:13:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:13:12,555][1653645] Updated weights for policy 0, policy_version 597910 (0.0012) [2024-06-15 19:13:15,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1224638464. Throughput: 0: 11229.9. Samples: 306235904. Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:13:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:13:16,242][1653645] Updated weights for policy 0, policy_version 597984 (0.0014) [2024-06-15 19:13:17,905][1653645] Updated weights for policy 0, policy_version 598018 (0.0016) [2024-06-15 19:13:20,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43717.4, 300 sec: 44875.5). Total num frames: 1224867840. Throughput: 0: 11343.6. Samples: 306265088. 
Policy #0 lag: (min: 15.0, avg: 127.3, max: 271.0) [2024-06-15 19:13:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:13:22,706][1653645] Updated weights for policy 0, policy_version 598096 (0.0013) [2024-06-15 19:13:25,120][1653645] Updated weights for policy 0, policy_version 598192 (0.0012) [2024-06-15 19:13:25,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 1225129984. Throughput: 0: 11252.6. Samples: 306333696. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:13:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:13:27,618][1651596] Signal inference workers to stop experience collection... (31100 times) [2024-06-15 19:13:27,697][1653645] InferenceWorker_p0-w0: stopping experience collection (31100 times) [2024-06-15 19:13:27,911][1651596] Signal inference workers to resume experience collection... (31100 times) [2024-06-15 19:13:27,912][1653645] InferenceWorker_p0-w0: resuming experience collection (31100 times) [2024-06-15 19:13:28,556][1653645] Updated weights for policy 0, policy_version 598241 (0.0027) [2024-06-15 19:13:29,805][1653645] Updated weights for policy 0, policy_version 598288 (0.0011) [2024-06-15 19:13:30,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 1225392128. Throughput: 0: 11229.9. Samples: 306394112. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:13:30,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 19:13:34,933][1653645] Updated weights for policy 0, policy_version 598357 (0.0015) [2024-06-15 19:13:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1225523200. Throughput: 0: 11036.4. Samples: 306428928. 
Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:13:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:13:36,863][1653645] Updated weights for policy 0, policy_version 598448 (0.0013) [2024-06-15 19:13:40,065][1653645] Updated weights for policy 0, policy_version 598499 (0.0121) [2024-06-15 19:13:40,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1225785344. Throughput: 0: 11150.2. Samples: 306502656. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:13:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:13:41,474][1653645] Updated weights for policy 0, policy_version 598547 (0.0012) [2024-06-15 19:13:45,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 1225916416. Throughput: 0: 11172.9. Samples: 306569728. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:13:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:13:46,415][1653645] Updated weights for policy 0, policy_version 598608 (0.0014) [2024-06-15 19:13:48,667][1653645] Updated weights for policy 0, policy_version 598704 (0.0123) [2024-06-15 19:13:50,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1226178560. Throughput: 0: 11013.7. Samples: 306596352. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:13:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:13:51,986][1653645] Updated weights for policy 0, policy_version 598753 (0.0015) [2024-06-15 19:13:53,155][1653645] Updated weights for policy 0, policy_version 598788 (0.0013) [2024-06-15 19:13:55,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.9, 300 sec: 44764.4). Total num frames: 1226440704. Throughput: 0: 10934.0. Samples: 306658816. 
Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:13:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:13:58,657][1653645] Updated weights for policy 0, policy_version 598851 (0.0017) [2024-06-15 19:14:00,608][1653645] Updated weights for policy 0, policy_version 598928 (0.0013) [2024-06-15 19:14:00,960][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1226604544. Throughput: 0: 10922.7. Samples: 306727424. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:00,961][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:14:01,851][1653645] Updated weights for policy 0, policy_version 598973 (0.0014) [2024-06-15 19:14:04,240][1653645] Updated weights for policy 0, policy_version 599032 (0.0013) [2024-06-15 19:14:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1226866688. Throughput: 0: 10968.2. Samples: 306758656. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:14:06,759][1653645] Updated weights for policy 0, policy_version 599104 (0.0103) [2024-06-15 19:14:10,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1226964992. Throughput: 0: 10888.5. Samples: 306823680. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:14:12,973][1653645] Updated weights for policy 0, policy_version 599171 (0.0012) [2024-06-15 19:14:13,308][1651596] Signal inference workers to stop experience collection... (31150 times) [2024-06-15 19:14:13,357][1653645] InferenceWorker_p0-w0: stopping experience collection (31150 times) [2024-06-15 19:14:13,565][1651596] Signal inference workers to resume experience collection... 
(31150 times) [2024-06-15 19:14:13,566][1653645] InferenceWorker_p0-w0: resuming experience collection (31150 times) [2024-06-15 19:14:14,352][1653645] Updated weights for policy 0, policy_version 599232 (0.0012) [2024-06-15 19:14:15,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1227325440. Throughput: 0: 10865.8. Samples: 306883072. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:14:16,214][1653645] Updated weights for policy 0, policy_version 599296 (0.0013) [2024-06-15 19:14:18,995][1653645] Updated weights for policy 0, policy_version 599353 (0.0012) [2024-06-15 19:14:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1227489280. Throughput: 0: 10968.2. Samples: 306922496. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 19:14:24,292][1653645] Updated weights for policy 0, policy_version 599424 (0.0013) [2024-06-15 19:14:25,724][1653645] Updated weights for policy 0, policy_version 599478 (0.0013) [2024-06-15 19:14:25,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1227751424. Throughput: 0: 10843.1. Samples: 306990592. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:14:27,065][1653645] Updated weights for policy 0, policy_version 599520 (0.0010) [2024-06-15 19:14:30,213][1653645] Updated weights for policy 0, policy_version 599584 (0.0015) [2024-06-15 19:14:30,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1228013568. Throughput: 0: 10706.5. Samples: 307051520. 
Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:30,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:14:35,322][1653645] Updated weights for policy 0, policy_version 599636 (0.0013) [2024-06-15 19:14:35,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1228079104. Throughput: 0: 10922.7. Samples: 307087872. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:14:37,284][1653645] Updated weights for policy 0, policy_version 599712 (0.0011) [2024-06-15 19:14:38,335][1653645] Updated weights for policy 0, policy_version 599760 (0.0012) [2024-06-15 19:14:40,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 1228406784. Throughput: 0: 10911.2. Samples: 307149824. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:14:41,603][1653645] Updated weights for policy 0, policy_version 599809 (0.0015) [2024-06-15 19:14:42,765][1653645] Updated weights for policy 0, policy_version 599867 (0.0013) [2024-06-15 19:14:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1228537856. Throughput: 0: 11047.8. Samples: 307224576. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:14:47,477][1653645] Updated weights for policy 0, policy_version 599906 (0.0012) [2024-06-15 19:14:49,018][1653645] Updated weights for policy 0, policy_version 599973 (0.0122) [2024-06-15 19:14:50,524][1653645] Updated weights for policy 0, policy_version 600037 (0.0011) [2024-06-15 19:14:50,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1228931072. Throughput: 0: 11127.5. Samples: 307259392. 
Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:14:53,659][1653645] Updated weights for policy 0, policy_version 600097 (0.0022) [2024-06-15 19:14:55,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1229062144. Throughput: 0: 11229.8. Samples: 307329024. Policy #0 lag: (min: 4.0, avg: 124.0, max: 266.0) [2024-06-15 19:14:55,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:14:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000600128_1229062144.pth... [2024-06-15 19:14:56,007][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000594944_1218445312.pth [2024-06-15 19:14:57,685][1653645] Updated weights for policy 0, policy_version 600148 (0.0012) [2024-06-15 19:14:58,068][1651596] Signal inference workers to stop experience collection... (31200 times) [2024-06-15 19:14:58,171][1653645] InferenceWorker_p0-w0: stopping experience collection (31200 times) [2024-06-15 19:14:58,380][1651596] Signal inference workers to resume experience collection... (31200 times) [2024-06-15 19:14:58,381][1653645] InferenceWorker_p0-w0: resuming experience collection (31200 times) [2024-06-15 19:14:59,387][1653645] Updated weights for policy 0, policy_version 600193 (0.0013) [2024-06-15 19:15:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1229324288. Throughput: 0: 11423.3. Samples: 307397120. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:00,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:15:01,078][1653645] Updated weights for policy 0, policy_version 600272 (0.0019) [2024-06-15 19:15:04,433][1653645] Updated weights for policy 0, policy_version 600352 (0.0016) [2024-06-15 19:15:05,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 1229586432. 
Throughput: 0: 11332.2. Samples: 307432448. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:05,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 19:15:09,840][1653645] Updated weights for policy 0, policy_version 600432 (0.0012) [2024-06-15 19:15:10,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1229717504. Throughput: 0: 11309.5. Samples: 307499520. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:15:12,334][1653645] Updated weights for policy 0, policy_version 600496 (0.0018) [2024-06-15 19:15:13,954][1653645] Updated weights for policy 0, policy_version 600570 (0.0031) [2024-06-15 19:15:15,982][1648982] Fps is (10 sec: 39224.9, 60 sec: 44218.6, 300 sec: 44427.5). Total num frames: 1229979648. Throughput: 0: 11405.7. Samples: 307565056. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:15,983][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:15:20,544][1653645] Updated weights for policy 0, policy_version 600656 (0.0017) [2024-06-15 19:15:20,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.9, 300 sec: 44323.5). Total num frames: 1230143488. Throughput: 0: 11252.6. Samples: 307594240. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:15:23,300][1653645] Updated weights for policy 0, policy_version 600705 (0.0013) [2024-06-15 19:15:24,956][1653645] Updated weights for policy 0, policy_version 600769 (0.0015) [2024-06-15 19:15:25,960][1648982] Fps is (10 sec: 45988.8, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 1230438400. Throughput: 0: 11332.3. Samples: 307659776. 
Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:25,961][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:15:28,992][1653645] Updated weights for policy 0, policy_version 600833 (0.0014) [2024-06-15 19:15:30,424][1653645] Updated weights for policy 0, policy_version 600896 (0.0013) [2024-06-15 19:15:30,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1230635008. Throughput: 0: 11241.2. Samples: 307730432. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:15:32,421][1653645] Updated weights for policy 0, policy_version 600959 (0.0013) [2024-06-15 19:15:35,912][1653645] Updated weights for policy 0, policy_version 601009 (0.0011) [2024-06-15 19:15:35,959][1648982] Fps is (10 sec: 42598.5, 60 sec: 46421.3, 300 sec: 44653.3). Total num frames: 1230864384. Throughput: 0: 11286.8. Samples: 307767296. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:35,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:15:37,528][1653645] Updated weights for policy 0, policy_version 601083 (0.0020) [2024-06-15 19:15:40,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44237.0, 300 sec: 44321.1). Total num frames: 1231060992. Throughput: 0: 11275.4. Samples: 307836416. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:15:42,004][1653645] Updated weights for policy 0, policy_version 601143 (0.0015) [2024-06-15 19:15:42,801][1651596] Signal inference workers to stop experience collection... (31250 times) [2024-06-15 19:15:42,899][1653645] InferenceWorker_p0-w0: stopping experience collection (31250 times) [2024-06-15 19:15:43,058][1651596] Signal inference workers to resume experience collection... 
(31250 times) [2024-06-15 19:15:43,059][1653645] InferenceWorker_p0-w0: resuming experience collection (31250 times) [2024-06-15 19:15:43,922][1653645] Updated weights for policy 0, policy_version 601216 (0.0013) [2024-06-15 19:15:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1231290368. Throughput: 0: 11286.8. Samples: 307905024. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:15:47,715][1653645] Updated weights for policy 0, policy_version 601284 (0.0014) [2024-06-15 19:15:48,609][1653645] Updated weights for policy 0, policy_version 601343 (0.0014) [2024-06-15 19:15:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1231552512. Throughput: 0: 11150.2. Samples: 307934208. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:15:53,539][1653645] Updated weights for policy 0, policy_version 601400 (0.0092) [2024-06-15 19:15:54,523][1653645] Updated weights for policy 0, policy_version 601441 (0.0012) [2024-06-15 19:15:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.4, 300 sec: 44431.2). Total num frames: 1231814656. Throughput: 0: 11252.6. Samples: 308005888. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:15:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:15:57,307][1653645] Updated weights for policy 0, policy_version 601490 (0.0015) [2024-06-15 19:15:58,123][1653645] Updated weights for policy 0, policy_version 601528 (0.0041) [2024-06-15 19:16:00,313][1653645] Updated weights for policy 0, policy_version 601593 (0.0013) [2024-06-15 19:16:00,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1232076800. Throughput: 0: 11349.8. Samples: 308075520. 
Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:16:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:16:03,996][1653645] Updated weights for policy 0, policy_version 601648 (0.0090) [2024-06-15 19:16:05,544][1653645] Updated weights for policy 0, policy_version 601728 (0.0015) [2024-06-15 19:16:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1232338944. Throughput: 0: 11605.3. Samples: 308116480. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:16:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:16:09,436][1653645] Updated weights for policy 0, policy_version 601780 (0.0025) [2024-06-15 19:16:10,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1232470016. Throughput: 0: 11605.3. Samples: 308182016. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:16:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:16:12,280][1653645] Updated weights for policy 0, policy_version 601856 (0.0012) [2024-06-15 19:16:15,938][1653645] Updated weights for policy 0, policy_version 601925 (0.0013) [2024-06-15 19:16:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45894.1, 300 sec: 44542.3). Total num frames: 1232732160. Throughput: 0: 11502.9. Samples: 308248064. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:16:15,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 19:16:16,907][1653645] Updated weights for policy 0, policy_version 601973 (0.0014) [2024-06-15 19:16:20,908][1653645] Updated weights for policy 0, policy_version 602016 (0.0033) [2024-06-15 19:16:20,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 46421.0, 300 sec: 44653.9). Total num frames: 1232928768. Throughput: 0: 11514.2. Samples: 308285440. 
Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:16:20,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:16:22,932][1653645] Updated weights for policy 0, policy_version 602080 (0.0014) [2024-06-15 19:16:23,730][1653645] Updated weights for policy 0, policy_version 602112 (0.0028) [2024-06-15 19:16:25,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 1233125376. Throughput: 0: 11514.3. Samples: 308354560. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:16:25,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:16:27,207][1651596] Signal inference workers to stop experience collection... (31300 times) [2024-06-15 19:16:27,248][1653645] InferenceWorker_p0-w0: stopping experience collection (31300 times) [2024-06-15 19:16:27,359][1651596] Signal inference workers to resume experience collection... (31300 times) [2024-06-15 19:16:27,360][1653645] InferenceWorker_p0-w0: resuming experience collection (31300 times) [2024-06-15 19:16:27,567][1653645] Updated weights for policy 0, policy_version 602195 (0.0018) [2024-06-15 19:16:30,958][1648982] Fps is (10 sec: 45876.9, 60 sec: 45875.3, 300 sec: 44543.6). Total num frames: 1233387520. Throughput: 0: 11446.1. Samples: 308420096. Policy #0 lag: (min: 32.0, avg: 116.9, max: 288.0) [2024-06-15 19:16:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:16:32,354][1653645] Updated weights for policy 0, policy_version 602261 (0.0023) [2024-06-15 19:16:34,746][1653645] Updated weights for policy 0, policy_version 602340 (0.0015) [2024-06-15 19:16:35,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 46421.4, 300 sec: 44875.5). Total num frames: 1233649664. Throughput: 0: 11491.6. Samples: 308451328. 
Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:16:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:16:38,210][1653645] Updated weights for policy 0, policy_version 602384 (0.0015) [2024-06-15 19:16:40,196][1653645] Updated weights for policy 0, policy_version 602468 (0.0012) [2024-06-15 19:16:40,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 47513.7, 300 sec: 44875.5). Total num frames: 1233911808. Throughput: 0: 11434.7. Samples: 308520448. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:16:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:16:44,952][1653645] Updated weights for policy 0, policy_version 602512 (0.0012) [2024-06-15 19:16:45,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 44653.3). Total num frames: 1234010112. Throughput: 0: 11332.3. Samples: 308585472. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:16:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:16:47,631][1653645] Updated weights for policy 0, policy_version 602618 (0.0013) [2024-06-15 19:16:50,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1234239488. Throughput: 0: 10956.8. Samples: 308609536. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:16:50,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:16:51,406][1653645] Updated weights for policy 0, policy_version 602688 (0.0013) [2024-06-15 19:16:53,413][1653645] Updated weights for policy 0, policy_version 602752 (0.0013) [2024-06-15 19:16:55,958][1648982] Fps is (10 sec: 42596.9, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 1234436096. Throughput: 0: 11002.2. Samples: 308677120. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:16:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:16:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000602752_1234436096.pth... 
[2024-06-15 19:16:56,004][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000597568_1223819264.pth [2024-06-15 19:16:57,984][1653645] Updated weights for policy 0, policy_version 602800 (0.0013) [2024-06-15 19:16:59,711][1653645] Updated weights for policy 0, policy_version 602866 (0.0133) [2024-06-15 19:17:00,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1234698240. Throughput: 0: 10934.0. Samples: 308740096. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:17:02,872][1653645] Updated weights for policy 0, policy_version 602912 (0.0034) [2024-06-15 19:17:04,883][1653645] Updated weights for policy 0, policy_version 602982 (0.0013) [2024-06-15 19:17:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1234960384. Throughput: 0: 10843.0. Samples: 308773376. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:17:08,992][1653645] Updated weights for policy 0, policy_version 603024 (0.0012) [2024-06-15 19:17:10,962][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1235124224. Throughput: 0: 10945.4. Samples: 308847104. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:10,962][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:17:11,252][1653645] Updated weights for policy 0, policy_version 603112 (0.0013) [2024-06-15 19:17:13,662][1653645] Updated weights for policy 0, policy_version 603137 (0.0013) [2024-06-15 19:17:14,085][1651596] Signal inference workers to stop experience collection... (31350 times) [2024-06-15 19:17:14,133][1653645] InferenceWorker_p0-w0: stopping experience collection (31350 times) [2024-06-15 19:17:14,383][1651596] Signal inference workers to resume experience collection... 
(31350 times) [2024-06-15 19:17:14,384][1653645] InferenceWorker_p0-w0: resuming experience collection (31350 times) [2024-06-15 19:17:15,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 43690.7, 300 sec: 44436.7). Total num frames: 1235353600. Throughput: 0: 10774.7. Samples: 308904960. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:17:16,056][1653645] Updated weights for policy 0, policy_version 603201 (0.0013) [2024-06-15 19:17:20,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 42598.6, 300 sec: 44320.1). Total num frames: 1235484672. Throughput: 0: 10820.3. Samples: 308938240. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:17:21,016][1653645] Updated weights for policy 0, policy_version 603280 (0.0092) [2024-06-15 19:17:22,273][1653645] Updated weights for policy 0, policy_version 603344 (0.0013) [2024-06-15 19:17:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 1235746816. Throughput: 0: 10888.5. Samples: 309010432. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:17:26,116][1653645] Updated weights for policy 0, policy_version 603408 (0.0023) [2024-06-15 19:17:27,856][1653645] Updated weights for policy 0, policy_version 603458 (0.0013) [2024-06-15 19:17:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1236008960. Throughput: 0: 10740.6. Samples: 309068800. 
Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:17:32,641][1653645] Updated weights for policy 0, policy_version 603536 (0.0013) [2024-06-15 19:17:34,046][1653645] Updated weights for policy 0, policy_version 603600 (0.0012) [2024-06-15 19:17:35,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1236271104. Throughput: 0: 11116.1. Samples: 309109760. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:17:37,687][1653645] Updated weights for policy 0, policy_version 603649 (0.0022) [2024-06-15 19:17:39,227][1653645] Updated weights for policy 0, policy_version 603708 (0.0013) [2024-06-15 19:17:40,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 44653.3). Total num frames: 1236467712. Throughput: 0: 10934.1. Samples: 309169152. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:40,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:17:44,844][1653645] Updated weights for policy 0, policy_version 603779 (0.0014) [2024-06-15 19:17:45,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 1236598784. Throughput: 0: 11104.8. Samples: 309239808. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:45,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:17:46,691][1653645] Updated weights for policy 0, policy_version 603842 (0.0012) [2024-06-15 19:17:47,693][1653645] Updated weights for policy 0, policy_version 603899 (0.0013) [2024-06-15 19:17:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 1236860928. Throughput: 0: 11036.5. Samples: 309270016. 
Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:17:51,389][1653645] Updated weights for policy 0, policy_version 603966 (0.0015) [2024-06-15 19:17:55,960][1648982] Fps is (10 sec: 45866.1, 60 sec: 43689.5, 300 sec: 44319.8). Total num frames: 1237057536. Throughput: 0: 10717.4. Samples: 309329408. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:17:55,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:17:57,628][1653645] Updated weights for policy 0, policy_version 604080 (0.0014) [2024-06-15 19:17:58,695][1651596] Signal inference workers to stop experience collection... (31400 times) [2024-06-15 19:17:58,739][1653645] InferenceWorker_p0-w0: stopping experience collection (31400 times) [2024-06-15 19:17:58,741][1653645] Updated weights for policy 0, policy_version 604132 (0.0039) [2024-06-15 19:17:58,957][1651596] Signal inference workers to resume experience collection... (31400 times) [2024-06-15 19:17:58,957][1653645] InferenceWorker_p0-w0: resuming experience collection (31400 times) [2024-06-15 19:18:00,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 1237319680. Throughput: 0: 11059.2. Samples: 309402624. Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:18:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:18:02,701][1653645] Updated weights for policy 0, policy_version 604180 (0.0013) [2024-06-15 19:18:03,493][1653645] Updated weights for policy 0, policy_version 604218 (0.0013) [2024-06-15 19:18:05,958][1648982] Fps is (10 sec: 52439.1, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 1237581824. Throughput: 0: 11070.6. Samples: 309436416. 
Policy #0 lag: (min: 0.0, avg: 94.0, max: 256.0) [2024-06-15 19:18:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:18:08,120][1653645] Updated weights for policy 0, policy_version 604304 (0.0012) [2024-06-15 19:18:09,796][1653645] Updated weights for policy 0, policy_version 604371 (0.0033) [2024-06-15 19:18:10,632][1653645] Updated weights for policy 0, policy_version 604415 (0.0013) [2024-06-15 19:18:10,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1237843968. Throughput: 0: 10911.3. Samples: 309501440. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:18:15,367][1653645] Updated weights for policy 0, policy_version 604464 (0.0017) [2024-06-15 19:18:15,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1237975040. Throughput: 0: 11195.7. Samples: 309572608. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:18:20,178][1653645] Updated weights for policy 0, policy_version 604562 (0.0012) [2024-06-15 19:18:20,960][1648982] Fps is (10 sec: 36035.0, 60 sec: 45327.0, 300 sec: 44319.7). Total num frames: 1238204416. Throughput: 0: 10842.4. Samples: 309597696. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:20,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:18:21,694][1653645] Updated weights for policy 0, policy_version 604624 (0.0012) [2024-06-15 19:18:25,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1238368256. Throughput: 0: 11013.6. Samples: 309664768. 
Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:18:27,342][1653645] Updated weights for policy 0, policy_version 604688 (0.0104) [2024-06-15 19:18:29,732][1653645] Updated weights for policy 0, policy_version 604792 (0.0013) [2024-06-15 19:18:30,958][1648982] Fps is (10 sec: 42610.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1238630400. Throughput: 0: 10740.6. Samples: 309723136. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:18:32,759][1653645] Updated weights for policy 0, policy_version 604833 (0.0018) [2024-06-15 19:18:34,609][1653645] Updated weights for policy 0, policy_version 604921 (0.0011) [2024-06-15 19:18:35,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1238892544. Throughput: 0: 10843.0. Samples: 309757952. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:18:40,273][1653645] Updated weights for policy 0, policy_version 604976 (0.0012) [2024-06-15 19:18:40,962][1648982] Fps is (10 sec: 42578.4, 60 sec: 43141.1, 300 sec: 44541.6). Total num frames: 1239056384. Throughput: 0: 11115.4. Samples: 309829632. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:40,963][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:18:44,177][1651596] Signal inference workers to stop experience collection... (31450 times) [2024-06-15 19:18:44,210][1653645] InferenceWorker_p0-w0: stopping experience collection (31450 times) [2024-06-15 19:18:44,388][1651596] Signal inference workers to resume experience collection... 
(31450 times) [2024-06-15 19:18:44,389][1653645] InferenceWorker_p0-w0: resuming experience collection (31450 times) [2024-06-15 19:18:44,391][1653645] Updated weights for policy 0, policy_version 605072 (0.0012) [2024-06-15 19:18:45,883][1653645] Updated weights for policy 0, policy_version 605136 (0.0012) [2024-06-15 19:18:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.1, 300 sec: 44542.3). Total num frames: 1239318528. Throughput: 0: 10877.1. Samples: 309892096. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:45,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:18:50,958][1648982] Fps is (10 sec: 36061.5, 60 sec: 42598.3, 300 sec: 43986.9). Total num frames: 1239416832. Throughput: 0: 10763.4. Samples: 309920768. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:50,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:18:51,772][1653645] Updated weights for policy 0, policy_version 605200 (0.0019) [2024-06-15 19:18:52,835][1653645] Updated weights for policy 0, policy_version 605249 (0.0011) [2024-06-15 19:18:54,034][1653645] Updated weights for policy 0, policy_version 605312 (0.0012) [2024-06-15 19:18:55,970][1648982] Fps is (10 sec: 35999.5, 60 sec: 43683.0, 300 sec: 44318.2). Total num frames: 1239678976. Throughput: 0: 10817.2. Samples: 309988352. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:18:55,971][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:18:56,257][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000605328_1239711744.pth... [2024-06-15 19:18:56,364][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000600128_1229062144.pth [2024-06-15 19:18:58,247][1653645] Updated weights for policy 0, policy_version 605408 (0.0014) [2024-06-15 19:19:00,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 1239941120. Throughput: 0: 10763.3. 
Samples: 310056960. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:00,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:19:02,859][1653645] Updated weights for policy 0, policy_version 605448 (0.0016) [2024-06-15 19:19:04,460][1653645] Updated weights for policy 0, policy_version 605523 (0.0013) [2024-06-15 19:19:05,460][1653645] Updated weights for policy 0, policy_version 605563 (0.0011) [2024-06-15 19:19:05,958][1648982] Fps is (10 sec: 52495.0, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1240203264. Throughput: 0: 10912.0. Samples: 310088704. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:19:08,357][1653645] Updated weights for policy 0, policy_version 605620 (0.0012) [2024-06-15 19:19:10,611][1653645] Updated weights for policy 0, policy_version 605680 (0.0013) [2024-06-15 19:19:10,965][1648982] Fps is (10 sec: 49117.9, 60 sec: 43139.4, 300 sec: 44430.1). Total num frames: 1240432640. Throughput: 0: 11057.5. Samples: 310162432. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:10,968][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 19:19:14,517][1653645] Updated weights for policy 0, policy_version 605716 (0.0025) [2024-06-15 19:19:15,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1240596480. Throughput: 0: 11138.9. Samples: 310224384. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:15,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 19:19:16,229][1653645] Updated weights for policy 0, policy_version 605779 (0.0091) [2024-06-15 19:19:18,736][1653645] Updated weights for policy 0, policy_version 605828 (0.0020) [2024-06-15 19:19:20,966][1648982] Fps is (10 sec: 42592.3, 60 sec: 44232.5, 300 sec: 44429.9). Total num frames: 1240858624. Throughput: 0: 11102.6. Samples: 310257664. 
Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:20,967][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 19:19:21,613][1653645] Updated weights for policy 0, policy_version 605889 (0.0013) [2024-06-15 19:19:22,832][1653645] Updated weights for policy 0, policy_version 605947 (0.0012) [2024-06-15 19:19:25,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1240989696. Throughput: 0: 11071.7. Samples: 310327808. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:19:27,111][1653645] Updated weights for policy 0, policy_version 606010 (0.0014) [2024-06-15 19:19:27,204][1651596] Signal inference workers to stop experience collection... (31500 times) [2024-06-15 19:19:27,264][1653645] InferenceWorker_p0-w0: stopping experience collection (31500 times) [2024-06-15 19:19:27,275][1651596] Signal inference workers to resume experience collection... (31500 times) [2024-06-15 19:19:27,285][1653645] InferenceWorker_p0-w0: resuming experience collection (31500 times) [2024-06-15 19:19:28,424][1653645] Updated weights for policy 0, policy_version 606076 (0.0013) [2024-06-15 19:19:30,958][1648982] Fps is (10 sec: 42635.0, 60 sec: 44236.8, 300 sec: 44764.4). Total num frames: 1241284608. Throughput: 0: 11195.7. Samples: 310395904. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:30,963][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:19:31,992][1653645] Updated weights for policy 0, policy_version 606139 (0.0012) [2024-06-15 19:19:34,760][1653645] Updated weights for policy 0, policy_version 606204 (0.0012) [2024-06-15 19:19:35,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1241513984. Throughput: 0: 11286.8. Samples: 310428672. 
Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:19:38,443][1653645] Updated weights for policy 0, policy_version 606264 (0.0104) [2024-06-15 19:19:39,667][1653645] Updated weights for policy 0, policy_version 606306 (0.0013) [2024-06-15 19:19:40,186][1653645] Updated weights for policy 0, policy_version 606336 (0.0046) [2024-06-15 19:19:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45332.6, 300 sec: 44875.5). Total num frames: 1241776128. Throughput: 0: 11233.0. Samples: 310493696. Policy #0 lag: (min: 11.0, avg: 102.2, max: 267.0) [2024-06-15 19:19:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:19:42,955][1653645] Updated weights for policy 0, policy_version 606395 (0.0012) [2024-06-15 19:19:45,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1242005504. Throughput: 0: 11377.8. Samples: 310568960. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:19:45,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 19:19:45,960][1653645] Updated weights for policy 0, policy_version 606456 (0.0013) [2024-06-15 19:19:49,672][1653645] Updated weights for policy 0, policy_version 606521 (0.0014) [2024-06-15 19:19:50,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 46967.6, 300 sec: 44653.4). Total num frames: 1242234880. Throughput: 0: 11468.8. Samples: 310604800. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:19:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:19:51,221][1653645] Updated weights for policy 0, policy_version 606584 (0.0013) [2024-06-15 19:19:53,796][1653645] Updated weights for policy 0, policy_version 606640 (0.0014) [2024-06-15 19:19:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45884.8, 300 sec: 44431.2). Total num frames: 1242431488. Throughput: 0: 11220.3. Samples: 310667264. 
Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:19:55,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 19:19:57,255][1653645] Updated weights for policy 0, policy_version 606704 (0.0069) [2024-06-15 19:20:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 1242628096. Throughput: 0: 11457.4. Samples: 310739968. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:20:01,350][1653645] Updated weights for policy 0, policy_version 606780 (0.0014) [2024-06-15 19:20:04,729][1653645] Updated weights for policy 0, policy_version 606850 (0.0015) [2024-06-15 19:20:05,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1242923008. Throughput: 0: 11254.8. Samples: 310764032. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:20:06,342][1653645] Updated weights for policy 0, policy_version 606912 (0.0093) [2024-06-15 19:20:10,065][1653645] Updated weights for policy 0, policy_version 606970 (0.0013) [2024-06-15 19:20:10,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44242.1, 300 sec: 44434.9). Total num frames: 1243086848. Throughput: 0: 11093.4. Samples: 310827008. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:20:13,491][1653645] Updated weights for policy 0, policy_version 607024 (0.0012) [2024-06-15 19:20:13,600][1651596] Signal inference workers to stop experience collection... (31550 times) [2024-06-15 19:20:13,689][1653645] InferenceWorker_p0-w0: stopping experience collection (31550 times) [2024-06-15 19:20:13,843][1651596] Signal inference workers to resume experience collection... 
(31550 times) [2024-06-15 19:20:13,844][1653645] InferenceWorker_p0-w0: resuming experience collection (31550 times) [2024-06-15 19:20:14,939][1653645] Updated weights for policy 0, policy_version 607078 (0.0013) [2024-06-15 19:20:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1243348992. Throughput: 0: 11173.0. Samples: 310898688. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:20:16,305][1653645] Updated weights for policy 0, policy_version 607105 (0.0015) [2024-06-15 19:20:17,673][1653645] Updated weights for policy 0, policy_version 607167 (0.0050) [2024-06-15 19:20:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44243.1, 300 sec: 44320.1). Total num frames: 1243512832. Throughput: 0: 11241.2. Samples: 310934528. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:20,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 19:20:21,660][1653645] Updated weights for policy 0, policy_version 607225 (0.0048) [2024-06-15 19:20:24,523][1653645] Updated weights for policy 0, policy_version 607289 (0.0013) [2024-06-15 19:20:25,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 46967.4, 300 sec: 44653.3). Total num frames: 1243807744. Throughput: 0: 11434.6. Samples: 311008256. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:20:25,960][1653645] Updated weights for policy 0, policy_version 607331 (0.0035) [2024-06-15 19:20:28,904][1653645] Updated weights for policy 0, policy_version 607394 (0.0012) [2024-06-15 19:20:30,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45329.1, 300 sec: 44542.3). Total num frames: 1244004352. Throughput: 0: 11184.4. Samples: 311072256. 
Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:20:31,754][1653645] Updated weights for policy 0, policy_version 607440 (0.0015) [2024-06-15 19:20:35,197][1653645] Updated weights for policy 0, policy_version 607510 (0.0014) [2024-06-15 19:20:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.0, 300 sec: 44764.4). Total num frames: 1244266496. Throughput: 0: 11161.5. Samples: 311107072. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:20:36,691][1653645] Updated weights for policy 0, policy_version 607570 (0.0012) [2024-06-15 19:20:38,982][1653645] Updated weights for policy 0, policy_version 607622 (0.0044) [2024-06-15 19:20:40,959][1648982] Fps is (10 sec: 52421.7, 60 sec: 45874.2, 300 sec: 44875.3). Total num frames: 1244528640. Throughput: 0: 11366.1. Samples: 311178752. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:40,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:20:43,382][1653645] Updated weights for policy 0, policy_version 607701 (0.0029) [2024-06-15 19:20:45,965][1648982] Fps is (10 sec: 39295.3, 60 sec: 44231.7, 300 sec: 44430.1). Total num frames: 1244659712. Throughput: 0: 11273.6. Samples: 311247360. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:45,966][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:20:46,574][1653645] Updated weights for policy 0, policy_version 607748 (0.0012) [2024-06-15 19:20:48,230][1653645] Updated weights for policy 0, policy_version 607809 (0.0013) [2024-06-15 19:20:50,810][1653645] Updated weights for policy 0, policy_version 607874 (0.0014) [2024-06-15 19:20:50,958][1648982] Fps is (10 sec: 39326.3, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 1244921856. Throughput: 0: 11514.3. Samples: 311282176. 
Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:20:55,352][1653645] Updated weights for policy 0, policy_version 607952 (0.0015) [2024-06-15 19:20:55,958][1648982] Fps is (10 sec: 45906.7, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1245118464. Throughput: 0: 11457.4. Samples: 311342592. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:20:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:20:56,277][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000608000_1245184000.pth... [2024-06-15 19:20:56,311][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000602752_1234436096.pth [2024-06-15 19:20:57,834][1653645] Updated weights for policy 0, policy_version 608016 (0.0013) [2024-06-15 19:20:59,144][1653645] Updated weights for policy 0, policy_version 608064 (0.0015) [2024-06-15 19:20:59,886][1651596] Signal inference workers to stop experience collection... (31600 times) [2024-06-15 19:20:59,902][1653645] InferenceWorker_p0-w0: stopping experience collection (31600 times) [2024-06-15 19:21:00,081][1651596] Signal inference workers to resume experience collection... (31600 times) [2024-06-15 19:21:00,082][1653645] InferenceWorker_p0-w0: resuming experience collection (31600 times) [2024-06-15 19:21:00,966][1648982] Fps is (10 sec: 49110.0, 60 sec: 46414.6, 300 sec: 44318.8). Total num frames: 1245413376. Throughput: 0: 11534.8. Samples: 311417856. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:21:00,967][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:21:02,022][1653645] Updated weights for policy 0, policy_version 608131 (0.0013) [2024-06-15 19:21:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1245577216. Throughput: 0: 11343.7. Samples: 311444992. 
Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:21:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:21:07,070][1653645] Updated weights for policy 0, policy_version 608224 (0.0079) [2024-06-15 19:21:09,962][1653645] Updated weights for policy 0, policy_version 608272 (0.0012) [2024-06-15 19:21:10,958][1648982] Fps is (10 sec: 39355.8, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1245806592. Throughput: 0: 11309.6. Samples: 311517184. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:21:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:21:12,501][1653645] Updated weights for policy 0, policy_version 608352 (0.0014) [2024-06-15 19:21:15,814][1653645] Updated weights for policy 0, policy_version 608419 (0.0069) [2024-06-15 19:21:15,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1246035968. Throughput: 0: 11150.2. Samples: 311574016. Policy #0 lag: (min: 21.0, avg: 143.7, max: 277.0) [2024-06-15 19:21:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:21:20,523][1653645] Updated weights for policy 0, policy_version 608510 (0.0013) [2024-06-15 19:21:20,960][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1246232576. Throughput: 0: 11070.6. Samples: 311605248. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:21:20,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:21:22,980][1653645] Updated weights for policy 0, policy_version 608566 (0.0181) [2024-06-15 19:21:25,936][1653645] Updated weights for policy 0, policy_version 608630 (0.0012) [2024-06-15 19:21:25,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 1246461952. Throughput: 0: 10991.2. Samples: 311673344. 
Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:21:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:21:28,051][1653645] Updated weights for policy 0, policy_version 608676 (0.0012) [2024-06-15 19:21:30,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1246625792. Throughput: 0: 10833.3. Samples: 311734784. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:21:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:21:31,885][1653645] Updated weights for policy 0, policy_version 608727 (0.0013) [2024-06-15 19:21:34,346][1653645] Updated weights for policy 0, policy_version 608803 (0.0140) [2024-06-15 19:21:35,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1246887936. Throughput: 0: 10865.8. Samples: 311771136. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:21:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:21:36,212][1653645] Updated weights for policy 0, policy_version 608833 (0.0013) [2024-06-15 19:21:37,640][1653645] Updated weights for policy 0, policy_version 608891 (0.0014) [2024-06-15 19:21:40,060][1653645] Updated weights for policy 0, policy_version 608946 (0.0012) [2024-06-15 19:21:40,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43691.4, 300 sec: 44542.2). Total num frames: 1247150080. Throughput: 0: 11059.1. Samples: 311840256. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:21:40,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:21:43,881][1653645] Updated weights for policy 0, policy_version 609008 (0.0014) [2024-06-15 19:21:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44788.1, 300 sec: 44431.2). Total num frames: 1247346688. Throughput: 0: 10856.5. Samples: 311906304. 
Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:21:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:21:46,082][1653645] Updated weights for policy 0, policy_version 609072 (0.0075) [2024-06-15 19:21:48,840][1651596] Signal inference workers to stop experience collection... (31650 times) [2024-06-15 19:21:48,898][1653645] InferenceWorker_p0-w0: stopping experience collection (31650 times) [2024-06-15 19:21:48,900][1653645] Updated weights for policy 0, policy_version 609123 (0.0022) [2024-06-15 19:21:49,049][1651596] Signal inference workers to resume experience collection... (31650 times) [2024-06-15 19:21:49,050][1653645] InferenceWorker_p0-w0: resuming experience collection (31650 times) [2024-06-15 19:21:50,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 44783.0, 300 sec: 44653.4). Total num frames: 1247608832. Throughput: 0: 10990.9. Samples: 311939584. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:21:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:21:51,203][1653645] Updated weights for policy 0, policy_version 609200 (0.0016) [2024-06-15 19:21:55,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1247805440. Throughput: 0: 10990.9. Samples: 312011776. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:21:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:21:56,801][1653645] Updated weights for policy 0, policy_version 609296 (0.0014) [2024-06-15 19:21:58,019][1653645] Updated weights for policy 0, policy_version 609343 (0.0012) [2024-06-15 19:22:00,930][1653645] Updated weights for policy 0, policy_version 609404 (0.0049) [2024-06-15 19:22:00,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43697.0, 300 sec: 44320.2). Total num frames: 1248034816. Throughput: 0: 11059.2. Samples: 312071680. 
Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:22:02,865][1653645] Updated weights for policy 0, policy_version 609456 (0.0014) [2024-06-15 19:22:05,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 1248198656. Throughput: 0: 11195.7. Samples: 312109056. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:22:06,642][1653645] Updated weights for policy 0, policy_version 609491 (0.0053) [2024-06-15 19:22:09,232][1653645] Updated weights for policy 0, policy_version 609568 (0.0024) [2024-06-15 19:22:10,957][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1248460800. Throughput: 0: 11104.8. Samples: 312173056. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:22:12,521][1653645] Updated weights for policy 0, policy_version 609616 (0.0107) [2024-06-15 19:22:13,493][1653645] Updated weights for policy 0, policy_version 609664 (0.0017) [2024-06-15 19:22:15,512][1653645] Updated weights for policy 0, policy_version 609724 (0.0012) [2024-06-15 19:22:15,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1248722944. Throughput: 0: 11070.6. Samples: 312232960. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:15,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 19:22:20,020][1653645] Updated weights for policy 0, policy_version 609787 (0.0012) [2024-06-15 19:22:20,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1248886784. Throughput: 0: 11093.3. Samples: 312270336. 
Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:22:21,890][1653645] Updated weights for policy 0, policy_version 609852 (0.0013) [2024-06-15 19:22:25,087][1653645] Updated weights for policy 0, policy_version 609891 (0.0026) [2024-06-15 19:22:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1249116160. Throughput: 0: 11047.9. Samples: 312337408. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:22:27,294][1653645] Updated weights for policy 0, policy_version 609968 (0.0013) [2024-06-15 19:22:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1249280000. Throughput: 0: 11002.3. Samples: 312401408. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:22:31,652][1653645] Updated weights for policy 0, policy_version 610032 (0.0114) [2024-06-15 19:22:33,275][1653645] Updated weights for policy 0, policy_version 610080 (0.0014) [2024-06-15 19:22:33,728][1653645] Updated weights for policy 0, policy_version 610106 (0.0013) [2024-06-15 19:22:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1249542144. Throughput: 0: 10934.1. Samples: 312431616. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:22:36,925][1653645] Updated weights for policy 0, policy_version 610164 (0.0011) [2024-06-15 19:22:38,416][1651596] Signal inference workers to stop experience collection... 
(31700 times) [2024-06-15 19:22:38,508][1653645] InferenceWorker_p0-w0: stopping experience collection (31700 times) [2024-06-15 19:22:38,519][1653645] Updated weights for policy 0, policy_version 610200 (0.0177) [2024-06-15 19:22:38,592][1651596] Signal inference workers to resume experience collection... (31700 times) [2024-06-15 19:22:38,593][1653645] InferenceWorker_p0-w0: resuming experience collection (31700 times) [2024-06-15 19:22:39,356][1653645] Updated weights for policy 0, policy_version 610240 (0.0012) [2024-06-15 19:22:40,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 43690.8, 300 sec: 44653.3). Total num frames: 1249771520. Throughput: 0: 10888.5. Samples: 312501760. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:22:42,644][1653645] Updated weights for policy 0, policy_version 610288 (0.0014) [2024-06-15 19:22:44,329][1653645] Updated weights for policy 0, policy_version 610338 (0.0014) [2024-06-15 19:22:45,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 1250033664. Throughput: 0: 11161.6. Samples: 312573952. Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:22:48,258][1653645] Updated weights for policy 0, policy_version 610400 (0.0013) [2024-06-15 19:22:49,491][1653645] Updated weights for policy 0, policy_version 610448 (0.0067) [2024-06-15 19:22:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44782.8, 300 sec: 44875.8). Total num frames: 1250295808. Throughput: 0: 11173.0. Samples: 312611840. 
Policy #0 lag: (min: 15.0, avg: 120.7, max: 271.0) [2024-06-15 19:22:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:22:53,151][1653645] Updated weights for policy 0, policy_version 610502 (0.0017) [2024-06-15 19:22:54,416][1653645] Updated weights for policy 0, policy_version 610558 (0.0018) [2024-06-15 19:22:55,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44237.0, 300 sec: 44542.3). Total num frames: 1250459648. Throughput: 0: 11150.2. Samples: 312674816. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:22:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:22:56,530][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000610608_1250525184.pth... [2024-06-15 19:22:56,591][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000605328_1239711744.pth [2024-06-15 19:22:56,821][1653645] Updated weights for policy 0, policy_version 610617 (0.0087) [2024-06-15 19:23:00,294][1653645] Updated weights for policy 0, policy_version 610657 (0.0012) [2024-06-15 19:23:00,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1250689024. Throughput: 0: 11320.9. Samples: 312742400. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:23:01,903][1653645] Updated weights for policy 0, policy_version 610721 (0.0021) [2024-06-15 19:23:05,179][1653645] Updated weights for policy 0, policy_version 610755 (0.0015) [2024-06-15 19:23:05,968][1648982] Fps is (10 sec: 42555.5, 60 sec: 44775.6, 300 sec: 44207.5). Total num frames: 1250885632. Throughput: 0: 11159.1. Samples: 312772608. 
Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:05,969][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:23:06,401][1653645] Updated weights for policy 0, policy_version 610809 (0.0098) [2024-06-15 19:23:08,283][1653645] Updated weights for policy 0, policy_version 610864 (0.0013) [2024-06-15 19:23:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1251082240. Throughput: 0: 11150.2. Samples: 312839168. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:23:11,941][1653645] Updated weights for policy 0, policy_version 610928 (0.0149) [2024-06-15 19:23:13,865][1653645] Updated weights for policy 0, policy_version 610978 (0.0012) [2024-06-15 19:23:15,958][1648982] Fps is (10 sec: 45920.6, 60 sec: 43690.6, 300 sec: 44542.7). Total num frames: 1251344384. Throughput: 0: 11161.6. Samples: 312903680. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:23:17,534][1653645] Updated weights for policy 0, policy_version 611040 (0.0014) [2024-06-15 19:23:20,324][1653645] Updated weights for policy 0, policy_version 611124 (0.0122) [2024-06-15 19:23:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1251606528. Throughput: 0: 11309.5. Samples: 312940544. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:23:23,827][1653645] Updated weights for policy 0, policy_version 611171 (0.0013) [2024-06-15 19:23:24,728][1651596] Signal inference workers to stop experience collection... (31750 times) [2024-06-15 19:23:24,758][1653645] InferenceWorker_p0-w0: stopping experience collection (31750 times) [2024-06-15 19:23:24,979][1651596] Signal inference workers to resume experience collection... 
(31750 times) [2024-06-15 19:23:24,980][1653645] InferenceWorker_p0-w0: resuming experience collection (31750 times) [2024-06-15 19:23:25,437][1653645] Updated weights for policy 0, policy_version 611232 (0.0107) [2024-06-15 19:23:25,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1251835904. Throughput: 0: 11116.1. Samples: 313001984. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:23:29,507][1653645] Updated weights for policy 0, policy_version 611266 (0.0013) [2024-06-15 19:23:30,710][1653645] Updated weights for policy 0, policy_version 611328 (0.0013) [2024-06-15 19:23:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1251999744. Throughput: 0: 11104.7. Samples: 313073664. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:23:32,242][1653645] Updated weights for policy 0, policy_version 611390 (0.0023) [2024-06-15 19:23:35,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 44236.6, 300 sec: 44543.0). Total num frames: 1252196352. Throughput: 0: 11002.3. Samples: 313106944. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:23:37,396][1653645] Updated weights for policy 0, policy_version 611476 (0.0116) [2024-06-15 19:23:40,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1252392960. Throughput: 0: 10990.9. Samples: 313169408. 
Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:23:41,686][1653645] Updated weights for policy 0, policy_version 611529 (0.0089) [2024-06-15 19:23:43,497][1653645] Updated weights for policy 0, policy_version 611600 (0.0114) [2024-06-15 19:23:44,405][1653645] Updated weights for policy 0, policy_version 611643 (0.0013) [2024-06-15 19:23:45,960][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1252655104. Throughput: 0: 10979.5. Samples: 313236480. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:45,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:23:48,758][1653645] Updated weights for policy 0, policy_version 611712 (0.0012) [2024-06-15 19:23:50,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.8, 300 sec: 44877.4). Total num frames: 1252917248. Throughput: 0: 11016.1. Samples: 313268224. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:23:54,077][1653645] Updated weights for policy 0, policy_version 611779 (0.0015) [2024-06-15 19:23:55,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1253081088. Throughput: 0: 11093.3. Samples: 313338368. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:23:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:23:56,211][1653645] Updated weights for policy 0, policy_version 611872 (0.0093) [2024-06-15 19:23:59,988][1653645] Updated weights for policy 0, policy_version 611921 (0.0031) [2024-06-15 19:24:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1253277696. Throughput: 0: 10899.9. Samples: 313394176. 
Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:24:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:24:02,036][1653645] Updated weights for policy 0, policy_version 612016 (0.0124) [2024-06-15 19:24:05,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 42605.5, 300 sec: 44099.0). Total num frames: 1253441536. Throughput: 0: 10843.0. Samples: 313428480. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:24:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:24:06,309][1653645] Updated weights for policy 0, policy_version 612048 (0.0104) [2024-06-15 19:24:08,056][1651596] Signal inference workers to stop experience collection... (31800 times) [2024-06-15 19:24:08,148][1653645] InferenceWorker_p0-w0: stopping experience collection (31800 times) [2024-06-15 19:24:08,341][1651596] Signal inference workers to resume experience collection... (31800 times) [2024-06-15 19:24:08,343][1653645] InferenceWorker_p0-w0: resuming experience collection (31800 times) [2024-06-15 19:24:08,346][1653645] Updated weights for policy 0, policy_version 612128 (0.0015) [2024-06-15 19:24:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1253703680. Throughput: 0: 10786.1. Samples: 313487360. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:24:10,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 19:24:12,556][1653645] Updated weights for policy 0, policy_version 612195 (0.0013) [2024-06-15 19:24:14,347][1653645] Updated weights for policy 0, policy_version 612281 (0.0120) [2024-06-15 19:24:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.8, 300 sec: 44432.5). Total num frames: 1253965824. Throughput: 0: 10683.7. Samples: 313554432. 
Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:24:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:24:20,112][1653645] Updated weights for policy 0, policy_version 612368 (0.0013) [2024-06-15 19:24:20,958][1648982] Fps is (10 sec: 49151.0, 60 sec: 43144.3, 300 sec: 44764.4). Total num frames: 1254195200. Throughput: 0: 10831.6. Samples: 313594368. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:24:20,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:24:25,297][1653645] Updated weights for policy 0, policy_version 612464 (0.0013) [2024-06-15 19:24:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 44320.1). Total num frames: 1254359040. Throughput: 0: 10865.8. Samples: 313658368. Policy #0 lag: (min: 38.0, avg: 148.5, max: 294.0) [2024-06-15 19:24:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:24:30,058][1653645] Updated weights for policy 0, policy_version 612546 (0.0013) [2024-06-15 19:24:30,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 1254555648. Throughput: 0: 10661.0. Samples: 313716224. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:24:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:24:31,538][1653645] Updated weights for policy 0, policy_version 612604 (0.0017) [2024-06-15 19:24:32,780][1653645] Updated weights for policy 0, policy_version 612645 (0.0017) [2024-06-15 19:24:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 43986.9). Total num frames: 1254752256. Throughput: 0: 10638.2. Samples: 313746944. 
Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:24:35,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:24:36,343][1653645] Updated weights for policy 0, policy_version 612675 (0.0013) [2024-06-15 19:24:37,804][1653645] Updated weights for policy 0, policy_version 612736 (0.0010) [2024-06-15 19:24:40,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1255014400. Throughput: 0: 10524.5. Samples: 313811968. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:24:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:24:41,655][1653645] Updated weights for policy 0, policy_version 612801 (0.0014) [2024-06-15 19:24:42,921][1653645] Updated weights for policy 0, policy_version 612864 (0.0017) [2024-06-15 19:24:44,794][1653645] Updated weights for policy 0, policy_version 612928 (0.0039) [2024-06-15 19:24:45,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1255276544. Throughput: 0: 10774.7. Samples: 313879040. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:24:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:24:50,633][1653645] Updated weights for policy 0, policy_version 613027 (0.0012) [2024-06-15 19:24:50,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1255505920. Throughput: 0: 10911.3. Samples: 313919488. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:24:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:24:54,003][1653645] Updated weights for policy 0, policy_version 613072 (0.0013) [2024-06-15 19:24:54,150][1651596] Signal inference workers to stop experience collection... (31850 times) [2024-06-15 19:24:54,198][1653645] InferenceWorker_p0-w0: stopping experience collection (31850 times) [2024-06-15 19:24:54,355][1651596] Signal inference workers to resume experience collection... 
(31850 times) [2024-06-15 19:24:54,356][1653645] InferenceWorker_p0-w0: resuming experience collection (31850 times) [2024-06-15 19:24:55,112][1653645] Updated weights for policy 0, policy_version 613120 (0.0012) [2024-06-15 19:24:55,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43144.3, 300 sec: 44209.0). Total num frames: 1255669760. Throughput: 0: 10831.6. Samples: 313974784. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:24:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:24:55,970][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000613120_1255669760.pth... [2024-06-15 19:24:56,046][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000608000_1245184000.pth [2024-06-15 19:24:57,686][1653645] Updated weights for policy 0, policy_version 613183 (0.0014) [2024-06-15 19:25:00,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 42052.3, 300 sec: 43653.6). Total num frames: 1255800832. Throughput: 0: 10911.3. Samples: 314045440. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:25:02,068][1653645] Updated weights for policy 0, policy_version 613248 (0.0015) [2024-06-15 19:25:03,739][1653645] Updated weights for policy 0, policy_version 613304 (0.0013) [2024-06-15 19:25:05,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1256062976. Throughput: 0: 10649.7. Samples: 314073600. 
Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:25:06,472][1653645] Updated weights for policy 0, policy_version 613349 (0.0019) [2024-06-15 19:25:07,955][1653645] Updated weights for policy 0, policy_version 613394 (0.0013) [2024-06-15 19:25:09,081][1653645] Updated weights for policy 0, policy_version 613440 (0.0013) [2024-06-15 19:25:10,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 1256325120. Throughput: 0: 10706.4. Samples: 314140160. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:10,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:25:13,719][1653645] Updated weights for policy 0, policy_version 613500 (0.0016) [2024-06-15 19:25:15,055][1653645] Updated weights for policy 0, policy_version 613541 (0.0017) [2024-06-15 19:25:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1256587264. Throughput: 0: 10956.8. Samples: 314209280. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:25:18,197][1653645] Updated weights for policy 0, policy_version 613616 (0.0012) [2024-06-15 19:25:20,250][1653645] Updated weights for policy 0, policy_version 613684 (0.0019) [2024-06-15 19:25:20,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 44237.0, 300 sec: 44209.1). Total num frames: 1256849408. Throughput: 0: 11025.1. Samples: 314243072. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:20,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:25:24,571][1653645] Updated weights for policy 0, policy_version 613728 (0.0097) [2024-06-15 19:25:25,962][1648982] Fps is (10 sec: 39303.3, 60 sec: 43687.3, 300 sec: 43986.2). Total num frames: 1256980480. Throughput: 0: 11092.2. Samples: 314311168. 
Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:25,963][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:25:27,083][1653645] Updated weights for policy 0, policy_version 613792 (0.0012) [2024-06-15 19:25:30,298][1653645] Updated weights for policy 0, policy_version 613872 (0.0167) [2024-06-15 19:25:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1257242624. Throughput: 0: 10934.1. Samples: 314371072. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:30,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 19:25:32,302][1653645] Updated weights for policy 0, policy_version 613939 (0.0123) [2024-06-15 19:25:35,958][1648982] Fps is (10 sec: 39339.4, 60 sec: 43690.6, 300 sec: 43542.7). Total num frames: 1257373696. Throughput: 0: 10740.6. Samples: 314402816. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:25:37,180][1653645] Updated weights for policy 0, policy_version 613988 (0.0084) [2024-06-15 19:25:38,975][1653645] Updated weights for policy 0, policy_version 614048 (0.0022) [2024-06-15 19:25:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 43987.9). Total num frames: 1257635840. Throughput: 0: 11150.3. Samples: 314476544. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:25:41,723][1653645] Updated weights for policy 0, policy_version 614104 (0.0014) [2024-06-15 19:25:42,008][1651596] Signal inference workers to stop experience collection... (31900 times) [2024-06-15 19:25:42,060][1653645] InferenceWorker_p0-w0: stopping experience collection (31900 times) [2024-06-15 19:25:42,325][1651596] Signal inference workers to resume experience collection... 
(31900 times) [2024-06-15 19:25:42,326][1653645] InferenceWorker_p0-w0: resuming experience collection (31900 times) [2024-06-15 19:25:43,507][1653645] Updated weights for policy 0, policy_version 614164 (0.0014) [2024-06-15 19:25:45,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1257897984. Throughput: 0: 10945.4. Samples: 314537984. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:25:48,399][1653645] Updated weights for policy 0, policy_version 614224 (0.0012) [2024-06-15 19:25:49,549][1653645] Updated weights for policy 0, policy_version 614273 (0.0017) [2024-06-15 19:25:50,695][1653645] Updated weights for policy 0, policy_version 614335 (0.0011) [2024-06-15 19:25:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1258160128. Throughput: 0: 11218.5. Samples: 314578432. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:25:54,195][1653645] Updated weights for policy 0, policy_version 614400 (0.0015) [2024-06-15 19:25:55,730][1653645] Updated weights for policy 0, policy_version 614459 (0.0013) [2024-06-15 19:25:55,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 45875.3, 300 sec: 44099.2). Total num frames: 1258422272. Throughput: 0: 11241.2. Samples: 314646016. Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:25:55,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:26:00,151][1653645] Updated weights for policy 0, policy_version 614499 (0.0013) [2024-06-15 19:26:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1258553344. Throughput: 0: 11161.6. Samples: 314711552. 
Policy #0 lag: (min: 15.0, avg: 110.4, max: 271.0) [2024-06-15 19:26:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:26:02,120][1653645] Updated weights for policy 0, policy_version 614576 (0.0013) [2024-06-15 19:26:05,428][1653645] Updated weights for policy 0, policy_version 614644 (0.0013) [2024-06-15 19:26:05,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 45875.1, 300 sec: 44097.9). Total num frames: 1258815488. Throughput: 0: 11229.8. Samples: 314748416. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:05,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:26:06,785][1653645] Updated weights for policy 0, policy_version 614688 (0.0013) [2024-06-15 19:26:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1258946560. Throughput: 0: 11242.4. Samples: 314817024. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:26:11,124][1653645] Updated weights for policy 0, policy_version 614724 (0.0011) [2024-06-15 19:26:13,168][1653645] Updated weights for policy 0, policy_version 614801 (0.0012) [2024-06-15 19:26:15,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1259208704. Throughput: 0: 11195.7. Samples: 314874880. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:26:17,536][1653645] Updated weights for policy 0, policy_version 614880 (0.0012) [2024-06-15 19:26:19,112][1653645] Updated weights for policy 0, policy_version 614931 (0.0017) [2024-06-15 19:26:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1259470848. Throughput: 0: 11173.0. Samples: 314905600. 
Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:26:24,398][1653645] Updated weights for policy 0, policy_version 615012 (0.0013) [2024-06-15 19:26:25,622][1653645] Updated weights for policy 0, policy_version 615057 (0.0012) [2024-06-15 19:26:25,955][1651596] Signal inference workers to stop experience collection... (31950 times) [2024-06-15 19:26:25,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44786.4, 300 sec: 44209.0). Total num frames: 1259667456. Throughput: 0: 10945.4. Samples: 314969088. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:26:25,971][1653645] InferenceWorker_p0-w0: stopping experience collection (31950 times) [2024-06-15 19:26:26,253][1651596] Signal inference workers to resume experience collection... (31950 times) [2024-06-15 19:26:26,254][1653645] InferenceWorker_p0-w0: resuming experience collection (31950 times) [2024-06-15 19:26:29,047][1653645] Updated weights for policy 0, policy_version 615120 (0.0013) [2024-06-15 19:26:30,139][1653645] Updated weights for policy 0, policy_version 615164 (0.0012) [2024-06-15 19:26:30,959][1648982] Fps is (10 sec: 39317.7, 60 sec: 43689.9, 300 sec: 43986.7). Total num frames: 1259864064. Throughput: 0: 11115.8. Samples: 315038208. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:30,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:26:32,435][1653645] Updated weights for policy 0, policy_version 615232 (0.0015) [2024-06-15 19:26:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44783.0, 300 sec: 43764.8). Total num frames: 1260060672. Throughput: 0: 11036.4. Samples: 315075072. 
Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:26:36,898][1653645] Updated weights for policy 0, policy_version 615316 (0.0153) [2024-06-15 19:26:40,421][1653645] Updated weights for policy 0, policy_version 615364 (0.0023) [2024-06-15 19:26:40,958][1648982] Fps is (10 sec: 42602.7, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 1260290048. Throughput: 0: 11013.8. Samples: 315141632. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:26:43,094][1653645] Updated weights for policy 0, policy_version 615440 (0.0108) [2024-06-15 19:26:44,121][1653645] Updated weights for policy 0, policy_version 615485 (0.0018) [2024-06-15 19:26:45,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.8, 300 sec: 43764.7). Total num frames: 1260519424. Throughput: 0: 11138.9. Samples: 315212800. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:26:47,659][1653645] Updated weights for policy 0, policy_version 615552 (0.0013) [2024-06-15 19:26:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1260781568. Throughput: 0: 10888.6. Samples: 315238400. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:26:52,471][1653645] Updated weights for policy 0, policy_version 615618 (0.0014) [2024-06-15 19:26:53,278][1653645] Updated weights for policy 0, policy_version 615669 (0.0023) [2024-06-15 19:26:55,016][1653645] Updated weights for policy 0, policy_version 615698 (0.0073) [2024-06-15 19:26:55,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.8, 300 sec: 44097.9). Total num frames: 1261043712. Throughput: 0: 11002.3. Samples: 315312128. 
Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:26:55,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:26:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000615744_1261043712.pth... [2024-06-15 19:26:56,010][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000610608_1250525184.pth [2024-06-15 19:26:58,339][1653645] Updated weights for policy 0, policy_version 615776 (0.0122) [2024-06-15 19:27:00,231][1653645] Updated weights for policy 0, policy_version 615845 (0.0012) [2024-06-15 19:27:00,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1261305856. Throughput: 0: 11127.5. Samples: 315375616. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:27:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:27:04,611][1653645] Updated weights for policy 0, policy_version 615920 (0.0113) [2024-06-15 19:27:05,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 43986.8). Total num frames: 1261436928. Throughput: 0: 11298.1. Samples: 315414016. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:27:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:27:07,105][1653645] Updated weights for policy 0, policy_version 615978 (0.0013) [2024-06-15 19:27:09,864][1653645] Updated weights for policy 0, policy_version 616032 (0.0016) [2024-06-15 19:27:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45875.3, 300 sec: 43986.9). Total num frames: 1261699072. Throughput: 0: 11423.3. Samples: 315483136. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:27:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:27:10,958][1651596] Signal inference workers to stop experience collection... 
(32000 times) [2024-06-15 19:27:11,077][1653645] InferenceWorker_p0-w0: stopping experience collection (32000 times) [2024-06-15 19:27:11,280][1651596] Signal inference workers to resume experience collection... (32000 times) [2024-06-15 19:27:11,280][1653645] InferenceWorker_p0-w0: resuming experience collection (32000 times) [2024-06-15 19:27:12,510][1653645] Updated weights for policy 0, policy_version 616121 (0.0013) [2024-06-15 19:27:15,962][1648982] Fps is (10 sec: 42582.2, 60 sec: 44234.0, 300 sec: 43986.3). Total num frames: 1261862912. Throughput: 0: 11251.9. Samples: 315544576. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:27:15,962][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:27:15,982][1653645] Updated weights for policy 0, policy_version 616147 (0.0012) [2024-06-15 19:27:19,262][1653645] Updated weights for policy 0, policy_version 616224 (0.0105) [2024-06-15 19:27:20,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1262092288. Throughput: 0: 11207.1. Samples: 315579392. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:27:20,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:27:21,669][1653645] Updated weights for policy 0, policy_version 616272 (0.0033) [2024-06-15 19:27:23,503][1653645] Updated weights for policy 0, policy_version 616336 (0.0093) [2024-06-15 19:27:24,801][1653645] Updated weights for policy 0, policy_version 616380 (0.0012) [2024-06-15 19:27:25,958][1648982] Fps is (10 sec: 49170.9, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1262354432. Throughput: 0: 10979.6. Samples: 315635712. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:27:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:27:28,498][1653645] Updated weights for policy 0, policy_version 616442 (0.0014) [2024-06-15 19:27:30,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 43691.4, 300 sec: 43875.8). 
Total num frames: 1262485504. Throughput: 0: 11036.4. Samples: 315709440. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:27:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:27:32,582][1653645] Updated weights for policy 0, policy_version 616510 (0.0038) [2024-06-15 19:27:34,804][1653645] Updated weights for policy 0, policy_version 616561 (0.0012) [2024-06-15 19:27:35,586][1653645] Updated weights for policy 0, policy_version 616592 (0.0013) [2024-06-15 19:27:35,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45329.0, 300 sec: 44098.0). Total num frames: 1262780416. Throughput: 0: 11150.2. Samples: 315740160. Policy #0 lag: (min: 34.0, avg: 140.4, max: 290.0) [2024-06-15 19:27:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:27:39,387][1653645] Updated weights for policy 0, policy_version 616643 (0.0011) [2024-06-15 19:27:40,685][1653645] Updated weights for policy 0, policy_version 616702 (0.0013) [2024-06-15 19:27:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1263009792. Throughput: 0: 10922.7. Samples: 315803648. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:27:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:27:45,164][1653645] Updated weights for policy 0, policy_version 616760 (0.0015) [2024-06-15 19:27:45,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1263140864. Throughput: 0: 10968.2. Samples: 315869184. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:27:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:27:46,390][1653645] Updated weights for policy 0, policy_version 616800 (0.0013) [2024-06-15 19:27:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1263403008. Throughput: 0: 10683.7. Samples: 315894784. 
Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:27:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:27:51,122][1653645] Updated weights for policy 0, policy_version 616898 (0.0015) [2024-06-15 19:27:52,469][1653645] Updated weights for policy 0, policy_version 616960 (0.0018) [2024-06-15 19:27:55,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 41506.3, 300 sec: 43542.6). Total num frames: 1263534080. Throughput: 0: 10729.3. Samples: 315965952. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:27:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:27:58,090][1653645] Updated weights for policy 0, policy_version 617024 (0.0013) [2024-06-15 19:27:58,696][1651596] Signal inference workers to stop experience collection... (32050 times) [2024-06-15 19:27:58,821][1653645] InferenceWorker_p0-w0: stopping experience collection (32050 times) [2024-06-15 19:27:58,974][1651596] Signal inference workers to resume experience collection... (32050 times) [2024-06-15 19:27:58,974][1653645] InferenceWorker_p0-w0: resuming experience collection (32050 times) [2024-06-15 19:27:59,640][1653645] Updated weights for policy 0, policy_version 617088 (0.0015) [2024-06-15 19:28:00,863][1653645] Updated weights for policy 0, policy_version 617150 (0.0018) [2024-06-15 19:28:00,982][1648982] Fps is (10 sec: 52299.4, 60 sec: 43672.6, 300 sec: 44206.8). Total num frames: 1263927296. Throughput: 0: 10803.9. Samples: 316030976. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:00,983][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:28:05,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1264058368. Throughput: 0: 10797.6. Samples: 316065280. 
Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:28:08,279][1653645] Updated weights for policy 0, policy_version 617218 (0.0013) [2024-06-15 19:28:10,144][1653645] Updated weights for policy 0, policy_version 617300 (0.0013) [2024-06-15 19:28:10,958][1648982] Fps is (10 sec: 36134.2, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 1264287744. Throughput: 0: 11195.7. Samples: 316139520. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:10,972][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:28:12,243][1653645] Updated weights for policy 0, policy_version 617408 (0.0021) [2024-06-15 19:28:15,411][1653645] Updated weights for policy 0, policy_version 617472 (0.0022) [2024-06-15 19:28:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45332.0, 300 sec: 43986.9). Total num frames: 1264582656. Throughput: 0: 10831.6. Samples: 316196864. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:28:20,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 42052.4, 300 sec: 43320.4). Total num frames: 1264615424. Throughput: 0: 11036.5. Samples: 316236800. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:28:21,878][1653645] Updated weights for policy 0, policy_version 617542 (0.0012) [2024-06-15 19:28:22,934][1653645] Updated weights for policy 0, policy_version 617600 (0.0014) [2024-06-15 19:28:24,349][1653645] Updated weights for policy 0, policy_version 617659 (0.0012) [2024-06-15 19:28:25,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1264975872. Throughput: 0: 11025.0. Samples: 316299776. 
Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:25,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:28:27,294][1653645] Updated weights for policy 0, policy_version 617721 (0.0012) [2024-06-15 19:28:30,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1265106944. Throughput: 0: 11218.5. Samples: 316374016. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:28:33,421][1653645] Updated weights for policy 0, policy_version 617811 (0.0012) [2024-06-15 19:28:35,931][1653645] Updated weights for policy 0, policy_version 617914 (0.0106) [2024-06-15 19:28:35,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1265467392. Throughput: 0: 11343.6. Samples: 316405248. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:35,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:28:39,056][1651596] Signal inference workers to stop experience collection... (32100 times) [2024-06-15 19:28:39,098][1653645] InferenceWorker_p0-w0: stopping experience collection (32100 times) [2024-06-15 19:28:39,279][1651596] Signal inference workers to resume experience collection... (32100 times) [2024-06-15 19:28:39,280][1653645] InferenceWorker_p0-w0: resuming experience collection (32100 times) [2024-06-15 19:28:40,978][1648982] Fps is (10 sec: 52319.8, 60 sec: 43675.5, 300 sec: 43983.8). Total num frames: 1265631232. Throughput: 0: 10951.7. Samples: 316459008. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:40,979][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:28:44,327][1653645] Updated weights for policy 0, policy_version 617986 (0.0012) [2024-06-15 19:28:45,958][1648982] Fps is (10 sec: 29491.5, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1265762304. Throughput: 0: 11145.0. Samples: 316532224. 
Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:28:46,373][1653645] Updated weights for policy 0, policy_version 618072 (0.0149) [2024-06-15 19:28:48,087][1653645] Updated weights for policy 0, policy_version 618144 (0.0073) [2024-06-15 19:28:50,958][1648982] Fps is (10 sec: 42687.5, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1266057216. Throughput: 0: 10899.9. Samples: 316555776. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:28:51,449][1653645] Updated weights for policy 0, policy_version 618224 (0.0154) [2024-06-15 19:28:55,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 43690.3, 300 sec: 43653.6). Total num frames: 1266155520. Throughput: 0: 10854.3. Samples: 316627968. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:28:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:28:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000618240_1266155520.pth... [2024-06-15 19:28:56,012][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000613120_1255669760.pth [2024-06-15 19:28:57,140][1653645] Updated weights for policy 0, policy_version 618259 (0.0013) [2024-06-15 19:28:59,504][1653645] Updated weights for policy 0, policy_version 618352 (0.0013) [2024-06-15 19:29:00,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 42615.8, 300 sec: 44209.0). Total num frames: 1266483200. Throughput: 0: 10786.1. Samples: 316682240. 
Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:29:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:29:01,510][1653645] Updated weights for policy 0, policy_version 618431 (0.0090) [2024-06-15 19:29:03,951][1653645] Updated weights for policy 0, policy_version 618494 (0.0017) [2024-06-15 19:29:05,958][1648982] Fps is (10 sec: 52430.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1266679808. Throughput: 0: 10706.5. Samples: 316718592. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:29:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:29:09,873][1653645] Updated weights for policy 0, policy_version 618544 (0.0032) [2024-06-15 19:29:10,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 42598.2, 300 sec: 43653.6). Total num frames: 1266843648. Throughput: 0: 11013.7. Samples: 316795392. Policy #0 lag: (min: 31.0, avg: 145.0, max: 287.0) [2024-06-15 19:29:10,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:29:12,538][1653645] Updated weights for policy 0, policy_version 618642 (0.0119) [2024-06-15 19:29:15,229][1653645] Updated weights for policy 0, policy_version 618704 (0.0056) [2024-06-15 19:29:15,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 1267171328. Throughput: 0: 10524.4. Samples: 316847616. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:29:15,998][1653645] Updated weights for policy 0, policy_version 618743 (0.0014) [2024-06-15 19:29:20,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 1267236864. Throughput: 0: 10831.6. Samples: 316892672. 
Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:29:21,037][1653645] Updated weights for policy 0, policy_version 618784 (0.0014) [2024-06-15 19:29:23,525][1651596] Signal inference workers to stop experience collection... (32150 times) [2024-06-15 19:29:23,556][1653645] InferenceWorker_p0-w0: stopping experience collection (32150 times) [2024-06-15 19:29:23,772][1651596] Signal inference workers to resume experience collection... (32150 times) [2024-06-15 19:29:23,773][1653645] InferenceWorker_p0-w0: resuming experience collection (32150 times) [2024-06-15 19:29:23,776][1653645] Updated weights for policy 0, policy_version 618880 (0.0014) [2024-06-15 19:29:25,331][1653645] Updated weights for policy 0, policy_version 618944 (0.0014) [2024-06-15 19:29:25,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.8, 300 sec: 44209.0). Total num frames: 1267597312. Throughput: 0: 10904.9. Samples: 316949504. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:29:27,746][1653645] Updated weights for policy 0, policy_version 619001 (0.0031) [2024-06-15 19:29:30,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1267728384. Throughput: 0: 10922.7. Samples: 317023744. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:29:33,867][1653645] Updated weights for policy 0, policy_version 619056 (0.0014) [2024-06-15 19:29:35,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 42052.1, 300 sec: 43986.8). Total num frames: 1267990528. Throughput: 0: 11104.6. Samples: 317055488. 
Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:35,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:29:35,966][1653645] Updated weights for policy 0, policy_version 619140 (0.0012) [2024-06-15 19:29:38,357][1653645] Updated weights for policy 0, policy_version 619201 (0.0013) [2024-06-15 19:29:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43705.8, 300 sec: 43986.9). Total num frames: 1268252672. Throughput: 0: 10854.5. Samples: 317116416. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:29:44,577][1653645] Updated weights for policy 0, policy_version 619265 (0.0020) [2024-06-15 19:29:45,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 1268350976. Throughput: 0: 11241.3. Samples: 317188096. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:45,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:29:46,630][1653645] Updated weights for policy 0, policy_version 619345 (0.0013) [2024-06-15 19:29:48,251][1653645] Updated weights for policy 0, policy_version 619424 (0.0030) [2024-06-15 19:29:50,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1268678656. Throughput: 0: 10956.8. Samples: 317211648. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:29:51,856][1653645] Updated weights for policy 0, policy_version 619517 (0.0032) [2024-06-15 19:29:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1268776960. Throughput: 0: 10774.8. Samples: 317280256. 
Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:29:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:29:58,045][1653645] Updated weights for policy 0, policy_version 619554 (0.0014) [2024-06-15 19:29:59,936][1653645] Updated weights for policy 0, policy_version 619632 (0.0013) [2024-06-15 19:30:00,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.7, 300 sec: 44098.0). Total num frames: 1269071872. Throughput: 0: 10831.6. Samples: 317335040. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:30:01,474][1653645] Updated weights for policy 0, policy_version 619711 (0.0015) [2024-06-15 19:30:04,367][1653645] Updated weights for policy 0, policy_version 619766 (0.0013) [2024-06-15 19:30:05,966][1648982] Fps is (10 sec: 52384.3, 60 sec: 43684.4, 300 sec: 43985.6). Total num frames: 1269301248. Throughput: 0: 10624.8. Samples: 317370880. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:05,967][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:30:10,078][1651596] Signal inference workers to stop experience collection... (32200 times) [2024-06-15 19:30:10,128][1653645] InferenceWorker_p0-w0: stopping experience collection (32200 times) [2024-06-15 19:30:10,289][1651596] Signal inference workers to resume experience collection... (32200 times) [2024-06-15 19:30:10,299][1653645] InferenceWorker_p0-w0: resuming experience collection (32200 times) [2024-06-15 19:30:10,301][1653645] Updated weights for policy 0, policy_version 619808 (0.0013) [2024-06-15 19:30:10,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 42598.5, 300 sec: 43431.5). Total num frames: 1269399552. Throughput: 0: 10945.4. Samples: 317442048. 
Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:10,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:30:11,396][1653645] Updated weights for policy 0, policy_version 619856 (0.0012) [2024-06-15 19:30:12,855][1653645] Updated weights for policy 0, policy_version 619922 (0.0013) [2024-06-15 19:30:13,523][1653645] Updated weights for policy 0, policy_version 619968 (0.0013) [2024-06-15 19:30:15,678][1653645] Updated weights for policy 0, policy_version 620031 (0.0013) [2024-06-15 19:30:15,958][1648982] Fps is (10 sec: 52474.1, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 1269825536. Throughput: 0: 10729.2. Samples: 317506560. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:30:20,978][1648982] Fps is (10 sec: 42512.5, 60 sec: 43129.9, 300 sec: 43540.2). Total num frames: 1269825536. Throughput: 0: 10872.3. Samples: 317544960. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:20,979][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:30:22,657][1653645] Updated weights for policy 0, policy_version 620117 (0.0042) [2024-06-15 19:30:24,103][1653645] Updated weights for policy 0, policy_version 620179 (0.0037) [2024-06-15 19:30:24,849][1653645] Updated weights for policy 0, policy_version 620223 (0.0010) [2024-06-15 19:30:25,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1270218752. Throughput: 0: 10831.7. Samples: 317603840. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:30:27,207][1653645] Updated weights for policy 0, policy_version 620284 (0.0016) [2024-06-15 19:30:30,958][1648982] Fps is (10 sec: 52536.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1270349824. Throughput: 0: 10934.1. Samples: 317680128. 
Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:30,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:30:33,679][1653645] Updated weights for policy 0, policy_version 620352 (0.0013) [2024-06-15 19:30:35,004][1653645] Updated weights for policy 0, policy_version 620416 (0.0113) [2024-06-15 19:30:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44783.2, 300 sec: 44209.0). Total num frames: 1270677504. Throughput: 0: 11138.8. Samples: 317712896. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:30:36,286][1653645] Updated weights for policy 0, policy_version 620477 (0.0014) [2024-06-15 19:30:37,970][1653645] Updated weights for policy 0, policy_version 620528 (0.0013) [2024-06-15 19:30:40,958][1648982] Fps is (10 sec: 52426.2, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 1270874112. Throughput: 0: 11172.9. Samples: 317783040. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:30:44,215][1653645] Updated weights for policy 0, policy_version 620592 (0.0013) [2024-06-15 19:30:45,062][1653645] Updated weights for policy 0, policy_version 620627 (0.0040) [2024-06-15 19:30:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.3, 300 sec: 43875.8). Total num frames: 1271103488. Throughput: 0: 11503.0. Samples: 317852672. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:30:46,483][1651596] Signal inference workers to stop experience collection... (32250 times) [2024-06-15 19:30:46,526][1653645] InferenceWorker_p0-w0: stopping experience collection (32250 times) [2024-06-15 19:30:46,721][1651596] Signal inference workers to resume experience collection... 
(32250 times) [2024-06-15 19:30:46,722][1653645] InferenceWorker_p0-w0: resuming experience collection (32250 times) [2024-06-15 19:30:46,968][1653645] Updated weights for policy 0, policy_version 620706 (0.0023) [2024-06-15 19:30:48,956][1653645] Updated weights for policy 0, policy_version 620757 (0.0040) [2024-06-15 19:30:50,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 1271398400. Throughput: 0: 11323.1. Samples: 317880320. Policy #0 lag: (min: 15.0, avg: 163.9, max: 271.0) [2024-06-15 19:30:50,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 19:30:55,600][1653645] Updated weights for policy 0, policy_version 620832 (0.0049) [2024-06-15 19:30:55,958][1648982] Fps is (10 sec: 39319.9, 60 sec: 45328.9, 300 sec: 43875.7). Total num frames: 1271496704. Throughput: 0: 11389.1. Samples: 317954560. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:30:55,959][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 19:30:56,337][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000620864_1271529472.pth... [2024-06-15 19:30:56,509][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000615744_1261043712.pth [2024-06-15 19:30:56,514][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000620864_1271529472.pth [2024-06-15 19:30:57,463][1653645] Updated weights for policy 0, policy_version 620899 (0.0012) [2024-06-15 19:30:59,652][1653645] Updated weights for policy 0, policy_version 620990 (0.0013) [2024-06-15 19:31:00,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 44098.0). Total num frames: 1271824384. Throughput: 0: 11138.9. Samples: 318007808. 
Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:31:01,997][1653645] Updated weights for policy 0, policy_version 621052 (0.0014) [2024-06-15 19:31:05,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 43696.8, 300 sec: 43986.9). Total num frames: 1271922688. Throughput: 0: 11007.3. Samples: 318040064. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:31:08,996][1653645] Updated weights for policy 0, policy_version 621104 (0.0014) [2024-06-15 19:31:10,863][1653645] Updated weights for policy 0, policy_version 621168 (0.0011) [2024-06-15 19:31:10,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 45875.4, 300 sec: 43875.8). Total num frames: 1272152064. Throughput: 0: 11229.9. Samples: 318109184. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:31:13,298][1653645] Updated weights for policy 0, policy_version 621251 (0.0014) [2024-06-15 19:31:14,697][1653645] Updated weights for policy 0, policy_version 621306 (0.0012) [2024-06-15 19:31:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1272446976. Throughput: 0: 10672.3. Samples: 318160384. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:31:20,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44251.9, 300 sec: 43431.5). Total num frames: 1272479744. Throughput: 0: 10774.7. Samples: 318197760. 
Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:31:22,501][1653645] Updated weights for policy 0, policy_version 621392 (0.0134) [2024-06-15 19:31:24,271][1653645] Updated weights for policy 0, policy_version 621458 (0.0107) [2024-06-15 19:31:25,727][1653645] Updated weights for policy 0, policy_version 621507 (0.0024) [2024-06-15 19:31:25,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.8, 300 sec: 44098.1). Total num frames: 1272872960. Throughput: 0: 10558.7. Samples: 318258176. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:31:30,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 1272971264. Throughput: 0: 10592.7. Samples: 318329344. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:31:32,504][1653645] Updated weights for policy 0, policy_version 621573 (0.0013) [2024-06-15 19:31:33,180][1651596] Signal inference workers to stop experience collection... (32300 times) [2024-06-15 19:31:33,252][1653645] InferenceWorker_p0-w0: stopping experience collection (32300 times) [2024-06-15 19:31:33,399][1651596] Signal inference workers to resume experience collection... (32300 times) [2024-06-15 19:31:33,400][1653645] InferenceWorker_p0-w0: resuming experience collection (32300 times) [2024-06-15 19:31:33,601][1653645] Updated weights for policy 0, policy_version 621624 (0.0014) [2024-06-15 19:31:35,197][1653645] Updated weights for policy 0, policy_version 621680 (0.0012) [2024-06-15 19:31:35,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 42598.4, 300 sec: 43875.8). Total num frames: 1273233408. Throughput: 0: 10854.4. Samples: 318368768. 
Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:31:36,868][1653645] Updated weights for policy 0, policy_version 621744 (0.0013) [2024-06-15 19:31:38,578][1653645] Updated weights for policy 0, policy_version 621815 (0.0013) [2024-06-15 19:31:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1273495552. Throughput: 0: 10353.9. Samples: 318420480. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:31:45,471][1653645] Updated weights for policy 0, policy_version 621872 (0.0015) [2024-06-15 19:31:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 41506.1, 300 sec: 43431.5). Total num frames: 1273593856. Throughput: 0: 10922.7. Samples: 318499328. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:31:47,023][1653645] Updated weights for policy 0, policy_version 621923 (0.0106) [2024-06-15 19:31:48,775][1653645] Updated weights for policy 0, policy_version 621987 (0.0017) [2024-06-15 19:31:49,994][1653645] Updated weights for policy 0, policy_version 622048 (0.0012) [2024-06-15 19:31:50,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1274019840. Throughput: 0: 10695.1. Samples: 318521344. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:50,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:31:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42052.5, 300 sec: 43098.2). Total num frames: 1274019840. Throughput: 0: 10808.9. Samples: 318595584. 
Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:31:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:31:56,874][1653645] Updated weights for policy 0, policy_version 622112 (0.0110) [2024-06-15 19:31:58,535][1653645] Updated weights for policy 0, policy_version 622147 (0.0012) [2024-06-15 19:32:00,750][1653645] Updated weights for policy 0, policy_version 622242 (0.0014) [2024-06-15 19:32:00,958][1648982] Fps is (10 sec: 36045.8, 60 sec: 42598.4, 300 sec: 43875.8). Total num frames: 1274380288. Throughput: 0: 10968.2. Samples: 318653952. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:32:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:32:02,463][1653645] Updated weights for policy 0, policy_version 622322 (0.0134) [2024-06-15 19:32:05,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1274544128. Throughput: 0: 10820.3. Samples: 318684672. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:32:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:32:09,131][1653645] Updated weights for policy 0, policy_version 622369 (0.0047) [2024-06-15 19:32:10,959][1648982] Fps is (10 sec: 32765.0, 60 sec: 42597.8, 300 sec: 43543.0). Total num frames: 1274707968. Throughput: 0: 11150.0. Samples: 318759936. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:32:10,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:32:11,515][1653645] Updated weights for policy 0, policy_version 622452 (0.0087) [2024-06-15 19:32:12,498][1651596] Signal inference workers to stop experience collection... (32350 times) [2024-06-15 19:32:12,558][1653645] InferenceWorker_p0-w0: stopping experience collection (32350 times) [2024-06-15 19:32:12,752][1651596] Signal inference workers to resume experience collection... 
(32350 times) [2024-06-15 19:32:12,753][1653645] InferenceWorker_p0-w0: resuming experience collection (32350 times) [2024-06-15 19:32:13,191][1653645] Updated weights for policy 0, policy_version 622533 (0.0039) [2024-06-15 19:32:14,133][1653645] Updated weights for policy 0, policy_version 622586 (0.0012) [2024-06-15 19:32:15,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1275068416. Throughput: 0: 10899.9. Samples: 318819840. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:32:16,015][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:32:20,958][1648982] Fps is (10 sec: 45879.4, 60 sec: 44783.0, 300 sec: 43431.5). Total num frames: 1275166720. Throughput: 0: 11002.3. Samples: 318863872. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:32:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:32:21,124][1653645] Updated weights for policy 0, policy_version 622649 (0.0017) [2024-06-15 19:32:22,649][1653645] Updated weights for policy 0, policy_version 622688 (0.0107) [2024-06-15 19:32:24,248][1653645] Updated weights for policy 0, policy_version 622768 (0.0013) [2024-06-15 19:32:25,420][1653645] Updated weights for policy 0, policy_version 622818 (0.0111) [2024-06-15 19:32:25,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 1275592704. Throughput: 0: 11275.3. Samples: 318927872. Policy #0 lag: (min: 8.0, avg: 68.0, max: 264.0) [2024-06-15 19:32:25,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:32:30,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 43542.6). Total num frames: 1275625472. Throughput: 0: 11286.7. Samples: 319007232. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:32:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:32:31,851][1653645] Updated weights for policy 0, policy_version 622902 (0.0012) [2024-06-15 19:32:33,964][1653645] Updated weights for policy 0, policy_version 622960 (0.0018) [2024-06-15 19:32:34,982][1653645] Updated weights for policy 0, policy_version 623008 (0.0012) [2024-06-15 19:32:35,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1275985920. Throughput: 0: 11514.4. Samples: 319039488. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:32:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:32:36,987][1653645] Updated weights for policy 0, policy_version 623103 (0.0084) [2024-06-15 19:32:40,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1276116992. Throughput: 0: 11264.0. Samples: 319102464. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:32:40,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:32:43,356][1653645] Updated weights for policy 0, policy_version 623154 (0.0114) [2024-06-15 19:32:44,736][1653645] Updated weights for policy 0, policy_version 623200 (0.0013) [2024-06-15 19:32:45,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 46967.4, 300 sec: 44097.9). Total num frames: 1276411904. Throughput: 0: 11559.8. Samples: 319174144. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:32:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:32:47,549][1653645] Updated weights for policy 0, policy_version 623316 (0.0275) [2024-06-15 19:32:50,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1276641280. Throughput: 0: 11320.9. Samples: 319194112. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:32:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:32:54,850][1653645] Updated weights for policy 0, policy_version 623378 (0.0016) [2024-06-15 19:32:55,831][1651596] Signal inference workers to stop experience collection... (32400 times) [2024-06-15 19:32:55,885][1653645] InferenceWorker_p0-w0: stopping experience collection (32400 times) [2024-06-15 19:32:55,907][1653645] Updated weights for policy 0, policy_version 623427 (0.0013) [2024-06-15 19:32:55,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45875.2, 300 sec: 43546.2). Total num frames: 1276772352. Throughput: 0: 11537.3. Samples: 319279104. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:32:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:32:56,069][1651596] Signal inference workers to resume experience collection... (32400 times) [2024-06-15 19:32:56,070][1653645] InferenceWorker_p0-w0: resuming experience collection (32400 times) [2024-06-15 19:32:56,453][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000623456_1276837888.pth... [2024-06-15 19:32:56,602][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000618240_1266155520.pth [2024-06-15 19:32:57,903][1653645] Updated weights for policy 0, policy_version 623525 (0.0123) [2024-06-15 19:32:59,979][1653645] Updated weights for policy 0, policy_version 623611 (0.0013) [2024-06-15 19:33:00,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 1277165568. Throughput: 0: 11423.3. Samples: 319333888. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:33:05,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 1277198336. Throughput: 0: 11264.0. Samples: 319370752. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:33:06,652][1653645] Updated weights for policy 0, policy_version 623669 (0.0015) [2024-06-15 19:33:08,840][1653645] Updated weights for policy 0, policy_version 623760 (0.0015) [2024-06-15 19:33:10,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 48060.4, 300 sec: 44097.9). Total num frames: 1277591552. Throughput: 0: 11309.5. Samples: 319436800. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:33:11,621][1653645] Updated weights for policy 0, policy_version 623857 (0.0104) [2024-06-15 19:33:15,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 1277689856. Throughput: 0: 11013.7. Samples: 319502848. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:15,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:33:18,654][1653645] Updated weights for policy 0, policy_version 623920 (0.0012) [2024-06-15 19:33:20,289][1653645] Updated weights for policy 0, policy_version 624000 (0.0015) [2024-06-15 19:33:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 46967.4, 300 sec: 44098.0). Total num frames: 1277984768. Throughput: 0: 11184.3. Samples: 319542784. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:33:22,575][1653645] Updated weights for policy 0, policy_version 624080 (0.0079) [2024-06-15 19:33:23,681][1653645] Updated weights for policy 0, policy_version 624122 (0.0015) [2024-06-15 19:33:25,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1278214144. Throughput: 0: 10979.5. Samples: 319596544. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:33:30,302][1653645] Updated weights for policy 0, policy_version 624176 (0.0012) [2024-06-15 19:33:30,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 45328.9, 300 sec: 43653.6). Total num frames: 1278345216. Throughput: 0: 11059.1. Samples: 319671808. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:33:31,342][1653645] Updated weights for policy 0, policy_version 624209 (0.0013) [2024-06-15 19:33:33,257][1653645] Updated weights for policy 0, policy_version 624288 (0.0013) [2024-06-15 19:33:35,144][1653645] Updated weights for policy 0, policy_version 624341 (0.0023) [2024-06-15 19:33:35,445][1651596] Signal inference workers to stop experience collection... (32450 times) [2024-06-15 19:33:35,526][1653645] InferenceWorker_p0-w0: stopping experience collection (32450 times) [2024-06-15 19:33:35,715][1651596] Signal inference workers to resume experience collection... (32450 times) [2024-06-15 19:33:35,716][1653645] InferenceWorker_p0-w0: resuming experience collection (32450 times) [2024-06-15 19:33:35,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45328.8, 300 sec: 44323.2). Total num frames: 1278705664. Throughput: 0: 11286.7. Samples: 319702016. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:33:40,796][1653645] Updated weights for policy 0, policy_version 624400 (0.0012) [2024-06-15 19:33:40,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 1278771200. Throughput: 0: 11104.7. Samples: 319778816. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:33:43,157][1653645] Updated weights for policy 0, policy_version 624470 (0.0016) [2024-06-15 19:33:44,682][1653645] Updated weights for policy 0, policy_version 624531 (0.0014) [2024-06-15 19:33:45,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1279131648. Throughput: 0: 11184.4. Samples: 319837184. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:33:46,964][1653645] Updated weights for policy 0, policy_version 624592 (0.0012) [2024-06-15 19:33:47,957][1653645] Updated weights for policy 0, policy_version 624629 (0.0012) [2024-06-15 19:33:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1279262720. Throughput: 0: 11093.3. Samples: 319869952. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:33:54,216][1653645] Updated weights for policy 0, policy_version 624698 (0.0014) [2024-06-15 19:33:55,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1279459328. Throughput: 0: 11138.9. Samples: 319938048. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:33:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:33:56,157][1653645] Updated weights for policy 0, policy_version 624754 (0.0013) [2024-06-15 19:33:57,684][1653645] Updated weights for policy 0, policy_version 624823 (0.0014) [2024-06-15 19:33:58,955][1653645] Updated weights for policy 0, policy_version 624864 (0.0011) [2024-06-15 19:34:00,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1279787008. Throughput: 0: 11036.5. Samples: 319999488. 
Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:34:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:34:05,546][1653645] Updated weights for policy 0, policy_version 624916 (0.0033) [2024-06-15 19:34:05,974][1648982] Fps is (10 sec: 39256.5, 60 sec: 44224.6, 300 sec: 44095.5). Total num frames: 1279852544. Throughput: 0: 10975.5. Samples: 320036864. Policy #0 lag: (min: 15.0, avg: 90.8, max: 271.0) [2024-06-15 19:34:05,975][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:34:07,172][1653645] Updated weights for policy 0, policy_version 624976 (0.0015) [2024-06-15 19:34:08,627][1653645] Updated weights for policy 0, policy_version 625040 (0.0012) [2024-06-15 19:34:10,596][1653645] Updated weights for policy 0, policy_version 625105 (0.0012) [2024-06-15 19:34:10,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1280245760. Throughput: 0: 11127.5. Samples: 320097280. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:34:11,583][1653645] Updated weights for policy 0, policy_version 625149 (0.0021) [2024-06-15 19:34:15,958][1648982] Fps is (10 sec: 45951.2, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 1280311296. Throughput: 0: 10968.3. Samples: 320165376. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:34:17,926][1653645] Updated weights for policy 0, policy_version 625196 (0.0013) [2024-06-15 19:34:19,486][1653645] Updated weights for policy 0, policy_version 625242 (0.0015) [2024-06-15 19:34:20,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1280606208. Throughput: 0: 11082.0. Samples: 320200704. 
Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:34:21,509][1653645] Updated weights for policy 0, policy_version 625328 (0.0077) [2024-06-15 19:34:21,641][1651596] Signal inference workers to stop experience collection... (32500 times) [2024-06-15 19:34:21,681][1653645] InferenceWorker_p0-w0: stopping experience collection (32500 times) [2024-06-15 19:34:21,910][1651596] Signal inference workers to resume experience collection... (32500 times) [2024-06-15 19:34:21,911][1653645] InferenceWorker_p0-w0: resuming experience collection (32500 times) [2024-06-15 19:34:22,801][1653645] Updated weights for policy 0, policy_version 625376 (0.0011) [2024-06-15 19:34:25,958][1648982] Fps is (10 sec: 52426.7, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 1280835584. Throughput: 0: 10706.4. Samples: 320260608. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:34:29,214][1653645] Updated weights for policy 0, policy_version 625414 (0.0011) [2024-06-15 19:34:30,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1280966656. Throughput: 0: 10922.7. Samples: 320328704. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:34:31,385][1653645] Updated weights for policy 0, policy_version 625488 (0.0013) [2024-06-15 19:34:33,270][1653645] Updated weights for policy 0, policy_version 625555 (0.0012) [2024-06-15 19:34:35,274][1653645] Updated weights for policy 0, policy_version 625648 (0.0012) [2024-06-15 19:34:35,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1281359872. Throughput: 0: 10899.9. Samples: 320360448. 
Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:35,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:34:40,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 1281359872. Throughput: 0: 10774.7. Samples: 320422912. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:34:42,679][1653645] Updated weights for policy 0, policy_version 625712 (0.0012) [2024-06-15 19:34:44,051][1653645] Updated weights for policy 0, policy_version 625749 (0.0011) [2024-06-15 19:34:45,478][1653645] Updated weights for policy 0, policy_version 625809 (0.0024) [2024-06-15 19:34:45,957][1648982] Fps is (10 sec: 32768.9, 60 sec: 42598.5, 300 sec: 44098.0). Total num frames: 1281687552. Throughput: 0: 10865.8. Samples: 320488448. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:34:47,865][1653645] Updated weights for policy 0, policy_version 625904 (0.0013) [2024-06-15 19:34:50,996][1648982] Fps is (10 sec: 52230.1, 60 sec: 43662.9, 300 sec: 44425.5). Total num frames: 1281884160. Throughput: 0: 10553.5. Samples: 320512000. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:50,998][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:34:54,374][1653645] Updated weights for policy 0, policy_version 625953 (0.0042) [2024-06-15 19:34:55,958][1648982] Fps is (10 sec: 32765.7, 60 sec: 42597.9, 300 sec: 43875.7). Total num frames: 1282015232. Throughput: 0: 10865.6. Samples: 320586240. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:34:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:34:56,508][1653645] Updated weights for policy 0, policy_version 626004 (0.0013) [2024-06-15 19:34:56,716][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000626016_1282080768.pth... 
[2024-06-15 19:34:56,825][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000620864_1271529472.pth [2024-06-15 19:34:58,644][1653645] Updated weights for policy 0, policy_version 626083 (0.0016) [2024-06-15 19:35:00,307][1653645] Updated weights for policy 0, policy_version 626160 (0.0012) [2024-06-15 19:35:00,958][1648982] Fps is (10 sec: 52627.8, 60 sec: 43690.5, 300 sec: 44432.4). Total num frames: 1282408448. Throughput: 0: 10626.8. Samples: 320643584. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:35:05,689][1653645] Updated weights for policy 0, policy_version 626196 (0.0022) [2024-06-15 19:35:05,958][1648982] Fps is (10 sec: 45878.3, 60 sec: 43702.7, 300 sec: 44320.2). Total num frames: 1282473984. Throughput: 0: 10774.8. Samples: 320685568. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:35:07,978][1651596] Signal inference workers to stop experience collection... (32550 times) [2024-06-15 19:35:08,079][1653645] InferenceWorker_p0-w0: stopping experience collection (32550 times) [2024-06-15 19:35:08,090][1653645] Updated weights for policy 0, policy_version 626264 (0.0013) [2024-06-15 19:35:08,197][1651596] Signal inference workers to resume experience collection... (32550 times) [2024-06-15 19:35:08,198][1653645] InferenceWorker_p0-w0: resuming experience collection (32550 times) [2024-06-15 19:35:10,439][1653645] Updated weights for policy 0, policy_version 626352 (0.0139) [2024-06-15 19:35:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 42598.1, 300 sec: 43986.8). Total num frames: 1282801664. Throughput: 0: 10922.7. Samples: 320752128. 
Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:10,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:35:12,300][1653645] Updated weights for policy 0, policy_version 626428 (0.0014) [2024-06-15 19:35:15,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 44434.3). Total num frames: 1282932736. Throughput: 0: 10865.8. Samples: 320817664. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:35:18,223][1653645] Updated weights for policy 0, policy_version 626489 (0.0019) [2024-06-15 19:35:20,800][1653645] Updated weights for policy 0, policy_version 626546 (0.0015) [2024-06-15 19:35:20,958][1648982] Fps is (10 sec: 36046.1, 60 sec: 42598.4, 300 sec: 43875.8). Total num frames: 1283162112. Throughput: 0: 10888.6. Samples: 320850432. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:35:22,663][1653645] Updated weights for policy 0, policy_version 626625 (0.0012) [2024-06-15 19:35:25,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1283457024. Throughput: 0: 10740.6. Samples: 320906240. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:35:29,792][1653645] Updated weights for policy 0, policy_version 626690 (0.0022) [2024-06-15 19:35:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 43653.6). Total num frames: 1283555328. Throughput: 0: 10865.7. Samples: 320977408. 
Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:35:31,969][1653645] Updated weights for policy 0, policy_version 626754 (0.0013) [2024-06-15 19:35:33,605][1653645] Updated weights for policy 0, policy_version 626818 (0.0013) [2024-06-15 19:35:35,429][1653645] Updated weights for policy 0, policy_version 626896 (0.0011) [2024-06-15 19:35:35,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 1283915776. Throughput: 0: 11000.2. Samples: 321006592. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:35,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:35:40,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 1283981312. Throughput: 0: 10763.5. Samples: 321070592. Policy #0 lag: (min: 47.0, avg: 110.0, max: 303.0) [2024-06-15 19:35:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:35:42,087][1653645] Updated weights for policy 0, policy_version 626947 (0.0013) [2024-06-15 19:35:43,267][1653645] Updated weights for policy 0, policy_version 627004 (0.0015) [2024-06-15 19:35:44,848][1653645] Updated weights for policy 0, policy_version 627067 (0.0014) [2024-06-15 19:35:45,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 43144.4, 300 sec: 43653.6). Total num frames: 1284276224. Throughput: 0: 11025.1. Samples: 321139712. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:35:45,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:35:46,634][1653645] Updated weights for policy 0, policy_version 627120 (0.0115) [2024-06-15 19:35:47,615][1653645] Updated weights for policy 0, policy_version 627158 (0.0012) [2024-06-15 19:35:50,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43718.4, 300 sec: 44098.0). Total num frames: 1284505600. Throughput: 0: 10717.8. Samples: 321167872. 
Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:35:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:35:52,497][1651596] Signal inference workers to stop experience collection... (32600 times) [2024-06-15 19:35:52,572][1653645] InferenceWorker_p0-w0: stopping experience collection (32600 times) [2024-06-15 19:35:52,743][1651596] Signal inference workers to resume experience collection... (32600 times) [2024-06-15 19:35:52,744][1653645] InferenceWorker_p0-w0: resuming experience collection (32600 times) [2024-06-15 19:35:52,746][1653645] Updated weights for policy 0, policy_version 627216 (0.0021) [2024-06-15 19:35:55,042][1653645] Updated weights for policy 0, policy_version 627269 (0.0012) [2024-06-15 19:35:55,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44783.4, 300 sec: 43653.6). Total num frames: 1284702208. Throughput: 0: 10991.0. Samples: 321246720. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:35:55,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:35:56,370][1653645] Updated weights for policy 0, policy_version 627326 (0.0012) [2024-06-15 19:35:58,268][1653645] Updated weights for policy 0, policy_version 627391 (0.0012) [2024-06-15 19:35:59,814][1653645] Updated weights for policy 0, policy_version 627446 (0.0043) [2024-06-15 19:36:00,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1285029888. Throughput: 0: 10911.3. Samples: 321308672. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:36:04,427][1653645] Updated weights for policy 0, policy_version 627504 (0.0014) [2024-06-15 19:36:05,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 44782.7, 300 sec: 44097.9). Total num frames: 1285160960. Throughput: 0: 11241.2. Samples: 321356288. 
Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:36:06,134][1653645] Updated weights for policy 0, policy_version 627536 (0.0014) [2024-06-15 19:36:08,534][1653645] Updated weights for policy 0, policy_version 627616 (0.0058) [2024-06-15 19:36:10,408][1653645] Updated weights for policy 0, policy_version 627696 (0.0020) [2024-06-15 19:36:10,966][1648982] Fps is (10 sec: 52384.1, 60 sec: 45868.9, 300 sec: 44429.9). Total num frames: 1285554176. Throughput: 0: 11341.5. Samples: 321416704. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:10,967][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:36:15,433][1653645] Updated weights for policy 0, policy_version 627768 (0.0018) [2024-06-15 19:36:15,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1285685248. Throughput: 0: 11548.5. Samples: 321497088. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:36:17,920][1653645] Updated weights for policy 0, policy_version 627824 (0.0011) [2024-06-15 19:36:20,388][1653645] Updated weights for policy 0, policy_version 627908 (0.0013) [2024-06-15 19:36:20,958][1648982] Fps is (10 sec: 45913.0, 60 sec: 47513.4, 300 sec: 44542.2). Total num frames: 1286012928. Throughput: 0: 11662.2. Samples: 321531392. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:36:21,637][1653645] Updated weights for policy 0, policy_version 627961 (0.0012) [2024-06-15 19:36:25,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 1286111232. Throughput: 0: 11616.8. Samples: 321593344. 
Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:36:26,951][1653645] Updated weights for policy 0, policy_version 628022 (0.0017) [2024-06-15 19:36:29,832][1653645] Updated weights for policy 0, policy_version 628064 (0.0016) [2024-06-15 19:36:30,963][1648982] Fps is (10 sec: 32751.5, 60 sec: 46417.2, 300 sec: 44430.4). Total num frames: 1286340608. Throughput: 0: 11808.7. Samples: 321671168. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:30,964][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:36:31,840][1653645] Updated weights for policy 0, policy_version 628144 (0.0017) [2024-06-15 19:36:32,815][1651596] Signal inference workers to stop experience collection... (32650 times) [2024-06-15 19:36:32,855][1653645] InferenceWorker_p0-w0: stopping experience collection (32650 times) [2024-06-15 19:36:33,059][1651596] Signal inference workers to resume experience collection... (32650 times) [2024-06-15 19:36:33,061][1653645] InferenceWorker_p0-w0: resuming experience collection (32650 times) [2024-06-15 19:36:33,336][1653645] Updated weights for policy 0, policy_version 628208 (0.0012) [2024-06-15 19:36:35,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 1286602752. Throughput: 0: 11673.6. Samples: 321693184. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:36:37,862][1653645] Updated weights for policy 0, policy_version 628256 (0.0014) [2024-06-15 19:36:40,958][1648982] Fps is (10 sec: 39341.9, 60 sec: 45875.1, 300 sec: 44542.2). Total num frames: 1286733824. Throughput: 0: 11662.2. Samples: 321771520. 
Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:36:42,644][1653645] Updated weights for policy 0, policy_version 628358 (0.0014) [2024-06-15 19:36:44,376][1653645] Updated weights for policy 0, policy_version 628432 (0.0012) [2024-06-15 19:36:45,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 47513.7, 300 sec: 44431.2). Total num frames: 1287127040. Throughput: 0: 11377.8. Samples: 321820672. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:36:49,649][1653645] Updated weights for policy 0, policy_version 628496 (0.0013) [2024-06-15 19:36:50,816][1653645] Updated weights for policy 0, policy_version 628544 (0.0012) [2024-06-15 19:36:50,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1287258112. Throughput: 0: 11252.7. Samples: 321862656. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:50,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 19:36:55,330][1653645] Updated weights for policy 0, policy_version 628624 (0.0012) [2024-06-15 19:36:55,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 45875.1, 300 sec: 44320.1). Total num frames: 1287454720. Throughput: 0: 11436.8. Samples: 321931264. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:36:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:36:56,519][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000628672_1287520256.pth... [2024-06-15 19:36:56,682][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000623456_1276837888.pth [2024-06-15 19:36:57,763][1653645] Updated weights for policy 0, policy_version 628722 (0.0067) [2024-06-15 19:37:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1287651328. Throughput: 0: 10956.8. 
Samples: 321990144. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:37:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:37:01,975][1653645] Updated weights for policy 0, policy_version 628755 (0.0046) [2024-06-15 19:37:05,959][1648982] Fps is (10 sec: 32764.1, 60 sec: 43689.9, 300 sec: 44320.1). Total num frames: 1287782400. Throughput: 0: 10956.6. Samples: 322024448. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:37:05,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:37:06,228][1653645] Updated weights for policy 0, policy_version 628819 (0.0013) [2024-06-15 19:37:07,908][1653645] Updated weights for policy 0, policy_version 628900 (0.0109) [2024-06-15 19:37:09,889][1653645] Updated weights for policy 0, policy_version 628983 (0.0013) [2024-06-15 19:37:10,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43697.0, 300 sec: 44431.2). Total num frames: 1288175616. Throughput: 0: 11036.4. Samples: 322089984. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:37:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:37:14,555][1653645] Updated weights for policy 0, policy_version 629047 (0.0012) [2024-06-15 19:37:15,958][1648982] Fps is (10 sec: 52435.3, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1288306688. Throughput: 0: 10855.7. Samples: 322159616. Policy #0 lag: (min: 15.0, avg: 82.9, max: 271.0) [2024-06-15 19:37:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:37:18,176][1653645] Updated weights for policy 0, policy_version 629091 (0.0013) [2024-06-15 19:37:18,499][1651596] Signal inference workers to stop experience collection... (32700 times) [2024-06-15 19:37:18,538][1653645] InferenceWorker_p0-w0: stopping experience collection (32700 times) [2024-06-15 19:37:18,750][1651596] Signal inference workers to resume experience collection... 
(32700 times) [2024-06-15 19:37:18,750][1653645] InferenceWorker_p0-w0: resuming experience collection (32700 times) [2024-06-15 19:37:20,073][1653645] Updated weights for policy 0, policy_version 629171 (0.0071) [2024-06-15 19:37:20,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43144.8, 300 sec: 44098.0). Total num frames: 1288601600. Throughput: 0: 11150.2. Samples: 322194944. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:37:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:37:21,617][1653645] Updated weights for policy 0, policy_version 629237 (0.0111) [2024-06-15 19:37:25,910][1653645] Updated weights for policy 0, policy_version 629268 (0.0025) [2024-06-15 19:37:25,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1288732672. Throughput: 0: 10752.0. Samples: 322255360. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:37:25,958][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 19:37:26,717][1653645] Updated weights for policy 0, policy_version 629309 (0.0129) [2024-06-15 19:37:30,780][1653645] Updated weights for policy 0, policy_version 629376 (0.0012) [2024-06-15 19:37:30,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43694.6, 300 sec: 43986.9). Total num frames: 1288962048. Throughput: 0: 11150.2. Samples: 322322432. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:37:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:37:32,903][1653645] Updated weights for policy 0, policy_version 629461 (0.0014) [2024-06-15 19:37:35,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1289224192. Throughput: 0: 10717.9. Samples: 322344960. 
Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:37:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:37:37,296][1653645] Updated weights for policy 0, policy_version 629509 (0.0013) [2024-06-15 19:37:38,660][1653645] Updated weights for policy 0, policy_version 629568 (0.0025) [2024-06-15 19:37:40,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1289355264. Throughput: 0: 10774.7. Samples: 322416128. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:37:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:37:42,576][1653645] Updated weights for policy 0, policy_version 629636 (0.0012) [2024-06-15 19:37:44,555][1653645] Updated weights for policy 0, policy_version 629728 (0.0011) [2024-06-15 19:37:45,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1289748480. Throughput: 0: 10877.2. Samples: 322479616. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:37:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:37:50,025][1653645] Updated weights for policy 0, policy_version 629793 (0.0013) [2024-06-15 19:37:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1289879552. Throughput: 0: 10968.4. Samples: 322518016. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:37:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:37:53,354][1653645] Updated weights for policy 0, policy_version 629840 (0.0012) [2024-06-15 19:37:55,333][1653645] Updated weights for policy 0, policy_version 629936 (0.0014) [2024-06-15 19:37:55,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 1290141696. Throughput: 0: 11047.8. Samples: 322587136. 
Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:37:55,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 19:37:56,833][1653645] Updated weights for policy 0, policy_version 630007 (0.0011) [2024-06-15 19:38:00,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1290272768. Throughput: 0: 10922.7. Samples: 322651136. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:38:01,267][1651596] Signal inference workers to stop experience collection... (32750 times) [2024-06-15 19:38:01,349][1653645] InferenceWorker_p0-w0: stopping experience collection (32750 times) [2024-06-15 19:38:01,435][1651596] Signal inference workers to resume experience collection... (32750 times) [2024-06-15 19:38:01,437][1653645] InferenceWorker_p0-w0: resuming experience collection (32750 times) [2024-06-15 19:38:01,970][1653645] Updated weights for policy 0, policy_version 630052 (0.0014) [2024-06-15 19:38:05,210][1653645] Updated weights for policy 0, policy_version 630096 (0.0012) [2024-06-15 19:38:05,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 45329.8, 300 sec: 43764.7). Total num frames: 1290502144. Throughput: 0: 10956.7. Samples: 322688000. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:05,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:38:06,524][1653645] Updated weights for policy 0, policy_version 630160 (0.0014) [2024-06-15 19:38:08,356][1653645] Updated weights for policy 0, policy_version 630242 (0.0013) [2024-06-15 19:38:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1290797056. Throughput: 0: 11013.7. Samples: 322750976. 
Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:38:13,311][1653645] Updated weights for policy 0, policy_version 630306 (0.0112) [2024-06-15 19:38:15,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 1290928128. Throughput: 0: 11150.2. Samples: 322824192. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:15,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:38:16,248][1653645] Updated weights for policy 0, policy_version 630338 (0.0011) [2024-06-15 19:38:17,269][1653645] Updated weights for policy 0, policy_version 630400 (0.0013) [2024-06-15 19:38:18,208][1653645] Updated weights for policy 0, policy_version 630448 (0.0012) [2024-06-15 19:38:19,264][1653645] Updated weights for policy 0, policy_version 630496 (0.0027) [2024-06-15 19:38:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1291321344. Throughput: 0: 11468.8. Samples: 322861056. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:38:24,404][1653645] Updated weights for policy 0, policy_version 630564 (0.0039) [2024-06-15 19:38:25,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 1291452416. Throughput: 0: 11298.1. Samples: 322924544. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:38:27,608][1653645] Updated weights for policy 0, policy_version 630594 (0.0013) [2024-06-15 19:38:28,550][1653645] Updated weights for policy 0, policy_version 630643 (0.0015) [2024-06-15 19:38:29,654][1653645] Updated weights for policy 0, policy_version 630689 (0.0012) [2024-06-15 19:38:30,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 46421.3, 300 sec: 44209.1). Total num frames: 1291747328. 
Throughput: 0: 11548.4. Samples: 322999296. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:38:31,238][1653645] Updated weights for policy 0, policy_version 630752 (0.0014) [2024-06-15 19:38:34,843][1653645] Updated weights for policy 0, policy_version 630803 (0.0017) [2024-06-15 19:38:35,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1291976704. Throughput: 0: 11400.6. Samples: 323031040. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:38:39,845][1653645] Updated weights for policy 0, policy_version 630880 (0.0014) [2024-06-15 19:38:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 45875.4, 300 sec: 43986.9). Total num frames: 1292107776. Throughput: 0: 11537.1. Samples: 323106304. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:38:41,631][1653645] Updated weights for policy 0, policy_version 630944 (0.0011) [2024-06-15 19:38:41,743][1651596] Signal inference workers to stop experience collection... (32800 times) [2024-06-15 19:38:41,823][1653645] InferenceWorker_p0-w0: stopping experience collection (32800 times) [2024-06-15 19:38:42,101][1651596] Signal inference workers to resume experience collection... (32800 times) [2024-06-15 19:38:42,102][1653645] InferenceWorker_p0-w0: resuming experience collection (32800 times) [2024-06-15 19:38:43,344][1653645] Updated weights for policy 0, policy_version 631008 (0.0013) [2024-06-15 19:38:45,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1292402688. Throughput: 0: 11377.8. Samples: 323163136. 
Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:38:46,139][1653645] Updated weights for policy 0, policy_version 631062 (0.0014) [2024-06-15 19:38:46,928][1653645] Updated weights for policy 0, policy_version 631102 (0.0013) [2024-06-15 19:38:50,997][1648982] Fps is (10 sec: 39167.9, 60 sec: 43662.3, 300 sec: 44203.1). Total num frames: 1292500992. Throughput: 0: 11299.7. Samples: 323196928. Policy #0 lag: (min: 23.0, avg: 100.0, max: 279.0) [2024-06-15 19:38:50,998][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:38:52,755][1653645] Updated weights for policy 0, policy_version 631168 (0.0012) [2024-06-15 19:38:53,933][1653645] Updated weights for policy 0, policy_version 631217 (0.0012) [2024-06-15 19:38:55,480][1653645] Updated weights for policy 0, policy_version 631287 (0.0013) [2024-06-15 19:38:55,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1292894208. Throughput: 0: 11457.4. Samples: 323266560. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:38:55,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:38:55,980][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000631296_1292894208.pth... [2024-06-15 19:38:56,056][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000626016_1282080768.pth [2024-06-15 19:38:58,620][1653645] Updated weights for policy 0, policy_version 631344 (0.0013) [2024-06-15 19:39:00,958][1648982] Fps is (10 sec: 52634.5, 60 sec: 45875.0, 300 sec: 44655.8). Total num frames: 1293025280. Throughput: 0: 11229.9. Samples: 323329536. 
Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:39:04,655][1653645] Updated weights for policy 0, policy_version 631424 (0.0013) [2024-06-15 19:39:05,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 45875.4, 300 sec: 44098.0). Total num frames: 1293254656. Throughput: 0: 11366.4. Samples: 323372544. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:39:06,533][1653645] Updated weights for policy 0, policy_version 631505 (0.0012) [2024-06-15 19:39:09,592][1653645] Updated weights for policy 0, policy_version 631553 (0.0156) [2024-06-15 19:39:10,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45328.9, 300 sec: 44764.4). Total num frames: 1293516800. Throughput: 0: 11173.0. Samples: 323427328. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:10,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:39:11,052][1653645] Updated weights for policy 0, policy_version 631612 (0.0014) [2024-06-15 19:39:15,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44783.1, 300 sec: 44098.0). Total num frames: 1293615104. Throughput: 0: 11138.8. Samples: 323500544. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:39:16,386][1653645] Updated weights for policy 0, policy_version 631667 (0.0013) [2024-06-15 19:39:17,785][1653645] Updated weights for policy 0, policy_version 631733 (0.0095) [2024-06-15 19:39:19,627][1653645] Updated weights for policy 0, policy_version 631798 (0.0012) [2024-06-15 19:39:20,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.4, 300 sec: 44431.2). Total num frames: 1293942784. Throughput: 0: 11036.4. Samples: 323527680. 
Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:39:22,257][1653645] Updated weights for policy 0, policy_version 631843 (0.0082) [2024-06-15 19:39:25,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1294073856. Throughput: 0: 10752.0. Samples: 323590144. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:39:27,335][1651596] Signal inference workers to stop experience collection... (32850 times) [2024-06-15 19:39:27,401][1653645] InferenceWorker_p0-w0: stopping experience collection (32850 times) [2024-06-15 19:39:27,486][1651596] Signal inference workers to resume experience collection... (32850 times) [2024-06-15 19:39:27,486][1653645] InferenceWorker_p0-w0: resuming experience collection (32850 times) [2024-06-15 19:39:27,938][1653645] Updated weights for policy 0, policy_version 631905 (0.0011) [2024-06-15 19:39:30,117][1653645] Updated weights for policy 0, policy_version 631988 (0.0080) [2024-06-15 19:39:30,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1294368768. Throughput: 0: 10990.9. Samples: 323657728. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:39:31,988][1653645] Updated weights for policy 0, policy_version 632052 (0.0196) [2024-06-15 19:39:34,814][1653645] Updated weights for policy 0, policy_version 632096 (0.0142) [2024-06-15 19:39:35,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 1294598144. Throughput: 0: 10886.7. Samples: 323686400. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:39:40,958][1648982] Fps is (10 sec: 32766.9, 60 sec: 43144.3, 300 sec: 44097.9). 
Total num frames: 1294696448. Throughput: 0: 10911.2. Samples: 323757568. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:39:41,073][1653645] Updated weights for policy 0, policy_version 632192 (0.0012) [2024-06-15 19:39:42,840][1653645] Updated weights for policy 0, policy_version 632256 (0.0088) [2024-06-15 19:39:44,293][1653645] Updated weights for policy 0, policy_version 632320 (0.0013) [2024-06-15 19:39:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 44436.9). Total num frames: 1294991360. Throughput: 0: 10558.6. Samples: 323804672. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:45,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 19:39:48,361][1653645] Updated weights for policy 0, policy_version 632380 (0.0010) [2024-06-15 19:39:50,958][1648982] Fps is (10 sec: 42599.8, 60 sec: 43719.2, 300 sec: 44431.3). Total num frames: 1295122432. Throughput: 0: 10387.9. Samples: 323840000. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:39:53,108][1653645] Updated weights for policy 0, policy_version 632437 (0.0012) [2024-06-15 19:39:54,542][1653645] Updated weights for policy 0, policy_version 632483 (0.0011) [2024-06-15 19:39:55,966][1648982] Fps is (10 sec: 45836.3, 60 sec: 42592.4, 300 sec: 44207.8). Total num frames: 1295450112. Throughput: 0: 10704.5. Samples: 323909120. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:39:55,967][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:39:56,188][1653645] Updated weights for policy 0, policy_version 632553 (0.0131) [2024-06-15 19:39:59,455][1653645] Updated weights for policy 0, policy_version 632612 (0.0020) [2024-06-15 19:40:00,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 44653.3). Total num frames: 1295646720. Throughput: 0: 10467.6. 
Samples: 323971584. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:40:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:40:05,076][1653645] Updated weights for policy 0, policy_version 632672 (0.0014) [2024-06-15 19:40:05,958][1648982] Fps is (10 sec: 32794.9, 60 sec: 42052.0, 300 sec: 43986.9). Total num frames: 1295777792. Throughput: 0: 10695.1. Samples: 324008960. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:40:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:40:06,112][1653645] Updated weights for policy 0, policy_version 632720 (0.0011) [2024-06-15 19:40:07,559][1651596] Signal inference workers to stop experience collection... (32900 times) [2024-06-15 19:40:07,606][1653645] InferenceWorker_p0-w0: stopping experience collection (32900 times) [2024-06-15 19:40:07,865][1651596] Signal inference workers to resume experience collection... (32900 times) [2024-06-15 19:40:07,878][1653645] InferenceWorker_p0-w0: resuming experience collection (32900 times) [2024-06-15 19:40:07,880][1653645] Updated weights for policy 0, policy_version 632800 (0.0033) [2024-06-15 19:40:08,694][1653645] Updated weights for policy 0, policy_version 632832 (0.0015) [2024-06-15 19:40:10,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 44431.2). Total num frames: 1296039936. Throughput: 0: 10626.8. Samples: 324068352. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:40:10,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:40:11,896][1653645] Updated weights for policy 0, policy_version 632880 (0.0098) [2024-06-15 19:40:15,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 42598.4, 300 sec: 44098.0). Total num frames: 1296171008. Throughput: 0: 10740.6. Samples: 324141056. 
Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:40:15,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 19:40:17,965][1653645] Updated weights for policy 0, policy_version 632963 (0.0016) [2024-06-15 19:40:19,488][1653645] Updated weights for policy 0, policy_version 633033 (0.0013) [2024-06-15 19:40:20,550][1653645] Updated weights for policy 0, policy_version 633088 (0.0013) [2024-06-15 19:40:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1296564224. Throughput: 0: 10774.7. Samples: 324171264. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:40:20,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:40:23,564][1653645] Updated weights for policy 0, policy_version 633152 (0.0013) [2024-06-15 19:40:25,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1296695296. Throughput: 0: 10581.4. Samples: 324233728. Policy #0 lag: (min: 111.0, avg: 171.2, max: 351.0) [2024-06-15 19:40:25,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 19:40:29,937][1653645] Updated weights for policy 0, policy_version 633232 (0.0014) [2024-06-15 19:40:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 44209.1). Total num frames: 1296957440. Throughput: 0: 11150.2. Samples: 324306432. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:40:30,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:40:31,460][1653645] Updated weights for policy 0, policy_version 633301 (0.0013) [2024-06-15 19:40:34,505][1653645] Updated weights for policy 0, policy_version 633360 (0.0019) [2024-06-15 19:40:35,555][1653645] Updated weights for policy 0, policy_version 633408 (0.0013) [2024-06-15 19:40:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1297219584. Throughput: 0: 10979.6. Samples: 324334080. 
Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:40:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:40:40,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 43144.6, 300 sec: 44097.9). Total num frames: 1297285120. Throughput: 0: 11163.7. Samples: 324411392. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:40:40,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:40:42,136][1653645] Updated weights for policy 0, policy_version 633490 (0.0013) [2024-06-15 19:40:44,337][1653645] Updated weights for policy 0, policy_version 633599 (0.0014) [2024-06-15 19:40:45,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1297612800. Throughput: 0: 11025.0. Samples: 324467712. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:40:45,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:40:47,247][1653645] Updated weights for policy 0, policy_version 633648 (0.0012) [2024-06-15 19:40:50,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.5, 300 sec: 44209.0). Total num frames: 1297743872. Throughput: 0: 10979.6. Samples: 324503040. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:40:50,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:40:52,110][1653645] Updated weights for policy 0, policy_version 633668 (0.0012) [2024-06-15 19:40:52,990][1651596] Signal inference workers to stop experience collection... (32950 times) [2024-06-15 19:40:53,065][1653645] InferenceWorker_p0-w0: stopping experience collection (32950 times) [2024-06-15 19:40:53,227][1651596] Signal inference workers to resume experience collection... 
(32950 times) [2024-06-15 19:40:53,228][1653645] InferenceWorker_p0-w0: resuming experience collection (32950 times) [2024-06-15 19:40:53,605][1653645] Updated weights for policy 0, policy_version 633744 (0.0016) [2024-06-15 19:40:55,713][1653645] Updated weights for policy 0, policy_version 633840 (0.0082) [2024-06-15 19:40:55,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44242.8, 300 sec: 44320.1). Total num frames: 1298104320. Throughput: 0: 11195.7. Samples: 324572160. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:40:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:40:55,996][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000633856_1298137088.pth... [2024-06-15 19:40:56,038][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000628672_1287520256.pth [2024-06-15 19:40:59,104][1653645] Updated weights for policy 0, policy_version 633904 (0.0011) [2024-06-15 19:41:00,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1298268160. Throughput: 0: 10945.4. Samples: 324633600. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:41:05,341][1653645] Updated weights for policy 0, policy_version 633976 (0.0085) [2024-06-15 19:41:05,958][1648982] Fps is (10 sec: 32768.6, 60 sec: 44236.9, 300 sec: 43654.9). Total num frames: 1298432000. Throughput: 0: 11070.5. Samples: 324669440. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:05,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:41:07,460][1653645] Updated weights for policy 0, policy_version 634053 (0.0012) [2024-06-15 19:41:08,824][1653645] Updated weights for policy 0, policy_version 634112 (0.0017) [2024-06-15 19:41:10,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1298694144. Throughput: 0: 10911.3. 
Samples: 324724736. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:41:11,731][1653645] Updated weights for policy 0, policy_version 634169 (0.0012) [2024-06-15 19:41:15,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 43690.7, 300 sec: 43320.5). Total num frames: 1298792448. Throughput: 0: 10922.7. Samples: 324797952. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:41:17,543][1653645] Updated weights for policy 0, policy_version 634226 (0.0118) [2024-06-15 19:41:19,375][1653645] Updated weights for policy 0, policy_version 634290 (0.0077) [2024-06-15 19:41:20,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 44209.0). Total num frames: 1299152896. Throughput: 0: 10990.9. Samples: 324828672. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:41:21,198][1653645] Updated weights for policy 0, policy_version 634368 (0.0014) [2024-06-15 19:41:25,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.6, 300 sec: 43987.7). Total num frames: 1299316736. Throughput: 0: 10626.9. Samples: 324889600. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:25,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:41:28,293][1653645] Updated weights for policy 0, policy_version 634433 (0.0012) [2024-06-15 19:41:29,572][1653645] Updated weights for policy 0, policy_version 634486 (0.0015) [2024-06-15 19:41:30,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 1299513344. Throughput: 0: 10922.7. Samples: 324959232. 
Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:30,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 19:41:31,631][1653645] Updated weights for policy 0, policy_version 634564 (0.0013) [2024-06-15 19:41:33,042][1653645] Updated weights for policy 0, policy_version 634624 (0.0023) [2024-06-15 19:41:34,610][1651596] Signal inference workers to stop experience collection... (33000 times) [2024-06-15 19:41:34,648][1653645] InferenceWorker_p0-w0: stopping experience collection (33000 times) [2024-06-15 19:41:34,855][1651596] Signal inference workers to resume experience collection... (33000 times) [2024-06-15 19:41:34,855][1653645] InferenceWorker_p0-w0: resuming experience collection (33000 times) [2024-06-15 19:41:35,646][1653645] Updated weights for policy 0, policy_version 634683 (0.0014) [2024-06-15 19:41:35,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1299841024. Throughput: 0: 10729.3. Samples: 324985856. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:35,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 19:41:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.7, 300 sec: 43209.3). Total num frames: 1299873792. Throughput: 0: 10695.2. Samples: 325053440. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:41:41,384][1653645] Updated weights for policy 0, policy_version 634724 (0.0014) [2024-06-15 19:41:43,739][1653645] Updated weights for policy 0, policy_version 634786 (0.0013) [2024-06-15 19:41:45,420][1653645] Updated weights for policy 0, policy_version 634864 (0.0012) [2024-06-15 19:41:45,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1300234240. Throughput: 0: 10638.2. Samples: 325112320. 
Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:41:46,245][1653645] Updated weights for policy 0, policy_version 634883 (0.0011) [2024-06-15 19:41:50,967][1648982] Fps is (10 sec: 49108.5, 60 sec: 43684.4, 300 sec: 43763.4). Total num frames: 1300365312. Throughput: 0: 10499.6. Samples: 325142016. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:50,967][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 19:41:52,842][1653645] Updated weights for policy 0, policy_version 634948 (0.0013) [2024-06-15 19:41:54,270][1653645] Updated weights for policy 0, policy_version 635008 (0.0015) [2024-06-15 19:41:55,958][1648982] Fps is (10 sec: 29490.8, 60 sec: 40414.0, 300 sec: 43653.6). Total num frames: 1300529152. Throughput: 0: 10854.4. Samples: 325213184. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:41:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:41:56,990][1653645] Updated weights for policy 0, policy_version 635076 (0.0022) [2024-06-15 19:42:00,213][1653645] Updated weights for policy 0, policy_version 635190 (0.0098) [2024-06-15 19:42:00,958][1648982] Fps is (10 sec: 52475.5, 60 sec: 43690.6, 300 sec: 44431.4). Total num frames: 1300889600. Throughput: 0: 10376.5. Samples: 325264896. Policy #0 lag: (min: 48.0, avg: 97.5, max: 272.0) [2024-06-15 19:42:00,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:42:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 43209.3). Total num frames: 1300922368. Throughput: 0: 10513.1. Samples: 325301760. 
Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:42:06,790][1653645] Updated weights for policy 0, policy_version 635255 (0.0013) [2024-06-15 19:42:08,374][1653645] Updated weights for policy 0, policy_version 635300 (0.0013) [2024-06-15 19:42:09,286][1653645] Updated weights for policy 0, policy_version 635344 (0.0157) [2024-06-15 19:42:10,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43144.3, 300 sec: 43986.8). Total num frames: 1301282816. Throughput: 0: 10649.6. Samples: 325368832. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:10,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:42:11,491][1653645] Updated weights for policy 0, policy_version 635411 (0.0079) [2024-06-15 19:42:15,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 1301413888. Throughput: 0: 10604.1. Samples: 325436416. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:42:17,377][1653645] Updated weights for policy 0, policy_version 635460 (0.0021) [2024-06-15 19:42:18,575][1653645] Updated weights for policy 0, policy_version 635519 (0.0012) [2024-06-15 19:42:20,067][1653645] Updated weights for policy 0, policy_version 635583 (0.0016) [2024-06-15 19:42:20,960][1648982] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 1301708800. Throughput: 0: 10797.6. Samples: 325471744. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:20,960][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 19:42:21,219][1651596] Signal inference workers to stop experience collection... (33050 times) [2024-06-15 19:42:21,267][1653645] InferenceWorker_p0-w0: stopping experience collection (33050 times) [2024-06-15 19:42:21,574][1651596] Signal inference workers to resume experience collection... 
(33050 times) [2024-06-15 19:42:21,574][1653645] InferenceWorker_p0-w0: resuming experience collection (33050 times) [2024-06-15 19:42:21,943][1653645] Updated weights for policy 0, policy_version 635648 (0.0027) [2024-06-15 19:42:24,194][1653645] Updated weights for policy 0, policy_version 635705 (0.0094) [2024-06-15 19:42:25,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1301938176. Throughput: 0: 10513.1. Samples: 325526528. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:25,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:42:30,418][1653645] Updated weights for policy 0, policy_version 635773 (0.0014) [2024-06-15 19:42:30,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.4, 300 sec: 43542.6). Total num frames: 1302069248. Throughput: 0: 10774.7. Samples: 325597184. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:42:33,701][1653645] Updated weights for policy 0, policy_version 635856 (0.0014) [2024-06-15 19:42:34,799][1653645] Updated weights for policy 0, policy_version 635892 (0.0012) [2024-06-15 19:42:35,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 42598.5, 300 sec: 44209.1). Total num frames: 1302396928. Throughput: 0: 10776.9. Samples: 325626880. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:42:36,521][1653645] Updated weights for policy 0, policy_version 635957 (0.0021) [2024-06-15 19:42:40,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 1302462464. Throughput: 0: 10524.4. Samples: 325686784. 
Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:40,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:42:42,960][1653645] Updated weights for policy 0, policy_version 636000 (0.0013) [2024-06-15 19:42:43,719][1653645] Updated weights for policy 0, policy_version 636030 (0.0012) [2024-06-15 19:42:45,958][1648982] Fps is (10 sec: 29491.5, 60 sec: 40959.9, 300 sec: 43431.5). Total num frames: 1302691840. Throughput: 0: 10808.9. Samples: 325751296. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:42:46,184][1653645] Updated weights for policy 0, policy_version 636100 (0.0015) [2024-06-15 19:42:47,513][1653645] Updated weights for policy 0, policy_version 636159 (0.0011) [2024-06-15 19:42:49,685][1653645] Updated weights for policy 0, policy_version 636208 (0.0013) [2024-06-15 19:42:50,962][1648982] Fps is (10 sec: 52405.8, 60 sec: 43693.9, 300 sec: 43541.9). Total num frames: 1302986752. Throughput: 0: 10614.4. Samples: 325779456. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:50,963][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:42:55,953][1653645] Updated weights for policy 0, policy_version 636272 (0.0014) [2024-06-15 19:42:55,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 43431.5). Total num frames: 1303085056. Throughput: 0: 10672.4. Samples: 325849088. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:42:55,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 19:42:56,148][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000636288_1303117824.pth... 
[2024-06-15 19:42:56,194][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000631296_1292894208.pth [2024-06-15 19:42:58,122][1653645] Updated weights for policy 0, policy_version 636336 (0.0011) [2024-06-15 19:42:59,681][1653645] Updated weights for policy 0, policy_version 636400 (0.0025) [2024-06-15 19:43:00,958][1648982] Fps is (10 sec: 42617.5, 60 sec: 42052.3, 300 sec: 43764.8). Total num frames: 1303412736. Throughput: 0: 10490.3. Samples: 325908480. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:00,960][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 19:43:01,028][1653645] Updated weights for policy 0, policy_version 636436 (0.0013) [2024-06-15 19:43:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 43098.2). Total num frames: 1303511040. Throughput: 0: 10410.7. Samples: 325940224. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:05,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:43:07,441][1653645] Updated weights for policy 0, policy_version 636512 (0.0024) [2024-06-15 19:43:08,246][1653645] Updated weights for policy 0, policy_version 636543 (0.0012) [2024-06-15 19:43:09,975][1651596] Signal inference workers to stop experience collection... (33100 times) [2024-06-15 19:43:10,051][1653645] InferenceWorker_p0-w0: stopping experience collection (33100 times) [2024-06-15 19:43:10,317][1651596] Signal inference workers to resume experience collection... (33100 times) [2024-06-15 19:43:10,318][1653645] InferenceWorker_p0-w0: resuming experience collection (33100 times) [2024-06-15 19:43:10,537][1653645] Updated weights for policy 0, policy_version 636610 (0.0175) [2024-06-15 19:43:10,957][1648982] Fps is (10 sec: 39322.1, 60 sec: 42052.5, 300 sec: 43653.7). Total num frames: 1303805952. Throughput: 0: 10695.1. Samples: 326007808. 
Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:10,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:43:14,525][1653645] Updated weights for policy 0, policy_version 636704 (0.0013) [2024-06-15 19:43:15,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 1304035328. Throughput: 0: 10296.8. Samples: 326060544. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:15,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:43:19,689][1653645] Updated weights for policy 0, policy_version 636752 (0.0014) [2024-06-15 19:43:20,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 40413.9, 300 sec: 42987.2). Total num frames: 1304133632. Throughput: 0: 10547.2. Samples: 326101504. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:43:20,980][1653645] Updated weights for policy 0, policy_version 636800 (0.0013) [2024-06-15 19:43:23,333][1653645] Updated weights for policy 0, policy_version 636882 (0.0014) [2024-06-15 19:43:24,292][1653645] Updated weights for policy 0, policy_version 636928 (0.0085) [2024-06-15 19:43:25,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 41506.1, 300 sec: 42987.2). Total num frames: 1304428544. Throughput: 0: 10490.3. Samples: 326158848. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:43:27,557][1653645] Updated weights for policy 0, policy_version 636984 (0.0012) [2024-06-15 19:43:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 42654.0). Total num frames: 1304559616. Throughput: 0: 10649.6. Samples: 326230528. 
Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:43:32,115][1653645] Updated weights for policy 0, policy_version 637040 (0.0022) [2024-06-15 19:43:33,980][1653645] Updated weights for policy 0, policy_version 637072 (0.0146) [2024-06-15 19:43:35,941][1653645] Updated weights for policy 0, policy_version 637168 (0.0119) [2024-06-15 19:43:35,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 1304920064. Throughput: 0: 10912.4. Samples: 326270464. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:35,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:43:39,478][1653645] Updated weights for policy 0, policy_version 637244 (0.0071) [2024-06-15 19:43:40,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 1305083904. Throughput: 0: 10467.6. Samples: 326320128. Policy #0 lag: (min: 9.0, avg: 89.7, max: 265.0) [2024-06-15 19:43:40,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 19:43:44,117][1653645] Updated weights for policy 0, policy_version 637308 (0.0014) [2024-06-15 19:43:45,958][1648982] Fps is (10 sec: 29490.4, 60 sec: 42052.0, 300 sec: 43103.9). Total num frames: 1305214976. Throughput: 0: 10683.6. Samples: 326389248. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:43:45,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:43:47,355][1653645] Updated weights for policy 0, policy_version 637351 (0.0011) [2024-06-15 19:43:49,308][1653645] Updated weights for policy 0, policy_version 637440 (0.0030) [2024-06-15 19:43:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 41509.2, 300 sec: 42653.9). Total num frames: 1305477120. Throughput: 0: 10661.0. Samples: 326419968. 
Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:43:50,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 19:43:52,500][1653645] Updated weights for policy 0, policy_version 637500 (0.0013) [2024-06-15 19:43:55,929][1653645] Updated weights for policy 0, policy_version 637562 (0.0011) [2024-06-15 19:43:55,960][1648982] Fps is (10 sec: 49153.4, 60 sec: 43690.7, 300 sec: 42987.2). Total num frames: 1305706496. Throughput: 0: 10558.5. Samples: 326482944. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:43:55,960][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:43:59,015][1651596] Signal inference workers to stop experience collection... (33150 times) [2024-06-15 19:43:59,149][1653645] InferenceWorker_p0-w0: stopping experience collection (33150 times) [2024-06-15 19:43:59,378][1651596] Signal inference workers to resume experience collection... (33150 times) [2024-06-15 19:43:59,379][1653645] InferenceWorker_p0-w0: resuming experience collection (33150 times) [2024-06-15 19:44:00,628][1653645] Updated weights for policy 0, policy_version 637648 (0.0012) [2024-06-15 19:44:00,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 42876.1). Total num frames: 1305903104. Throughput: 0: 10843.1. Samples: 326548480. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 19:44:01,669][1653645] Updated weights for policy 0, policy_version 637696 (0.0013) [2024-06-15 19:44:04,655][1653645] Updated weights for policy 0, policy_version 637756 (0.0040) [2024-06-15 19:44:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1306132480. Throughput: 0: 10683.7. Samples: 326582272. 
Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:44:07,457][1653645] Updated weights for policy 0, policy_version 637797 (0.0012) [2024-06-15 19:44:10,842][1653645] Updated weights for policy 0, policy_version 637857 (0.0054) [2024-06-15 19:44:10,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42052.1, 300 sec: 43098.2). Total num frames: 1306329088. Throughput: 0: 11070.6. Samples: 326657024. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:44:12,862][1653645] Updated weights for policy 0, policy_version 637942 (0.0013) [2024-06-15 19:44:15,370][1653645] Updated weights for policy 0, policy_version 638000 (0.0013) [2024-06-15 19:44:15,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43691.0, 300 sec: 43098.3). Total num frames: 1306656768. Throughput: 0: 10843.0. Samples: 326718464. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:44:18,697][1653645] Updated weights for policy 0, policy_version 638053 (0.0013) [2024-06-15 19:44:20,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 43098.2). Total num frames: 1306787840. Throughput: 0: 10729.2. Samples: 326753280. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:44:22,466][1653645] Updated weights for policy 0, policy_version 638129 (0.0017) [2024-06-15 19:44:23,802][1653645] Updated weights for policy 0, policy_version 638176 (0.0015) [2024-06-15 19:44:24,678][1653645] Updated weights for policy 0, policy_version 638208 (0.0020) [2024-06-15 19:44:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 43098.3). Total num frames: 1307082752. Throughput: 0: 11059.2. Samples: 326817792. 
Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:44:27,054][1653645] Updated weights for policy 0, policy_version 638270 (0.0013) [2024-06-15 19:44:30,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 45875.0, 300 sec: 43098.2). Total num frames: 1307312128. Throughput: 0: 11093.4. Samples: 326888448. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:44:33,780][1653645] Updated weights for policy 0, policy_version 638343 (0.0015) [2024-06-15 19:44:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 43320.5). Total num frames: 1307475968. Throughput: 0: 11332.3. Samples: 326929920. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:44:36,570][1653645] Updated weights for policy 0, policy_version 638448 (0.0235) [2024-06-15 19:44:39,053][1653645] Updated weights for policy 0, policy_version 638504 (0.0012) [2024-06-15 19:44:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 1307705344. Throughput: 0: 11081.9. Samples: 326981632. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:44:41,720][1651596] Signal inference workers to stop experience collection... (33200 times) [2024-06-15 19:44:41,759][1653645] InferenceWorker_p0-w0: stopping experience collection (33200 times) [2024-06-15 19:44:41,987][1651596] Signal inference workers to resume experience collection... (33200 times) [2024-06-15 19:44:41,990][1653645] InferenceWorker_p0-w0: resuming experience collection (33200 times) [2024-06-15 19:44:42,199][1653645] Updated weights for policy 0, policy_version 638590 (0.0014) [2024-06-15 19:44:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43691.0, 300 sec: 43098.3). 
Total num frames: 1307836416. Throughput: 0: 11218.5. Samples: 327053312. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:44:48,046][1653645] Updated weights for policy 0, policy_version 638642 (0.0011) [2024-06-15 19:44:49,879][1653645] Updated weights for policy 0, policy_version 638720 (0.0014) [2024-06-15 19:44:50,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 45329.0, 300 sec: 43210.6). Total num frames: 1308196864. Throughput: 0: 11264.0. Samples: 327089152. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:44:52,714][1653645] Updated weights for policy 0, policy_version 638789 (0.0012) [2024-06-15 19:44:53,720][1653645] Updated weights for policy 0, policy_version 638845 (0.0012) [2024-06-15 19:44:55,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44236.8, 300 sec: 43098.2). Total num frames: 1308360704. Throughput: 0: 10922.7. Samples: 327148544. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:44:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:44:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000638848_1308360704.pth... [2024-06-15 19:44:56,046][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000633856_1298137088.pth [2024-06-15 19:45:00,503][1653645] Updated weights for policy 0, policy_version 638914 (0.0013) [2024-06-15 19:45:00,958][1648982] Fps is (10 sec: 32768.7, 60 sec: 43690.8, 300 sec: 43209.4). Total num frames: 1308524544. Throughput: 0: 11116.1. Samples: 327218688. 
Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:45:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:45:02,439][1653645] Updated weights for policy 0, policy_version 639024 (0.0094) [2024-06-15 19:45:05,856][1653645] Updated weights for policy 0, policy_version 639101 (0.0013) [2024-06-15 19:45:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 43542.6). Total num frames: 1308884992. Throughput: 0: 10991.0. Samples: 327247872. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:45:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:45:10,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43144.6, 300 sec: 43209.3). Total num frames: 1308917760. Throughput: 0: 11161.6. Samples: 327320064. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:45:10,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 19:45:12,129][1653645] Updated weights for policy 0, policy_version 639176 (0.0014) [2024-06-15 19:45:13,469][1653645] Updated weights for policy 0, policy_version 639248 (0.0212) [2024-06-15 19:45:15,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 43098.3). Total num frames: 1309278208. Throughput: 0: 10956.9. Samples: 327381504. Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:45:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:45:16,285][1653645] Updated weights for policy 0, policy_version 639297 (0.0016) [2024-06-15 19:45:17,651][1653645] Updated weights for policy 0, policy_version 639356 (0.0031) [2024-06-15 19:45:20,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 1309409280. Throughput: 0: 10820.3. Samples: 327416832. 
Policy #0 lag: (min: 8.0, avg: 100.6, max: 264.0) [2024-06-15 19:45:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:45:23,335][1653645] Updated weights for policy 0, policy_version 639424 (0.0012) [2024-06-15 19:45:24,409][1651596] Signal inference workers to stop experience collection... (33250 times) [2024-06-15 19:45:24,442][1653645] InferenceWorker_p0-w0: stopping experience collection (33250 times) [2024-06-15 19:45:24,680][1651596] Signal inference workers to resume experience collection... (33250 times) [2024-06-15 19:45:24,681][1653645] InferenceWorker_p0-w0: resuming experience collection (33250 times) [2024-06-15 19:45:25,214][1653645] Updated weights for policy 0, policy_version 639507 (0.0014) [2024-06-15 19:45:25,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 44782.9, 300 sec: 43431.5). Total num frames: 1309769728. Throughput: 0: 11241.3. Samples: 327487488. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:45:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:45:26,161][1653645] Updated weights for policy 0, policy_version 639550 (0.0014) [2024-06-15 19:45:28,527][1653645] Updated weights for policy 0, policy_version 639608 (0.0012) [2024-06-15 19:45:30,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 1309933568. Throughput: 0: 11195.6. Samples: 327557120. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:45:30,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:45:34,340][1653645] Updated weights for policy 0, policy_version 639664 (0.0014) [2024-06-15 19:45:35,959][1648982] Fps is (10 sec: 39317.8, 60 sec: 44782.1, 300 sec: 43653.5). Total num frames: 1310162944. Throughput: 0: 11263.8. Samples: 327596032. 
Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:45:35,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 19:45:37,274][1653645] Updated weights for policy 0, policy_version 639795 (0.0013) [2024-06-15 19:45:39,739][1653645] Updated weights for policy 0, policy_version 639840 (0.0011) [2024-06-15 19:45:40,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 45875.3, 300 sec: 43542.6). Total num frames: 1310457856. Throughput: 0: 11184.4. Samples: 327651840. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:45:40,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 19:45:45,878][1653645] Updated weights for policy 0, policy_version 639891 (0.0016) [2024-06-15 19:45:45,958][1648982] Fps is (10 sec: 32771.3, 60 sec: 44236.7, 300 sec: 43209.4). Total num frames: 1310490624. Throughput: 0: 11275.3. Samples: 327726080. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:45:45,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:45:48,176][1653645] Updated weights for policy 0, policy_version 639984 (0.0119) [2024-06-15 19:45:49,916][1653645] Updated weights for policy 0, policy_version 640048 (0.0012) [2024-06-15 19:45:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.9, 300 sec: 43209.4). Total num frames: 1310851072. Throughput: 0: 11127.5. Samples: 327748608. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:45:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:45:52,604][1653645] Updated weights for policy 0, policy_version 640116 (0.0013) [2024-06-15 19:45:55,959][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 1310982144. Throughput: 0: 10877.1. Samples: 327809536. 
Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:45:55,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:45:58,399][1653645] Updated weights for policy 0, policy_version 640147 (0.0013) [2024-06-15 19:45:59,403][1653645] Updated weights for policy 0, policy_version 640189 (0.0027) [2024-06-15 19:46:00,788][1653645] Updated weights for policy 0, policy_version 640241 (0.0013) [2024-06-15 19:46:00,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 44782.7, 300 sec: 43320.4). Total num frames: 1311211520. Throughput: 0: 11138.8. Samples: 327882752. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:46:03,774][1653645] Updated weights for policy 0, policy_version 640326 (0.0013) [2024-06-15 19:46:05,184][1653645] Updated weights for policy 0, policy_version 640382 (0.0012) [2024-06-15 19:46:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 1311506432. Throughput: 0: 10877.1. Samples: 327906304. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:46:10,886][1651596] Signal inference workers to stop experience collection... (33300 times) [2024-06-15 19:46:10,938][1653645] InferenceWorker_p0-w0: stopping experience collection (33300 times) [2024-06-15 19:46:10,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 43690.4, 300 sec: 43209.3). Total num frames: 1311539200. Throughput: 0: 10843.0. Samples: 327975424. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:10,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:46:11,229][1651596] Signal inference workers to resume experience collection... 
(33300 times) [2024-06-15 19:46:11,230][1653645] InferenceWorker_p0-w0: resuming experience collection (33300 times) [2024-06-15 19:46:12,199][1653645] Updated weights for policy 0, policy_version 640448 (0.0123) [2024-06-15 19:46:14,710][1653645] Updated weights for policy 0, policy_version 640544 (0.0013) [2024-06-15 19:46:15,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 1311899648. Throughput: 0: 10490.4. Samples: 328029184. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:15,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:46:17,001][1653645] Updated weights for policy 0, policy_version 640609 (0.0083) [2024-06-15 19:46:20,958][1648982] Fps is (10 sec: 49153.7, 60 sec: 43690.6, 300 sec: 43098.3). Total num frames: 1312030720. Throughput: 0: 10297.1. Samples: 328059392. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:46:23,411][1653645] Updated weights for policy 0, policy_version 640657 (0.0013) [2024-06-15 19:46:25,031][1653645] Updated weights for policy 0, policy_version 640723 (0.0012) [2024-06-15 19:46:25,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 41506.1, 300 sec: 43209.3). Total num frames: 1312260096. Throughput: 0: 10763.4. Samples: 328136192. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:25,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 19:46:26,859][1653645] Updated weights for policy 0, policy_version 640800 (0.0149) [2024-06-15 19:46:28,613][1653645] Updated weights for policy 0, policy_version 640850 (0.0015) [2024-06-15 19:46:29,462][1653645] Updated weights for policy 0, policy_version 640896 (0.0012) [2024-06-15 19:46:30,958][1648982] Fps is (10 sec: 52426.5, 60 sec: 43690.5, 300 sec: 43098.2). Total num frames: 1312555008. Throughput: 0: 10330.9. Samples: 328190976. 
Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:30,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:46:35,541][1653645] Updated weights for policy 0, policy_version 640944 (0.0011) [2024-06-15 19:46:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 42053.0, 300 sec: 43431.5). Total num frames: 1312686080. Throughput: 0: 10683.7. Samples: 328229376. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:46:36,913][1653645] Updated weights for policy 0, policy_version 640978 (0.0013) [2024-06-15 19:46:38,643][1653645] Updated weights for policy 0, policy_version 641060 (0.0014) [2024-06-15 19:46:39,690][1653645] Updated weights for policy 0, policy_version 641104 (0.0016) [2024-06-15 19:46:40,548][1653645] Updated weights for policy 0, policy_version 641152 (0.0013) [2024-06-15 19:46:40,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 1313079296. Throughput: 0: 10820.2. Samples: 328296448. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:46:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 43099.5). Total num frames: 1313079296. Throughput: 0: 10831.7. Samples: 328370176. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:46:47,043][1653645] Updated weights for policy 0, policy_version 641205 (0.0012) [2024-06-15 19:46:48,456][1653645] Updated weights for policy 0, policy_version 641254 (0.0012) [2024-06-15 19:46:49,681][1653645] Updated weights for policy 0, policy_version 641315 (0.0013) [2024-06-15 19:46:50,472][1651596] Signal inference workers to stop experience collection... 
(33350 times) [2024-06-15 19:46:50,524][1653645] InferenceWorker_p0-w0: stopping experience collection (33350 times) [2024-06-15 19:46:50,805][1651596] Signal inference workers to resume experience collection... (33350 times) [2024-06-15 19:46:50,806][1653645] InferenceWorker_p0-w0: resuming experience collection (33350 times) [2024-06-15 19:46:50,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1313505280. Throughput: 0: 11138.9. Samples: 328407552. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:46:51,186][1653645] Updated weights for policy 0, policy_version 641376 (0.0014) [2024-06-15 19:46:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 1313603584. Throughput: 0: 11036.5. Samples: 328472064. Policy #0 lag: (min: 13.0, avg: 79.2, max: 269.0) [2024-06-15 19:46:55,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:46:55,967][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000641408_1313603584.pth... [2024-06-15 19:46:56,035][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000636288_1303117824.pth [2024-06-15 19:46:56,807][1653645] Updated weights for policy 0, policy_version 641417 (0.0013) [2024-06-15 19:46:57,978][1653645] Updated weights for policy 0, policy_version 641466 (0.0011) [2024-06-15 19:47:00,188][1653645] Updated weights for policy 0, policy_version 641529 (0.0023) [2024-06-15 19:47:00,957][1648982] Fps is (10 sec: 39322.1, 60 sec: 44783.1, 300 sec: 43986.9). Total num frames: 1313898496. Throughput: 0: 11525.7. Samples: 328547840. 
Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:47:01,679][1653645] Updated weights for policy 0, policy_version 641593 (0.0014) [2024-06-15 19:47:03,234][1653645] Updated weights for policy 0, policy_version 641650 (0.0013) [2024-06-15 19:47:05,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1314127872. Throughput: 0: 11480.2. Samples: 328576000. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:47:08,621][1653645] Updated weights for policy 0, policy_version 641696 (0.0012) [2024-06-15 19:47:10,859][1653645] Updated weights for policy 0, policy_version 641748 (0.0011) [2024-06-15 19:47:10,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45875.5, 300 sec: 43653.7). Total num frames: 1314291712. Throughput: 0: 11377.8. Samples: 328648192. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:47:12,444][1653645] Updated weights for policy 0, policy_version 641808 (0.0018) [2024-06-15 19:47:14,972][1653645] Updated weights for policy 0, policy_version 641872 (0.0014) [2024-06-15 19:47:15,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45329.0, 300 sec: 43764.7). Total num frames: 1314619392. Throughput: 0: 11446.2. Samples: 328706048. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:47:20,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 44236.7, 300 sec: 43209.3). Total num frames: 1314684928. Throughput: 0: 11366.4. Samples: 328740864. 
Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:20,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:47:21,222][1653645] Updated weights for policy 0, policy_version 641955 (0.0041) [2024-06-15 19:47:22,992][1653645] Updated weights for policy 0, policy_version 642002 (0.0013) [2024-06-15 19:47:25,269][1653645] Updated weights for policy 0, policy_version 642096 (0.0020) [2024-06-15 19:47:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 46421.3, 300 sec: 43986.9). Total num frames: 1315045376. Throughput: 0: 11252.6. Samples: 328802816. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:47:28,507][1653645] Updated weights for policy 0, policy_version 642175 (0.0020) [2024-06-15 19:47:30,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.9, 300 sec: 43320.4). Total num frames: 1315176448. Throughput: 0: 11229.8. Samples: 328875520. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:47:32,940][1653645] Updated weights for policy 0, policy_version 642233 (0.0014) [2024-06-15 19:47:35,894][1653645] Updated weights for policy 0, policy_version 642304 (0.0015) [2024-06-15 19:47:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1315438592. Throughput: 0: 11093.3. Samples: 328906752. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:47:36,036][1651596] Signal inference workers to stop experience collection... (33400 times) [2024-06-15 19:47:36,063][1653645] InferenceWorker_p0-w0: stopping experience collection (33400 times) [2024-06-15 19:47:36,320][1651596] Signal inference workers to resume experience collection... 
(33400 times) [2024-06-15 19:47:36,321][1653645] InferenceWorker_p0-w0: resuming experience collection (33400 times) [2024-06-15 19:47:37,436][1653645] Updated weights for policy 0, policy_version 642365 (0.0011) [2024-06-15 19:47:40,656][1653645] Updated weights for policy 0, policy_version 642416 (0.0011) [2024-06-15 19:47:40,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1315700736. Throughput: 0: 10968.2. Samples: 328965632. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:47:44,427][1653645] Updated weights for policy 0, policy_version 642480 (0.0013) [2024-06-15 19:47:45,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45875.3, 300 sec: 43543.2). Total num frames: 1315831808. Throughput: 0: 11002.3. Samples: 329042944. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:47:46,613][1653645] Updated weights for policy 0, policy_version 642498 (0.0012) [2024-06-15 19:47:48,463][1653645] Updated weights for policy 0, policy_version 642576 (0.0014) [2024-06-15 19:47:49,405][1653645] Updated weights for policy 0, policy_version 642623 (0.0012) [2024-06-15 19:47:50,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1316126720. Throughput: 0: 10990.9. Samples: 329070592. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:47:55,215][1653645] Updated weights for policy 0, policy_version 642689 (0.0013) [2024-06-15 19:47:55,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 44783.0, 300 sec: 43653.6). Total num frames: 1316290560. Throughput: 0: 11002.3. Samples: 329143296. 
Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:47:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:47:56,687][1653645] Updated weights for policy 0, policy_version 642752 (0.0050) [2024-06-15 19:48:00,032][1653645] Updated weights for policy 0, policy_version 642832 (0.0172) [2024-06-15 19:48:00,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1316618240. Throughput: 0: 11036.5. Samples: 329202688. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:48:02,141][1653645] Updated weights for policy 0, policy_version 642896 (0.0091) [2024-06-15 19:48:05,997][1648982] Fps is (10 sec: 45696.0, 60 sec: 43662.1, 300 sec: 43869.9). Total num frames: 1316749312. Throughput: 0: 11004.1. Samples: 329236480. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:05,998][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:48:07,211][1653645] Updated weights for policy 0, policy_version 642960 (0.0013) [2024-06-15 19:48:08,152][1653645] Updated weights for policy 0, policy_version 643004 (0.0012) [2024-06-15 19:48:10,576][1653645] Updated weights for policy 0, policy_version 643056 (0.0161) [2024-06-15 19:48:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1317011456. Throughput: 0: 11502.9. Samples: 329320448. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:10,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:48:12,243][1653645] Updated weights for policy 0, policy_version 643136 (0.0092) [2024-06-15 19:48:13,505][1653645] Updated weights for policy 0, policy_version 643197 (0.0026) [2024-06-15 19:48:15,958][1648982] Fps is (10 sec: 52633.8, 60 sec: 44236.6, 300 sec: 44542.2). Total num frames: 1317273600. Throughput: 0: 11377.7. Samples: 329387520. 
Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:48:19,251][1653645] Updated weights for policy 0, policy_version 643255 (0.0154) [2024-06-15 19:48:20,476][1651596] Signal inference workers to stop experience collection... (33450 times) [2024-06-15 19:48:20,530][1653645] InferenceWorker_p0-w0: stopping experience collection (33450 times) [2024-06-15 19:48:20,666][1651596] Signal inference workers to resume experience collection... (33450 times) [2024-06-15 19:48:20,667][1653645] InferenceWorker_p0-w0: resuming experience collection (33450 times) [2024-06-15 19:48:20,669][1653645] Updated weights for policy 0, policy_version 643296 (0.0045) [2024-06-15 19:48:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 46421.5, 300 sec: 44209.0). Total num frames: 1317470208. Throughput: 0: 11719.1. Samples: 329434112. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:48:21,911][1653645] Updated weights for policy 0, policy_version 643347 (0.0211) [2024-06-15 19:48:23,360][1653645] Updated weights for policy 0, policy_version 643410 (0.0014) [2024-06-15 19:48:25,958][1648982] Fps is (10 sec: 52430.7, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1317797888. Throughput: 0: 11616.7. Samples: 329488384. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:25,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:48:29,853][1653645] Updated weights for policy 0, policy_version 643473 (0.0012) [2024-06-15 19:48:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45329.2, 300 sec: 43986.9). Total num frames: 1317896192. Throughput: 0: 11741.8. Samples: 329571328. 
Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:30,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 19:48:31,863][1653645] Updated weights for policy 0, policy_version 643557 (0.0109) [2024-06-15 19:48:33,033][1653645] Updated weights for policy 0, policy_version 643616 (0.0014) [2024-06-15 19:48:35,285][1653645] Updated weights for policy 0, policy_version 643696 (0.0015) [2024-06-15 19:48:35,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 48059.8, 300 sec: 44875.5). Total num frames: 1318322176. Throughput: 0: 11650.9. Samples: 329594880. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:48:40,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1318322176. Throughput: 0: 11730.5. Samples: 329671168. Policy #0 lag: (min: 15.0, avg: 103.2, max: 271.0) [2024-06-15 19:48:40,958][1648982] Avg episode reward: [(0, '36.820')] [2024-06-15 19:48:41,530][1653645] Updated weights for policy 0, policy_version 643744 (0.0012) [2024-06-15 19:48:42,859][1653645] Updated weights for policy 0, policy_version 643808 (0.0012) [2024-06-15 19:48:44,693][1653645] Updated weights for policy 0, policy_version 643890 (0.0013) [2024-06-15 19:48:45,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 48605.8, 300 sec: 44986.6). Total num frames: 1318748160. Throughput: 0: 11867.0. Samples: 329736704. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:48:45,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 19:48:46,232][1653645] Updated weights for policy 0, policy_version 643938 (0.0013) [2024-06-15 19:48:50,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45329.0, 300 sec: 44542.3). Total num frames: 1318846464. Throughput: 0: 11854.6. Samples: 329769472. 
Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:48:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:48:53,336][1653645] Updated weights for policy 0, policy_version 644016 (0.0021) [2024-06-15 19:48:54,533][1653645] Updated weights for policy 0, policy_version 644069 (0.0011) [2024-06-15 19:48:55,824][1653645] Updated weights for policy 0, policy_version 644128 (0.0014) [2024-06-15 19:48:55,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 48059.7, 300 sec: 44986.6). Total num frames: 1319174144. Throughput: 0: 11685.0. Samples: 329846272. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:48:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:48:56,496][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000644160_1319239680.pth... [2024-06-15 19:48:56,668][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000638848_1308360704.pth [2024-06-15 19:48:57,724][1653645] Updated weights for policy 0, policy_version 644214 (0.0125) [2024-06-15 19:49:00,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1319370752. Throughput: 0: 11651.0. Samples: 329911808. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:49:03,703][1651596] Signal inference workers to stop experience collection... (33500 times) [2024-06-15 19:49:03,740][1653645] InferenceWorker_p0-w0: stopping experience collection (33500 times) [2024-06-15 19:49:03,913][1651596] Signal inference workers to resume experience collection... 
(33500 times) [2024-06-15 19:49:03,914][1653645] InferenceWorker_p0-w0: resuming experience collection (33500 times) [2024-06-15 19:49:04,282][1653645] Updated weights for policy 0, policy_version 644256 (0.0013) [2024-06-15 19:49:05,935][1653645] Updated weights for policy 0, policy_version 644327 (0.0014) [2024-06-15 19:49:05,963][1648982] Fps is (10 sec: 39303.1, 60 sec: 46994.5, 300 sec: 44874.8). Total num frames: 1319567360. Throughput: 0: 11558.6. Samples: 329954304. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:05,963][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:49:07,055][1653645] Updated weights for policy 0, policy_version 644374 (0.0060) [2024-06-15 19:49:08,596][1653645] Updated weights for policy 0, policy_version 644437 (0.0014) [2024-06-15 19:49:10,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 48059.7, 300 sec: 44875.5). Total num frames: 1319895040. Throughput: 0: 11628.1. Samples: 330011648. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:49:15,219][1653645] Updated weights for policy 0, policy_version 644483 (0.0014) [2024-06-15 19:49:15,958][1648982] Fps is (10 sec: 39340.0, 60 sec: 44783.1, 300 sec: 44653.3). Total num frames: 1319960576. Throughput: 0: 11525.7. Samples: 330089984. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:15,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:49:17,313][1653645] Updated weights for policy 0, policy_version 644580 (0.0015) [2024-06-15 19:49:18,093][1653645] Updated weights for policy 0, policy_version 644624 (0.0013) [2024-06-15 19:49:19,066][1653645] Updated weights for policy 0, policy_version 644671 (0.0010) [2024-06-15 19:49:20,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 48059.7, 300 sec: 44986.6). Total num frames: 1320353792. Throughput: 0: 11616.7. Samples: 330117632. 
Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:49:20,988][1653645] Updated weights for policy 0, policy_version 644720 (0.0013) [2024-06-15 19:49:25,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1320419328. Throughput: 0: 11525.7. Samples: 330189824. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 19:49:26,786][1653645] Updated weights for policy 0, policy_version 644752 (0.0023) [2024-06-15 19:49:27,895][1653645] Updated weights for policy 0, policy_version 644801 (0.0012) [2024-06-15 19:49:29,007][1653645] Updated weights for policy 0, policy_version 644864 (0.0013) [2024-06-15 19:49:30,129][1653645] Updated weights for policy 0, policy_version 644915 (0.0012) [2024-06-15 19:49:30,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 48606.0, 300 sec: 45208.7). Total num frames: 1320812544. Throughput: 0: 11650.9. Samples: 330260992. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:49:31,992][1653645] Updated weights for policy 0, policy_version 644960 (0.0013) [2024-06-15 19:49:35,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1320943616. Throughput: 0: 11707.8. Samples: 330296320. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:35,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:49:37,161][1653645] Updated weights for policy 0, policy_version 644994 (0.0010) [2024-06-15 19:49:38,015][1653645] Updated weights for policy 0, policy_version 645040 (0.0012) [2024-06-15 19:49:38,883][1653645] Updated weights for policy 0, policy_version 645075 (0.0041) [2024-06-15 19:49:40,558][1651596] Signal inference workers to stop experience collection... 
(33550 times) [2024-06-15 19:49:40,608][1653645] Updated weights for policy 0, policy_version 645156 (0.0016) [2024-06-15 19:49:40,651][1653645] InferenceWorker_p0-w0: stopping experience collection (33550 times) [2024-06-15 19:49:40,819][1651596] Signal inference workers to resume experience collection... (33550 times) [2024-06-15 19:49:40,820][1653645] InferenceWorker_p0-w0: resuming experience collection (33550 times) [2024-06-15 19:49:40,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 49698.0, 300 sec: 45653.0). Total num frames: 1321304064. Throughput: 0: 11616.7. Samples: 330369024. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:49:43,718][1653645] Updated weights for policy 0, policy_version 645232 (0.0089) [2024-06-15 19:49:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1321467904. Throughput: 0: 11662.2. Samples: 330436608. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:49:49,046][1653645] Updated weights for policy 0, policy_version 645285 (0.0021) [2024-06-15 19:49:50,105][1653645] Updated weights for policy 0, policy_version 645333 (0.0013) [2024-06-15 19:49:50,958][1648982] Fps is (10 sec: 42595.9, 60 sec: 48059.3, 300 sec: 45319.7). Total num frames: 1321730048. Throughput: 0: 11583.6. Samples: 330475520. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:49:51,358][1653645] Updated weights for policy 0, policy_version 645379 (0.0013) [2024-06-15 19:49:52,468][1653645] Updated weights for policy 0, policy_version 645440 (0.0013) [2024-06-15 19:49:55,335][1653645] Updated weights for policy 0, policy_version 645500 (0.0014) [2024-06-15 19:49:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 46967.5, 300 sec: 45653.0). Total num frames: 1321992192. 
Throughput: 0: 11753.2. Samples: 330540544. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:49:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:50:00,329][1653645] Updated weights for policy 0, policy_version 645562 (0.0012) [2024-06-15 19:50:00,958][1648982] Fps is (10 sec: 39324.3, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1322123264. Throughput: 0: 11582.6. Samples: 330611200. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:50:00,961][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:50:02,059][1653645] Updated weights for policy 0, policy_version 645620 (0.0152) [2024-06-15 19:50:04,008][1653645] Updated weights for policy 0, policy_version 645685 (0.0100) [2024-06-15 19:50:05,333][1653645] Updated weights for policy 0, policy_version 645712 (0.0011) [2024-06-15 19:50:05,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 48063.6, 300 sec: 45875.2). Total num frames: 1322450944. Throughput: 0: 11707.7. Samples: 330644480. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:50:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:50:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1322516480. Throughput: 0: 11616.7. Samples: 330712576. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:50:10,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:50:11,039][1653645] Updated weights for policy 0, policy_version 645764 (0.0026) [2024-06-15 19:50:11,988][1653645] Updated weights for policy 0, policy_version 645818 (0.0030) [2024-06-15 19:50:13,780][1653645] Updated weights for policy 0, policy_version 645885 (0.0013) [2024-06-15 19:50:15,286][1653645] Updated weights for policy 0, policy_version 645936 (0.0012) [2024-06-15 19:50:15,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 49152.0, 300 sec: 45764.1). Total num frames: 1322909696. Throughput: 0: 11605.3. Samples: 330783232. 
Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:50:15,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:50:15,970][1653645] Updated weights for policy 0, policy_version 645953 (0.0011) [2024-06-15 19:50:17,468][1653645] Updated weights for policy 0, policy_version 646012 (0.0013) [2024-06-15 19:50:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 1323040768. Throughput: 0: 11582.6. Samples: 330817536. Policy #0 lag: (min: 15.0, avg: 69.2, max: 271.0) [2024-06-15 19:50:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:50:23,119][1653645] Updated weights for policy 0, policy_version 646054 (0.0014) [2024-06-15 19:50:24,734][1653645] Updated weights for policy 0, policy_version 646119 (0.0013) [2024-06-15 19:50:25,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 48059.6, 300 sec: 45319.8). Total num frames: 1323302912. Throughput: 0: 11514.3. Samples: 330887168. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:50:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:50:27,095][1653645] Updated weights for policy 0, policy_version 646176 (0.0013) [2024-06-15 19:50:27,589][1651596] Signal inference workers to stop experience collection... (33600 times) [2024-06-15 19:50:27,638][1653645] InferenceWorker_p0-w0: stopping experience collection (33600 times) [2024-06-15 19:50:27,793][1651596] Signal inference workers to resume experience collection... (33600 times) [2024-06-15 19:50:27,794][1653645] InferenceWorker_p0-w0: resuming experience collection (33600 times) [2024-06-15 19:50:28,666][1653645] Updated weights for policy 0, policy_version 646240 (0.0013) [2024-06-15 19:50:30,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.1, 300 sec: 45431.0). Total num frames: 1323565056. Throughput: 0: 11434.7. Samples: 330951168. 
Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:50:30,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 19:50:33,633][1653645] Updated weights for policy 0, policy_version 646274 (0.0010) [2024-06-15 19:50:34,756][1653645] Updated weights for policy 0, policy_version 646336 (0.0021) [2024-06-15 19:50:35,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1323696128. Throughput: 0: 11469.0. Samples: 330991616. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:50:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:50:37,098][1653645] Updated weights for policy 0, policy_version 646389 (0.0027) [2024-06-15 19:50:39,413][1653645] Updated weights for policy 0, policy_version 646449 (0.0013) [2024-06-15 19:50:40,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45875.3, 300 sec: 45986.3). Total num frames: 1324056576. Throughput: 0: 11286.8. Samples: 331048448. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:50:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:50:41,117][1653645] Updated weights for policy 0, policy_version 646518 (0.0025) [2024-06-15 19:50:45,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 44236.6, 300 sec: 44986.5). Total num frames: 1324122112. Throughput: 0: 11355.0. Samples: 331122176. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:50:45,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:50:45,976][1653645] Updated weights for policy 0, policy_version 646545 (0.0013) [2024-06-15 19:50:48,369][1653645] Updated weights for policy 0, policy_version 646600 (0.0134) [2024-06-15 19:50:49,517][1653645] Updated weights for policy 0, policy_version 646649 (0.0011) [2024-06-15 19:50:50,888][1653645] Updated weights for policy 0, policy_version 646692 (0.0011) [2024-06-15 19:50:50,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44783.4, 300 sec: 45542.0). Total num frames: 1324417024. 
Throughput: 0: 11343.6. Samples: 331154944. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:50:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:50:52,582][1653645] Updated weights for policy 0, policy_version 646754 (0.0012) [2024-06-15 19:50:55,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.5, 300 sec: 45430.9). Total num frames: 1324613632. Throughput: 0: 11172.9. Samples: 331215360. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:50:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:50:55,968][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000646784_1324613632.pth... [2024-06-15 19:50:56,023][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000641408_1313603584.pth [2024-06-15 19:50:56,028][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000646784_1324613632.pth [2024-06-15 19:50:57,786][1653645] Updated weights for policy 0, policy_version 646800 (0.0020) [2024-06-15 19:51:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1324777472. Throughput: 0: 11195.8. Samples: 331287040. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:00,958][1648982] Avg episode reward: [(0, '37.550')] [2024-06-15 19:51:01,157][1653645] Updated weights for policy 0, policy_version 646881 (0.0015) [2024-06-15 19:51:02,372][1653645] Updated weights for policy 0, policy_version 646928 (0.0011) [2024-06-15 19:51:04,178][1653645] Updated weights for policy 0, policy_version 647008 (0.0162) [2024-06-15 19:51:05,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 44782.9, 300 sec: 46097.4). Total num frames: 1325137920. Throughput: 0: 10911.3. Samples: 331308544. 
Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:51:09,501][1653645] Updated weights for policy 0, policy_version 647056 (0.0013) [2024-06-15 19:51:10,961][1648982] Fps is (10 sec: 49135.8, 60 sec: 45872.7, 300 sec: 45319.3). Total num frames: 1325268992. Throughput: 0: 11081.2. Samples: 331385856. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:10,962][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:51:13,079][1653645] Updated weights for policy 0, policy_version 647120 (0.0013) [2024-06-15 19:51:13,609][1651596] Signal inference workers to stop experience collection... (33650 times) [2024-06-15 19:51:13,648][1653645] InferenceWorker_p0-w0: stopping experience collection (33650 times) [2024-06-15 19:51:13,892][1651596] Signal inference workers to resume experience collection... (33650 times) [2024-06-15 19:51:13,893][1653645] InferenceWorker_p0-w0: resuming experience collection (33650 times) [2024-06-15 19:51:14,552][1653645] Updated weights for policy 0, policy_version 647184 (0.0013) [2024-06-15 19:51:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.8, 300 sec: 45764.1). Total num frames: 1325531136. Throughput: 0: 10922.7. Samples: 331442688. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:51:16,413][1653645] Updated weights for policy 0, policy_version 647264 (0.0100) [2024-06-15 19:51:20,896][1653645] Updated weights for policy 0, policy_version 647298 (0.0013) [2024-06-15 19:51:20,958][1648982] Fps is (10 sec: 39335.0, 60 sec: 43690.7, 300 sec: 45430.9). Total num frames: 1325662208. Throughput: 0: 10740.6. Samples: 331474944. 
Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:51:22,261][1653645] Updated weights for policy 0, policy_version 647354 (0.0024) [2024-06-15 19:51:25,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.7, 300 sec: 45208.8). Total num frames: 1325891584. Throughput: 0: 11116.1. Samples: 331548672. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:51:26,275][1653645] Updated weights for policy 0, policy_version 647424 (0.0132) [2024-06-15 19:51:27,838][1653645] Updated weights for policy 0, policy_version 647488 (0.0013) [2024-06-15 19:51:30,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 43690.5, 300 sec: 45764.1). Total num frames: 1326186496. Throughput: 0: 10831.6. Samples: 331609600. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:51:33,123][1653645] Updated weights for policy 0, policy_version 647554 (0.0014) [2024-06-15 19:51:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1326317568. Throughput: 0: 11013.7. Samples: 331650560. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:51:36,941][1653645] Updated weights for policy 0, policy_version 647620 (0.0124) [2024-06-15 19:51:38,357][1653645] Updated weights for policy 0, policy_version 647683 (0.0013) [2024-06-15 19:51:39,531][1653645] Updated weights for policy 0, policy_version 647744 (0.0013) [2024-06-15 19:51:40,958][1648982] Fps is (10 sec: 49153.8, 60 sec: 43690.7, 300 sec: 46097.4). Total num frames: 1326678016. Throughput: 0: 10991.0. Samples: 331709952. 
Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:40,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:51:41,071][1653645] Updated weights for policy 0, policy_version 647806 (0.0012) [2024-06-15 19:51:45,852][1653645] Updated weights for policy 0, policy_version 647856 (0.0013) [2024-06-15 19:51:45,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 44783.1, 300 sec: 45097.7). Total num frames: 1326809088. Throughput: 0: 11025.1. Samples: 331783168. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:51:48,764][1653645] Updated weights for policy 0, policy_version 647904 (0.0013) [2024-06-15 19:51:50,026][1653645] Updated weights for policy 0, policy_version 647968 (0.0016) [2024-06-15 19:51:50,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44783.0, 300 sec: 45764.2). Total num frames: 1327104000. Throughput: 0: 11286.8. Samples: 331816448. Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:51:51,605][1653645] Updated weights for policy 0, policy_version 648032 (0.0017) [2024-06-15 19:51:51,764][1651596] Signal inference workers to stop experience collection... (33700 times) [2024-06-15 19:51:51,814][1653645] InferenceWorker_p0-w0: stopping experience collection (33700 times) [2024-06-15 19:51:51,990][1651596] Signal inference workers to resume experience collection... (33700 times) [2024-06-15 19:51:51,992][1653645] InferenceWorker_p0-w0: resuming experience collection (33700 times) [2024-06-15 19:51:55,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44783.1, 300 sec: 45430.9). Total num frames: 1327300608. Throughput: 0: 11185.2. Samples: 331889152. 
Policy #0 lag: (min: 14.0, avg: 89.4, max: 270.0) [2024-06-15 19:51:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:51:56,097][1653645] Updated weights for policy 0, policy_version 648097 (0.0013) [2024-06-15 19:51:59,109][1653645] Updated weights for policy 0, policy_version 648165 (0.0075) [2024-06-15 19:52:00,877][1653645] Updated weights for policy 0, policy_version 648230 (0.0013) [2024-06-15 19:52:00,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1327562752. Throughput: 0: 11491.5. Samples: 331959808. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:52:05,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 45653.0). Total num frames: 1327759360. Throughput: 0: 11400.5. Samples: 331987968. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:05,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 19:52:06,895][1653645] Updated weights for policy 0, policy_version 648323 (0.0014) [2024-06-15 19:52:08,227][1653645] Updated weights for policy 0, policy_version 648380 (0.0012) [2024-06-15 19:52:10,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45331.6, 300 sec: 45319.8). Total num frames: 1327988736. Throughput: 0: 11537.1. Samples: 332067840. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:52:11,064][1653645] Updated weights for policy 0, policy_version 648445 (0.0013) [2024-06-15 19:52:12,416][1653645] Updated weights for policy 0, policy_version 648496 (0.0015) [2024-06-15 19:52:14,162][1653645] Updated weights for policy 0, policy_version 648564 (0.0150) [2024-06-15 19:52:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 46097.4). Total num frames: 1328283648. Throughput: 0: 11457.5. Samples: 332125184. 
Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:52:19,456][1653645] Updated weights for policy 0, policy_version 648624 (0.0013) [2024-06-15 19:52:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 1328414720. Throughput: 0: 11559.8. Samples: 332170752. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:52:22,513][1653645] Updated weights for policy 0, policy_version 648695 (0.0106) [2024-06-15 19:52:23,700][1653645] Updated weights for policy 0, policy_version 648736 (0.0051) [2024-06-15 19:52:25,110][1653645] Updated weights for policy 0, policy_version 648785 (0.0013) [2024-06-15 19:52:25,868][1653645] Updated weights for policy 0, policy_version 648830 (0.0012) [2024-06-15 19:52:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 48605.9, 300 sec: 46208.5). Total num frames: 1328807936. Throughput: 0: 11537.0. Samples: 332229120. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:52:30,808][1653645] Updated weights for policy 0, policy_version 648883 (0.0015) [2024-06-15 19:52:30,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 45329.2, 300 sec: 45653.0). Total num frames: 1328906240. Throughput: 0: 11639.4. Samples: 332306944. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:52:33,506][1653645] Updated weights for policy 0, policy_version 648950 (0.0012) [2024-06-15 19:52:34,877][1651596] Signal inference workers to stop experience collection... (33750 times) [2024-06-15 19:52:34,914][1651596] Signal inference workers to resume experience collection... 
(33750 times) [2024-06-15 19:52:34,925][1653645] InferenceWorker_p0-w0: stopping experience collection (33750 times) [2024-06-15 19:52:34,965][1653645] InferenceWorker_p0-w0: resuming experience collection (33750 times) [2024-06-15 19:52:35,184][1653645] Updated weights for policy 0, policy_version 649032 (0.0086) [2024-06-15 19:52:35,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 49152.1, 300 sec: 45986.3). Total num frames: 1329266688. Throughput: 0: 11798.8. Samples: 332347392. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:52:36,332][1653645] Updated weights for policy 0, policy_version 649088 (0.0013) [2024-06-15 19:52:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.7, 300 sec: 45875.2). Total num frames: 1329364992. Throughput: 0: 11684.9. Samples: 332414976. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:40,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:52:41,889][1653645] Updated weights for policy 0, policy_version 649144 (0.0041) [2024-06-15 19:52:45,073][1653645] Updated weights for policy 0, policy_version 649207 (0.0142) [2024-06-15 19:52:45,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 47513.5, 300 sec: 45875.2). Total num frames: 1329659904. Throughput: 0: 11662.2. Samples: 332484608. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:52:47,520][1653645] Updated weights for policy 0, policy_version 649313 (0.0013) [2024-06-15 19:52:50,958][1648982] Fps is (10 sec: 49153.1, 60 sec: 45875.1, 300 sec: 45986.3). Total num frames: 1329856512. Throughput: 0: 11502.9. Samples: 332505600. 
Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:52:53,273][1653645] Updated weights for policy 0, policy_version 649376 (0.0011) [2024-06-15 19:52:55,607][1653645] Updated weights for policy 0, policy_version 649426 (0.0012) [2024-06-15 19:52:55,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 45875.0, 300 sec: 45541.9). Total num frames: 1330053120. Throughput: 0: 11514.2. Samples: 332585984. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:52:55,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 19:52:56,230][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000649456_1330085888.pth... [2024-06-15 19:52:56,344][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000644160_1319239680.pth [2024-06-15 19:52:56,825][1653645] Updated weights for policy 0, policy_version 649488 (0.0014) [2024-06-15 19:52:58,803][1653645] Updated weights for policy 0, policy_version 649556 (0.0013) [2024-06-15 19:53:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 46967.5, 300 sec: 46214.6). Total num frames: 1330380800. Throughput: 0: 11434.7. Samples: 332639744. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:53:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:53:04,958][1653645] Updated weights for policy 0, policy_version 649632 (0.0014) [2024-06-15 19:53:05,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45875.0, 300 sec: 45764.1). Total num frames: 1330511872. Throughput: 0: 11366.3. Samples: 332682240. 
Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:53:05,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 19:53:07,730][1653645] Updated weights for policy 0, policy_version 649685 (0.0014) [2024-06-15 19:53:09,589][1653645] Updated weights for policy 0, policy_version 649760 (0.0013) [2024-06-15 19:53:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 46421.3, 300 sec: 45764.2). Total num frames: 1330774016. Throughput: 0: 11571.2. Samples: 332749824. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:53:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:53:12,058][1653645] Updated weights for policy 0, policy_version 649840 (0.0012) [2024-06-15 19:53:15,966][1648982] Fps is (10 sec: 39289.2, 60 sec: 43684.5, 300 sec: 45540.7). Total num frames: 1330905088. Throughput: 0: 11205.0. Samples: 332811264. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:53:15,967][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:53:16,853][1653645] Updated weights for policy 0, policy_version 649892 (0.0011) [2024-06-15 19:53:19,903][1651596] Signal inference workers to stop experience collection... (33800 times) [2024-06-15 19:53:19,942][1653645] InferenceWorker_p0-w0: stopping experience collection (33800 times) [2024-06-15 19:53:19,945][1653645] Updated weights for policy 0, policy_version 649937 (0.0013) [2024-06-15 19:53:20,217][1651596] Signal inference workers to resume experience collection... (33800 times) [2024-06-15 19:53:20,218][1653645] InferenceWorker_p0-w0: resuming experience collection (33800 times) [2024-06-15 19:53:20,962][1648982] Fps is (10 sec: 39303.4, 60 sec: 45871.7, 300 sec: 45319.1). Total num frames: 1331167232. Throughput: 0: 11160.4. Samples: 332849664. 
Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:53:20,963][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:53:21,529][1653645] Updated weights for policy 0, policy_version 650001 (0.0017) [2024-06-15 19:53:23,089][1653645] Updated weights for policy 0, policy_version 650064 (0.0013) [2024-06-15 19:53:24,242][1653645] Updated weights for policy 0, policy_version 650107 (0.0019) [2024-06-15 19:53:25,958][1648982] Fps is (10 sec: 52472.2, 60 sec: 43690.5, 300 sec: 45875.2). Total num frames: 1331429376. Throughput: 0: 10956.8. Samples: 332908032. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:53:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:53:28,578][1653645] Updated weights for policy 0, policy_version 650146 (0.0030) [2024-06-15 19:53:30,958][1648982] Fps is (10 sec: 39339.6, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 1331560448. Throughput: 0: 11013.7. Samples: 332980224. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:53:30,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 19:53:32,087][1653645] Updated weights for policy 0, policy_version 650200 (0.0012) [2024-06-15 19:53:33,198][1653645] Updated weights for policy 0, policy_version 650245 (0.0012) [2024-06-15 19:53:35,408][1653645] Updated weights for policy 0, policy_version 650322 (0.0013) [2024-06-15 19:53:35,958][1648982] Fps is (10 sec: 49153.5, 60 sec: 44236.8, 300 sec: 46097.4). Total num frames: 1331920896. Throughput: 0: 11275.4. Samples: 333012992. Policy #0 lag: (min: 63.0, avg: 148.8, max: 319.0) [2024-06-15 19:53:35,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:53:40,252][1653645] Updated weights for policy 0, policy_version 650387 (0.0013) [2024-06-15 19:53:40,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1332084736. Throughput: 0: 10956.8. Samples: 333079040. 
Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:53:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:53:43,652][1653645] Updated weights for policy 0, policy_version 650449 (0.0014) [2024-06-15 19:53:44,675][1653645] Updated weights for policy 0, policy_version 650496 (0.0013) [2024-06-15 19:53:45,698][1653645] Updated weights for policy 0, policy_version 650533 (0.0011) [2024-06-15 19:53:45,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44236.8, 300 sec: 45653.1). Total num frames: 1332314112. Throughput: 0: 11298.1. Samples: 333148160. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:53:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:53:47,733][1653645] Updated weights for policy 0, policy_version 650608 (0.0013) [2024-06-15 19:53:50,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 43690.7, 300 sec: 45097.7). Total num frames: 1332477952. Throughput: 0: 10854.5. Samples: 333170688. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:53:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:53:52,545][1653645] Updated weights for policy 0, policy_version 650672 (0.0013) [2024-06-15 19:53:55,862][1653645] Updated weights for policy 0, policy_version 650723 (0.0012) [2024-06-15 19:53:55,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.9, 300 sec: 45097.6). Total num frames: 1332674560. Throughput: 0: 10922.7. Samples: 333241344. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:53:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:53:57,448][1653645] Updated weights for policy 0, policy_version 650768 (0.0024) [2024-06-15 19:53:59,656][1653645] Updated weights for policy 0, policy_version 650848 (0.0013) [2024-06-15 19:54:00,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.5, 300 sec: 45542.7). Total num frames: 1333002240. Throughput: 0: 10822.3. Samples: 333298176. 
Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:00,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:54:03,903][1651596] Signal inference workers to stop experience collection... (33850 times) [2024-06-15 19:54:03,984][1653645] InferenceWorker_p0-w0: stopping experience collection (33850 times) [2024-06-15 19:54:04,084][1651596] Signal inference workers to resume experience collection... (33850 times) [2024-06-15 19:54:04,084][1653645] InferenceWorker_p0-w0: resuming experience collection (33850 times) [2024-06-15 19:54:04,086][1653645] Updated weights for policy 0, policy_version 650896 (0.0014) [2024-06-15 19:54:05,031][1653645] Updated weights for policy 0, policy_version 650944 (0.0016) [2024-06-15 19:54:05,958][1648982] Fps is (10 sec: 45872.6, 60 sec: 43690.4, 300 sec: 44875.4). Total num frames: 1333133312. Throughput: 0: 10798.5. Samples: 333335552. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:54:07,881][1653645] Updated weights for policy 0, policy_version 651001 (0.0102) [2024-06-15 19:54:09,092][1653645] Updated weights for policy 0, policy_version 651025 (0.0012) [2024-06-15 19:54:10,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.7, 300 sec: 45653.0). Total num frames: 1333428224. Throughput: 0: 11059.2. Samples: 333405696. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:10,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 19:54:11,384][1653645] Updated weights for policy 0, policy_version 651110 (0.0110) [2024-06-15 19:54:11,976][1653645] Updated weights for policy 0, policy_version 651136 (0.0012) [2024-06-15 19:54:15,958][1648982] Fps is (10 sec: 42600.6, 60 sec: 44243.0, 300 sec: 44764.4). Total num frames: 1333559296. Throughput: 0: 10956.8. Samples: 333473280. 
Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:54:19,037][1653645] Updated weights for policy 0, policy_version 651216 (0.0014) [2024-06-15 19:54:20,361][1653645] Updated weights for policy 0, policy_version 651265 (0.0077) [2024-06-15 19:54:20,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44240.0, 300 sec: 45430.9). Total num frames: 1333821440. Throughput: 0: 11025.0. Samples: 333509120. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:54:21,616][1653645] Updated weights for policy 0, policy_version 651319 (0.0013) [2024-06-15 19:54:22,938][1653645] Updated weights for policy 0, policy_version 651365 (0.0023) [2024-06-15 19:54:25,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1334050816. Throughput: 0: 10934.0. Samples: 333571072. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:54:27,851][1653645] Updated weights for policy 0, policy_version 651424 (0.0156) [2024-06-15 19:54:30,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1334181888. Throughput: 0: 10888.5. Samples: 333638144. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:30,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:54:31,736][1653645] Updated weights for policy 0, policy_version 651488 (0.0013) [2024-06-15 19:54:34,078][1653645] Updated weights for policy 0, policy_version 651568 (0.0011) [2024-06-15 19:54:34,932][1653645] Updated weights for policy 0, policy_version 651600 (0.0010) [2024-06-15 19:54:35,958][1648982] Fps is (10 sec: 49153.1, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1334542336. Throughput: 0: 10888.5. Samples: 333660672. 
Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:54:40,507][1653645] Updated weights for policy 0, policy_version 651650 (0.0011) [2024-06-15 19:54:40,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 44542.2). Total num frames: 1334607872. Throughput: 0: 10786.1. Samples: 333726720. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:54:43,884][1653645] Updated weights for policy 0, policy_version 651713 (0.0015) [2024-06-15 19:54:45,449][1653645] Updated weights for policy 0, policy_version 651777 (0.0011) [2024-06-15 19:54:45,958][1648982] Fps is (10 sec: 32766.9, 60 sec: 42598.2, 300 sec: 44542.3). Total num frames: 1334870016. Throughput: 0: 10945.4. Samples: 333790720. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:45,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:54:46,630][1651596] Signal inference workers to stop experience collection... (33900 times) [2024-06-15 19:54:46,715][1653645] InferenceWorker_p0-w0: stopping experience collection (33900 times) [2024-06-15 19:54:46,850][1651596] Signal inference workers to resume experience collection... (33900 times) [2024-06-15 19:54:46,882][1653645] InferenceWorker_p0-w0: resuming experience collection (33900 times) [2024-06-15 19:54:47,451][1653645] Updated weights for policy 0, policy_version 651861 (0.0012) [2024-06-15 19:54:50,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1335099392. Throughput: 0: 10672.5. Samples: 333815808. 
Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:54:52,829][1653645] Updated weights for policy 0, policy_version 651920 (0.0012) [2024-06-15 19:54:54,001][1653645] Updated weights for policy 0, policy_version 651965 (0.0061) [2024-06-15 19:54:55,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.2, 300 sec: 44431.1). Total num frames: 1335230464. Throughput: 0: 10626.8. Samples: 333883904. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:54:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:54:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000651968_1335230464.pth... [2024-06-15 19:54:56,226][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000646784_1324613632.pth [2024-06-15 19:54:57,758][1653645] Updated weights for policy 0, policy_version 652035 (0.0098) [2024-06-15 19:54:59,215][1653645] Updated weights for policy 0, policy_version 652101 (0.0014) [2024-06-15 19:55:00,268][1653645] Updated weights for policy 0, policy_version 652156 (0.0014) [2024-06-15 19:55:00,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.8, 300 sec: 44653.4). Total num frames: 1335623680. Throughput: 0: 10581.4. Samples: 333949440. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:55:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:55:05,269][1653645] Updated weights for policy 0, policy_version 652208 (0.0012) [2024-06-15 19:55:05,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43691.2, 300 sec: 44875.5). Total num frames: 1335754752. Throughput: 0: 10638.3. Samples: 333987840. 
Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:55:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:55:07,600][1653645] Updated weights for policy 0, policy_version 652246 (0.0013) [2024-06-15 19:55:08,848][1653645] Updated weights for policy 0, policy_version 652289 (0.0014) [2024-06-15 19:55:10,228][1653645] Updated weights for policy 0, policy_version 652348 (0.0014) [2024-06-15 19:55:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.7, 300 sec: 44431.2). Total num frames: 1336016896. Throughput: 0: 10831.7. Samples: 334058496. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:55:10,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 19:55:12,052][1653645] Updated weights for policy 0, policy_version 652414 (0.0014) [2024-06-15 19:55:15,962][1648982] Fps is (10 sec: 39304.8, 60 sec: 43141.6, 300 sec: 44430.5). Total num frames: 1336147968. Throughput: 0: 10819.3. Samples: 334125056. Policy #0 lag: (min: 11.0, avg: 116.5, max: 267.0) [2024-06-15 19:55:15,963][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:55:19,559][1653645] Updated weights for policy 0, policy_version 652498 (0.0014) [2024-06-15 19:55:20,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.7, 300 sec: 44431.2). Total num frames: 1336410112. Throughput: 0: 10934.0. Samples: 334152704. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:55:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 19:55:21,617][1653645] Updated weights for policy 0, policy_version 652578 (0.0014) [2024-06-15 19:55:23,928][1653645] Updated weights for policy 0, policy_version 652644 (0.0012) [2024-06-15 19:55:25,958][1648982] Fps is (10 sec: 52450.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1336672256. Throughput: 0: 10820.3. Samples: 334213632. 
Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:55:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:55:29,792][1653645] Updated weights for policy 0, policy_version 652704 (0.0020) [2024-06-15 19:55:30,904][1653645] Updated weights for policy 0, policy_version 652752 (0.0011) [2024-06-15 19:55:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 1336836096. Throughput: 0: 11025.1. Samples: 334286848. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:55:30,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 19:55:31,892][1653645] Updated weights for policy 0, policy_version 652800 (0.0012) [2024-06-15 19:55:32,291][1651596] Signal inference workers to stop experience collection... (33950 times) [2024-06-15 19:55:32,323][1653645] InferenceWorker_p0-w0: stopping experience collection (33950 times) [2024-06-15 19:55:32,563][1651596] Signal inference workers to resume experience collection... (33950 times) [2024-06-15 19:55:32,563][1653645] InferenceWorker_p0-w0: resuming experience collection (33950 times) [2024-06-15 19:55:33,643][1653645] Updated weights for policy 0, policy_version 652863 (0.0036) [2024-06-15 19:55:35,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1337131008. Throughput: 0: 11173.0. Samples: 334318592. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:55:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:55:36,257][1653645] Updated weights for policy 0, policy_version 652923 (0.0013) [2024-06-15 19:55:40,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1337229312. Throughput: 0: 11264.1. Samples: 334390784. 
Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:55:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:55:42,220][1653645] Updated weights for policy 0, policy_version 653008 (0.0016) [2024-06-15 19:55:43,188][1653645] Updated weights for policy 0, policy_version 653056 (0.0010) [2024-06-15 19:55:44,889][1653645] Updated weights for policy 0, policy_version 653115 (0.0014) [2024-06-15 19:55:45,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 45329.1, 300 sec: 44653.3). Total num frames: 1337589760. Throughput: 0: 11218.4. Samples: 334454272. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:55:45,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 19:55:48,411][1653645] Updated weights for policy 0, policy_version 653182 (0.0048) [2024-06-15 19:55:50,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1337720832. Throughput: 0: 11127.4. Samples: 334488576. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:55:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:55:53,886][1653645] Updated weights for policy 0, policy_version 653249 (0.0013) [2024-06-15 19:55:55,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 45875.4, 300 sec: 44764.4). Total num frames: 1337982976. Throughput: 0: 11036.5. Samples: 334555136. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:55:55,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:55:56,061][1653645] Updated weights for policy 0, policy_version 653314 (0.0010) [2024-06-15 19:55:58,412][1653645] Updated weights for policy 0, policy_version 653381 (0.0135) [2024-06-15 19:55:59,699][1653645] Updated weights for policy 0, policy_version 653440 (0.0015) [2024-06-15 19:56:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1338245120. Throughput: 0: 10935.1. Samples: 334617088. 
Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:56:05,774][1653645] Updated weights for policy 0, policy_version 653511 (0.0124) [2024-06-15 19:56:05,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 44236.6, 300 sec: 44542.7). Total num frames: 1338408960. Throughput: 0: 11241.2. Samples: 334658560. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:56:07,701][1653645] Updated weights for policy 0, policy_version 653570 (0.0013) [2024-06-15 19:56:09,271][1653645] Updated weights for policy 0, policy_version 653631 (0.0120) [2024-06-15 19:56:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1338671104. Throughput: 0: 11150.2. Samples: 334715392. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:56:11,586][1653645] Updated weights for policy 0, policy_version 653686 (0.0095) [2024-06-15 19:56:15,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 43693.7, 300 sec: 44431.2). Total num frames: 1338769408. Throughput: 0: 11207.1. Samples: 334791168. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:15,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:56:17,708][1653645] Updated weights for policy 0, policy_version 653744 (0.0010) [2024-06-15 19:56:19,148][1651596] Signal inference workers to stop experience collection... (34000 times) [2024-06-15 19:56:19,202][1653645] InferenceWorker_p0-w0: stopping experience collection (34000 times) [2024-06-15 19:56:19,481][1651596] Signal inference workers to resume experience collection... 
(34000 times) [2024-06-15 19:56:19,482][1653645] InferenceWorker_p0-w0: resuming experience collection (34000 times) [2024-06-15 19:56:20,203][1653645] Updated weights for policy 0, policy_version 653828 (0.0011) [2024-06-15 19:56:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 1339097088. Throughput: 0: 11138.8. Samples: 334819840. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:20,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 19:56:23,575][1653645] Updated weights for policy 0, policy_version 653889 (0.0012) [2024-06-15 19:56:24,512][1653645] Updated weights for policy 0, policy_version 653942 (0.0013) [2024-06-15 19:56:25,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1339293696. Throughput: 0: 10740.6. Samples: 334874112. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:56:29,510][1653645] Updated weights for policy 0, policy_version 654010 (0.0120) [2024-06-15 19:56:30,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.6, 300 sec: 44542.2). Total num frames: 1339457536. Throughput: 0: 10979.6. Samples: 334948352. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:56:31,536][1653645] Updated weights for policy 0, policy_version 654064 (0.0011) [2024-06-15 19:56:33,183][1653645] Updated weights for policy 0, policy_version 654137 (0.0018) [2024-06-15 19:56:35,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1339752448. Throughput: 0: 10808.9. Samples: 334974976. 
Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:56:36,196][1653645] Updated weights for policy 0, policy_version 654198 (0.0014) [2024-06-15 19:56:40,785][1653645] Updated weights for policy 0, policy_version 654242 (0.0012) [2024-06-15 19:56:40,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.6, 300 sec: 44320.1). Total num frames: 1339883520. Throughput: 0: 10922.6. Samples: 335046656. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:40,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:56:41,389][1653645] Updated weights for policy 0, policy_version 654272 (0.0011) [2024-06-15 19:56:43,571][1653645] Updated weights for policy 0, policy_version 654323 (0.0011) [2024-06-15 19:56:44,828][1653645] Updated weights for policy 0, policy_version 654394 (0.0015) [2024-06-15 19:56:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1340211200. Throughput: 0: 11002.3. Samples: 335112192. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:56:47,920][1653645] Updated weights for policy 0, policy_version 654461 (0.0013) [2024-06-15 19:56:50,960][1648982] Fps is (10 sec: 45876.0, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1340342272. Throughput: 0: 10729.3. Samples: 335141376. Policy #0 lag: (min: 36.0, avg: 131.4, max: 292.0) [2024-06-15 19:56:50,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:56:53,217][1653645] Updated weights for policy 0, policy_version 654520 (0.0100) [2024-06-15 19:56:55,500][1653645] Updated weights for policy 0, policy_version 654576 (0.0012) [2024-06-15 19:56:55,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 43690.4, 300 sec: 44209.0). Total num frames: 1340604416. Throughput: 0: 11070.5. Samples: 335213568. 
Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:56:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:56:56,259][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000654608_1340637184.pth... [2024-06-15 19:56:56,401][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000649456_1330085888.pth [2024-06-15 19:56:57,054][1653645] Updated weights for policy 0, policy_version 654641 (0.0012) [2024-06-15 19:56:59,589][1653645] Updated weights for policy 0, policy_version 654674 (0.0028) [2024-06-15 19:57:00,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1340866560. Throughput: 0: 10786.2. Samples: 335276544. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:57:02,878][1653645] Updated weights for policy 0, policy_version 654725 (0.0047) [2024-06-15 19:57:04,311][1653645] Updated weights for policy 0, policy_version 654777 (0.0012) [2024-06-15 19:57:05,843][1651596] Signal inference workers to stop experience collection... (34050 times) [2024-06-15 19:57:05,930][1653645] InferenceWorker_p0-w0: stopping experience collection (34050 times) [2024-06-15 19:57:05,958][1648982] Fps is (10 sec: 39323.3, 60 sec: 43144.7, 300 sec: 44098.0). Total num frames: 1340997632. Throughput: 0: 11013.7. Samples: 335315456. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 19:57:06,094][1651596] Signal inference workers to resume experience collection... 
(34050 times) [2024-06-15 19:57:06,094][1653645] InferenceWorker_p0-w0: resuming experience collection (34050 times) [2024-06-15 19:57:06,840][1653645] Updated weights for policy 0, policy_version 654832 (0.0012) [2024-06-15 19:57:08,157][1653645] Updated weights for policy 0, policy_version 654896 (0.0014) [2024-06-15 19:57:10,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1341259776. Throughput: 0: 11298.1. Samples: 335382528. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:57:11,870][1653645] Updated weights for policy 0, policy_version 654960 (0.0014) [2024-06-15 19:57:15,640][1653645] Updated weights for policy 0, policy_version 655040 (0.0014) [2024-06-15 19:57:15,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1341521920. Throughput: 0: 11002.3. Samples: 335443456. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:57:18,697][1653645] Updated weights for policy 0, policy_version 655091 (0.0013) [2024-06-15 19:57:20,096][1653645] Updated weights for policy 0, policy_version 655161 (0.0013) [2024-06-15 19:57:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 1341784064. Throughput: 0: 11343.6. Samples: 335485440. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:57:23,131][1653645] Updated weights for policy 0, policy_version 655232 (0.0052) [2024-06-15 19:57:25,959][1648982] Fps is (10 sec: 39316.3, 60 sec: 43689.6, 300 sec: 44097.8). Total num frames: 1341915136. Throughput: 0: 11161.3. Samples: 335548928. 
Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:25,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:57:29,450][1653645] Updated weights for policy 0, policy_version 655298 (0.0015) [2024-06-15 19:57:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.2, 300 sec: 43764.7). Total num frames: 1342177280. Throughput: 0: 11218.5. Samples: 335617024. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:57:31,010][1653645] Updated weights for policy 0, policy_version 655361 (0.0012) [2024-06-15 19:57:32,270][1653645] Updated weights for policy 0, policy_version 655415 (0.0044) [2024-06-15 19:57:35,367][1653645] Updated weights for policy 0, policy_version 655485 (0.0014) [2024-06-15 19:57:35,958][1648982] Fps is (10 sec: 52436.0, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1342439424. Throughput: 0: 11229.9. Samples: 335646720. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:57:40,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 1342570496. Throughput: 0: 11093.4. Samples: 335712768. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 19:57:41,849][1653645] Updated weights for policy 0, policy_version 655584 (0.0019) [2024-06-15 19:57:43,530][1653645] Updated weights for policy 0, policy_version 655648 (0.0011) [2024-06-15 19:57:45,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1342832640. Throughput: 0: 11150.2. Samples: 335778304. 
Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 19:57:46,348][1653645] Updated weights for policy 0, policy_version 655697 (0.0012) [2024-06-15 19:57:47,506][1653645] Updated weights for policy 0, policy_version 655744 (0.0017) [2024-06-15 19:57:50,593][1651596] Signal inference workers to stop experience collection... (34100 times) [2024-06-15 19:57:50,632][1653645] InferenceWorker_p0-w0: stopping experience collection (34100 times) [2024-06-15 19:57:50,851][1651596] Signal inference workers to resume experience collection... (34100 times) [2024-06-15 19:57:50,851][1653645] InferenceWorker_p0-w0: resuming experience collection (34100 times) [2024-06-15 19:57:50,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1343029248. Throughput: 0: 10956.8. Samples: 335808512. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:50,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:57:53,503][1653645] Updated weights for policy 0, policy_version 655824 (0.0015) [2024-06-15 19:57:55,847][1653645] Updated weights for policy 0, policy_version 655907 (0.0013) [2024-06-15 19:57:55,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 44783.2, 300 sec: 43764.7). Total num frames: 1343291392. Throughput: 0: 10979.6. Samples: 335876608. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:57:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:57:58,393][1653645] Updated weights for policy 0, policy_version 655954 (0.0013) [2024-06-15 19:58:00,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1343488000. Throughput: 0: 11002.3. Samples: 335938560. 
Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:58:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 19:58:02,978][1653645] Updated weights for policy 0, policy_version 656032 (0.0014) [2024-06-15 19:58:05,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 44236.5, 300 sec: 43653.6). Total num frames: 1343651840. Throughput: 0: 10843.0. Samples: 335973376. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:58:05,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 19:58:06,038][1653645] Updated weights for policy 0, policy_version 656085 (0.0016) [2024-06-15 19:58:08,080][1653645] Updated weights for policy 0, policy_version 656182 (0.0013) [2024-06-15 19:58:10,171][1653645] Updated weights for policy 0, policy_version 656240 (0.0013) [2024-06-15 19:58:10,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45875.2, 300 sec: 44432.4). Total num frames: 1344012288. Throughput: 0: 10945.7. Samples: 336041472. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:58:10,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 19:58:14,071][1653645] Updated weights for policy 0, policy_version 656272 (0.0154) [2024-06-15 19:58:15,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.5, 300 sec: 43987.5). Total num frames: 1344143360. Throughput: 0: 10956.7. Samples: 336110080. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:58:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:58:17,700][1653645] Updated weights for policy 0, policy_version 656324 (0.0011) [2024-06-15 19:58:19,622][1653645] Updated weights for policy 0, policy_version 656400 (0.0152) [2024-06-15 19:58:20,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1344405504. Throughput: 0: 11093.3. Samples: 336145920. 
Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:58:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 19:58:21,599][1653645] Updated weights for policy 0, policy_version 656480 (0.0013) [2024-06-15 19:58:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43691.5, 300 sec: 43986.8). Total num frames: 1344536576. Throughput: 0: 10945.4. Samples: 336205312. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:58:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:58:26,559][1653645] Updated weights for policy 0, policy_version 656531 (0.0013) [2024-06-15 19:58:30,360][1653645] Updated weights for policy 0, policy_version 656612 (0.0069) [2024-06-15 19:58:30,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 1344765952. Throughput: 0: 11059.2. Samples: 336275968. Policy #0 lag: (min: 15.0, avg: 110.7, max: 271.0) [2024-06-15 19:58:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 19:58:32,012][1653645] Updated weights for policy 0, policy_version 656673 (0.0015) [2024-06-15 19:58:33,655][1651596] Signal inference workers to stop experience collection... (34150 times) [2024-06-15 19:58:33,717][1653645] InferenceWorker_p0-w0: stopping experience collection (34150 times) [2024-06-15 19:58:33,930][1651596] Signal inference workers to resume experience collection... (34150 times) [2024-06-15 19:58:33,931][1653645] InferenceWorker_p0-w0: resuming experience collection (34150 times) [2024-06-15 19:58:33,933][1653645] Updated weights for policy 0, policy_version 656752 (0.0014) [2024-06-15 19:58:35,959][1648982] Fps is (10 sec: 52430.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1345060864. Throughput: 0: 10911.3. Samples: 336299520. 
Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:58:35,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:58:38,629][1653645] Updated weights for policy 0, policy_version 656800 (0.0013) [2024-06-15 19:58:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.8, 300 sec: 43653.6). Total num frames: 1345191936. Throughput: 0: 11082.0. Samples: 336375296. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:58:40,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 19:58:42,072][1653645] Updated weights for policy 0, policy_version 656852 (0.0024) [2024-06-15 19:58:43,965][1653645] Updated weights for policy 0, policy_version 656928 (0.0013) [2024-06-15 19:58:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1345519616. Throughput: 0: 10968.2. Samples: 336432128. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:58:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:58:46,210][1653645] Updated weights for policy 0, policy_version 657017 (0.0013) [2024-06-15 19:58:50,950][1653645] Updated weights for policy 0, policy_version 657080 (0.0013) [2024-06-15 19:58:50,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1345683456. Throughput: 0: 11047.9. Samples: 336470528. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:58:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:58:54,345][1653645] Updated weights for policy 0, policy_version 657136 (0.0015) [2024-06-15 19:58:55,854][1653645] Updated weights for policy 0, policy_version 657185 (0.0012) [2024-06-15 19:58:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 1345912832. Throughput: 0: 11036.5. Samples: 336538112. 
Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:58:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:58:56,530][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000657216_1345978368.pth... [2024-06-15 19:58:56,622][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000651968_1335230464.pth [2024-06-15 19:58:57,503][1653645] Updated weights for policy 0, policy_version 657249 (0.0013) [2024-06-15 19:59:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 43987.0). Total num frames: 1346109440. Throughput: 0: 11047.9. Samples: 336607232. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 19:59:02,036][1653645] Updated weights for policy 0, policy_version 657296 (0.0103) [2024-06-15 19:59:05,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1346273280. Throughput: 0: 10979.5. Samples: 336640000. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:05,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 19:59:06,497][1653645] Updated weights for policy 0, policy_version 657377 (0.0013) [2024-06-15 19:59:08,886][1653645] Updated weights for policy 0, policy_version 657457 (0.0013) [2024-06-15 19:59:10,544][1653645] Updated weights for policy 0, policy_version 657528 (0.0031) [2024-06-15 19:59:10,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1346633728. Throughput: 0: 10877.2. Samples: 336694784. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:10,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 19:59:14,677][1653645] Updated weights for policy 0, policy_version 657588 (0.0012) [2024-06-15 19:59:15,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 1346764800. Throughput: 0: 10877.1. 
Samples: 336765440. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 19:59:19,384][1653645] Updated weights for policy 0, policy_version 657648 (0.0019) [2024-06-15 19:59:20,023][1651596] Signal inference workers to stop experience collection... (34200 times) [2024-06-15 19:59:20,100][1653645] InferenceWorker_p0-w0: stopping experience collection (34200 times) [2024-06-15 19:59:20,282][1651596] Signal inference workers to resume experience collection... (34200 times) [2024-06-15 19:59:20,283][1653645] InferenceWorker_p0-w0: resuming experience collection (34200 times) [2024-06-15 19:59:20,959][1648982] Fps is (10 sec: 32768.4, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 1346961408. Throughput: 0: 11138.8. Samples: 336800768. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:20,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:59:20,971][1653645] Updated weights for policy 0, policy_version 657712 (0.0012) [2024-06-15 19:59:22,339][1653645] Updated weights for policy 0, policy_version 657776 (0.0012) [2024-06-15 19:59:25,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 1347223552. Throughput: 0: 10877.1. Samples: 336864768. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 19:59:26,212][1653645] Updated weights for policy 0, policy_version 657840 (0.0012) [2024-06-15 19:59:30,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1347387392. Throughput: 0: 11059.2. Samples: 336929792. 
Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 19:59:31,108][1653645] Updated weights for policy 0, policy_version 657911 (0.0092) [2024-06-15 19:59:32,475][1653645] Updated weights for policy 0, policy_version 657975 (0.0010) [2024-06-15 19:59:33,102][1653645] Updated weights for policy 0, policy_version 658000 (0.0014) [2024-06-15 19:59:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1347682304. Throughput: 0: 11036.4. Samples: 336967168. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:35,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:59:36,870][1653645] Updated weights for policy 0, policy_version 658051 (0.0013) [2024-06-15 19:59:40,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1347813376. Throughput: 0: 11047.8. Samples: 337035264. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 19:59:41,169][1653645] Updated weights for policy 0, policy_version 658115 (0.0015) [2024-06-15 19:59:42,950][1653645] Updated weights for policy 0, policy_version 658192 (0.0041) [2024-06-15 19:59:44,068][1653645] Updated weights for policy 0, policy_version 658239 (0.0035) [2024-06-15 19:59:45,834][1653645] Updated weights for policy 0, policy_version 658301 (0.0012) [2024-06-15 19:59:45,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1348206592. Throughput: 0: 10888.5. Samples: 337097216. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 19:59:50,234][1653645] Updated weights for policy 0, policy_version 658362 (0.0013) [2024-06-15 19:59:50,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1348337664. 
Throughput: 0: 11082.0. Samples: 337138688. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 19:59:53,386][1653645] Updated weights for policy 0, policy_version 658401 (0.0012) [2024-06-15 19:59:55,638][1653645] Updated weights for policy 0, policy_version 658489 (0.0012) [2024-06-15 19:59:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1348599808. Throughput: 0: 11184.4. Samples: 337198080. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 19:59:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 19:59:57,403][1653645] Updated weights for policy 0, policy_version 658531 (0.0035) [2024-06-15 20:00:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1348730880. Throughput: 0: 11059.2. Samples: 337263104. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 20:00:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:00:02,268][1653645] Updated weights for policy 0, policy_version 658576 (0.0013) [2024-06-15 20:00:03,419][1653645] Updated weights for policy 0, policy_version 658624 (0.0014) [2024-06-15 20:00:04,527][1651596] Signal inference workers to stop experience collection... (34250 times) [2024-06-15 20:00:04,611][1653645] InferenceWorker_p0-w0: stopping experience collection (34250 times) [2024-06-15 20:00:04,888][1651596] Signal inference workers to resume experience collection... (34250 times) [2024-06-15 20:00:04,889][1653645] InferenceWorker_p0-w0: resuming experience collection (34250 times) [2024-06-15 20:00:05,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 1348960256. Throughput: 0: 10968.2. Samples: 337294336. 
Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 20:00:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:00:06,320][1653645] Updated weights for policy 0, policy_version 658690 (0.0015) [2024-06-15 20:00:07,422][1653645] Updated weights for policy 0, policy_version 658750 (0.0016) [2024-06-15 20:00:09,512][1653645] Updated weights for policy 0, policy_version 658812 (0.0020) [2024-06-15 20:00:10,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.5, 300 sec: 44431.8). Total num frames: 1349255168. Throughput: 0: 10934.0. Samples: 337356800. Policy #0 lag: (min: 111.0, avg: 186.9, max: 367.0) [2024-06-15 20:00:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:00:15,745][1653645] Updated weights for policy 0, policy_version 658877 (0.0039) [2024-06-15 20:00:15,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1349386240. Throughput: 0: 11025.1. Samples: 337425920. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:00:18,597][1653645] Updated weights for policy 0, policy_version 658960 (0.0133) [2024-06-15 20:00:19,550][1653645] Updated weights for policy 0, policy_version 659008 (0.0010) [2024-06-15 20:00:20,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1349648384. Throughput: 0: 10877.2. Samples: 337456640. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:00:21,501][1653645] Updated weights for policy 0, policy_version 659045 (0.0016) [2024-06-15 20:00:25,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 1349812224. Throughput: 0: 11104.7. Samples: 337534976. 
Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:00:26,569][1653645] Updated weights for policy 0, policy_version 659105 (0.0012) [2024-06-15 20:00:28,430][1653645] Updated weights for policy 0, policy_version 659154 (0.0046) [2024-06-15 20:00:30,365][1653645] Updated weights for policy 0, policy_version 659233 (0.0027) [2024-06-15 20:00:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 46421.4, 300 sec: 44209.0). Total num frames: 1350172672. Throughput: 0: 11059.2. Samples: 337594880. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:00:33,224][1653645] Updated weights for policy 0, policy_version 659324 (0.0013) [2024-06-15 20:00:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1350303744. Throughput: 0: 10865.8. Samples: 337627648. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:00:39,585][1653645] Updated weights for policy 0, policy_version 659393 (0.0015) [2024-06-15 20:00:40,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1350565888. Throughput: 0: 11104.7. Samples: 337697792. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:00:41,243][1653645] Updated weights for policy 0, policy_version 659472 (0.0012) [2024-06-15 20:00:41,974][1653645] Updated weights for policy 0, policy_version 659513 (0.0019) [2024-06-15 20:00:44,121][1653645] Updated weights for policy 0, policy_version 659554 (0.0013) [2024-06-15 20:00:45,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1350828032. Throughput: 0: 11241.2. Samples: 337768960. 
Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:45,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:00:48,468][1653645] Updated weights for policy 0, policy_version 659588 (0.0012) [2024-06-15 20:00:48,888][1651596] Signal inference workers to stop experience collection... (34300 times) [2024-06-15 20:00:48,927][1653645] InferenceWorker_p0-w0: stopping experience collection (34300 times) [2024-06-15 20:00:49,086][1651596] Signal inference workers to resume experience collection... (34300 times) [2024-06-15 20:00:49,087][1653645] InferenceWorker_p0-w0: resuming experience collection (34300 times) [2024-06-15 20:00:50,620][1653645] Updated weights for policy 0, policy_version 659664 (0.0013) [2024-06-15 20:00:50,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1350991872. Throughput: 0: 11457.4. Samples: 337809920. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:00:52,945][1653645] Updated weights for policy 0, policy_version 659767 (0.0014) [2024-06-15 20:00:55,745][1653645] Updated weights for policy 0, policy_version 659810 (0.0026) [2024-06-15 20:00:55,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 45328.8, 300 sec: 44320.1). Total num frames: 1351319552. Throughput: 0: 11457.4. Samples: 337872384. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:00:55,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:00:56,336][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000659840_1351352320.pth... [2024-06-15 20:00:56,394][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000654608_1340637184.pth [2024-06-15 20:01:00,037][1653645] Updated weights for policy 0, policy_version 659856 (0.0013) [2024-06-15 20:01:00,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45329.1, 300 sec: 44209.1). Total num frames: 1351450624. 
Throughput: 0: 11628.1. Samples: 337949184. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:01:02,040][1653645] Updated weights for policy 0, policy_version 659923 (0.0012) [2024-06-15 20:01:04,533][1653645] Updated weights for policy 0, policy_version 660003 (0.0011) [2024-06-15 20:01:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 46421.1, 300 sec: 44320.1). Total num frames: 1351745536. Throughput: 0: 11491.5. Samples: 337973760. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:05,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:01:07,981][1653645] Updated weights for policy 0, policy_version 660084 (0.0015) [2024-06-15 20:01:10,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1351876608. Throughput: 0: 11218.5. Samples: 338039808. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:01:13,044][1653645] Updated weights for policy 0, policy_version 660152 (0.0013) [2024-06-15 20:01:14,013][1653645] Updated weights for policy 0, policy_version 660192 (0.0012) [2024-06-15 20:01:15,958][1648982] Fps is (10 sec: 42600.2, 60 sec: 46421.4, 300 sec: 44320.1). Total num frames: 1352171520. Throughput: 0: 11355.0. Samples: 338105856. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:01:15,960][1653645] Updated weights for policy 0, policy_version 660244 (0.0013) [2024-06-15 20:01:16,663][1653645] Updated weights for policy 0, policy_version 660288 (0.0012) [2024-06-15 20:01:19,503][1653645] Updated weights for policy 0, policy_version 660342 (0.0014) [2024-06-15 20:01:20,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45875.0, 300 sec: 44431.2). Total num frames: 1352400896. Throughput: 0: 11514.3. Samples: 338145792. 
Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:20,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:01:24,238][1653645] Updated weights for policy 0, policy_version 660390 (0.0030) [2024-06-15 20:01:25,674][1653645] Updated weights for policy 0, policy_version 660470 (0.0013) [2024-06-15 20:01:25,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 47513.5, 300 sec: 44764.4). Total num frames: 1352663040. Throughput: 0: 11411.9. Samples: 338211328. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:01:28,369][1653645] Updated weights for policy 0, policy_version 660533 (0.0013) [2024-06-15 20:01:30,299][1653645] Updated weights for policy 0, policy_version 660561 (0.0012) [2024-06-15 20:01:30,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1352859648. Throughput: 0: 11446.1. Samples: 338284032. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:01:34,368][1651596] Signal inference workers to stop experience collection... (34350 times) [2024-06-15 20:01:34,389][1653645] Updated weights for policy 0, policy_version 660609 (0.0014) [2024-06-15 20:01:34,447][1653645] InferenceWorker_p0-w0: stopping experience collection (34350 times) [2024-06-15 20:01:34,723][1651596] Signal inference workers to resume experience collection... (34350 times) [2024-06-15 20:01:34,727][1653645] InferenceWorker_p0-w0: resuming experience collection (34350 times) [2024-06-15 20:01:35,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.1, 300 sec: 44653.4). Total num frames: 1353056256. Throughput: 0: 11298.1. Samples: 338318336. 
Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:35,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:01:36,312][1653645] Updated weights for policy 0, policy_version 660673 (0.0013) [2024-06-15 20:01:38,675][1653645] Updated weights for policy 0, policy_version 660737 (0.0015) [2024-06-15 20:01:39,886][1653645] Updated weights for policy 0, policy_version 660799 (0.0013) [2024-06-15 20:01:40,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 45875.1, 300 sec: 44431.1). Total num frames: 1353318400. Throughput: 0: 11320.9. Samples: 338381824. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:40,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:01:43,370][1653645] Updated weights for policy 0, policy_version 660863 (0.0046) [2024-06-15 20:01:45,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1353449472. Throughput: 0: 11116.1. Samples: 338449408. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:45,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:01:47,938][1653645] Updated weights for policy 0, policy_version 660944 (0.0013) [2024-06-15 20:01:50,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1353711616. Throughput: 0: 11161.7. Samples: 338476032. Policy #0 lag: (min: 14.0, avg: 92.3, max: 270.0) [2024-06-15 20:01:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:01:51,592][1653645] Updated weights for policy 0, policy_version 661024 (0.0028) [2024-06-15 20:01:54,060][1653645] Updated weights for policy 0, policy_version 661058 (0.0011) [2024-06-15 20:01:55,395][1653645] Updated weights for policy 0, policy_version 661117 (0.0037) [2024-06-15 20:01:55,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 44236.8, 300 sec: 44431.1). Total num frames: 1353973760. Throughput: 0: 11241.2. Samples: 338545664. 
Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:01:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:01:58,696][1653645] Updated weights for policy 0, policy_version 661168 (0.0016) [2024-06-15 20:01:59,748][1653645] Updated weights for policy 0, policy_version 661216 (0.0014) [2024-06-15 20:02:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 46421.4, 300 sec: 44875.5). Total num frames: 1354235904. Throughput: 0: 11320.9. Samples: 338615296. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:02:02,981][1653645] Updated weights for policy 0, policy_version 661280 (0.0013) [2024-06-15 20:02:05,719][1653645] Updated weights for policy 0, policy_version 661314 (0.0021) [2024-06-15 20:02:05,958][1648982] Fps is (10 sec: 42600.1, 60 sec: 44237.1, 300 sec: 44542.3). Total num frames: 1354399744. Throughput: 0: 11173.0. Samples: 338648576. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:02:10,122][1653645] Updated weights for policy 0, policy_version 661392 (0.0012) [2024-06-15 20:02:10,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1354596352. Throughput: 0: 11252.6. Samples: 338717696. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:10,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:02:11,690][1653645] Updated weights for policy 0, policy_version 661472 (0.0013) [2024-06-15 20:02:14,870][1653645] Updated weights for policy 0, policy_version 661552 (0.0159) [2024-06-15 20:02:15,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1354891264. Throughput: 0: 10934.0. Samples: 338776064. 
Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:02:19,440][1653645] Updated weights for policy 0, policy_version 661616 (0.0014) [2024-06-15 20:02:20,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.8, 300 sec: 44431.4). Total num frames: 1355022336. Throughput: 0: 11093.4. Samples: 338817536. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:20,958][1648982] Avg episode reward: [(0, '37.050')] [2024-06-15 20:02:22,162][1651596] Signal inference workers to stop experience collection... (34400 times) [2024-06-15 20:02:22,266][1653645] InferenceWorker_p0-w0: stopping experience collection (34400 times) [2024-06-15 20:02:22,373][1651596] Signal inference workers to resume experience collection... (34400 times) [2024-06-15 20:02:22,374][1653645] InferenceWorker_p0-w0: resuming experience collection (34400 times) [2024-06-15 20:02:22,872][1653645] Updated weights for policy 0, policy_version 661685 (0.0101) [2024-06-15 20:02:24,320][1653645] Updated weights for policy 0, policy_version 661750 (0.0013) [2024-06-15 20:02:25,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1355284480. Throughput: 0: 11036.5. Samples: 338878464. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:02:27,230][1653645] Updated weights for policy 0, policy_version 661808 (0.0012) [2024-06-15 20:02:30,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 44098.0). Total num frames: 1355448320. Throughput: 0: 11127.5. Samples: 338950144. 
Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:02:31,303][1653645] Updated weights for policy 0, policy_version 661859 (0.0013) [2024-06-15 20:02:34,076][1653645] Updated weights for policy 0, policy_version 661936 (0.0034) [2024-06-15 20:02:35,750][1653645] Updated weights for policy 0, policy_version 662009 (0.0012) [2024-06-15 20:02:35,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 1355808768. Throughput: 0: 11252.6. Samples: 338982400. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:35,959][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 20:02:38,400][1653645] Updated weights for policy 0, policy_version 662053 (0.0012) [2024-06-15 20:02:40,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1355939840. Throughput: 0: 11082.0. Samples: 339044352. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:02:42,750][1653645] Updated weights for policy 0, policy_version 662112 (0.0102) [2024-06-15 20:02:43,156][1653645] Updated weights for policy 0, policy_version 662144 (0.0033) [2024-06-15 20:02:45,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 45875.3, 300 sec: 44653.3). Total num frames: 1356201984. Throughput: 0: 11252.6. Samples: 339121664. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:02:46,142][1653645] Updated weights for policy 0, policy_version 662224 (0.0020) [2024-06-15 20:02:47,248][1653645] Updated weights for policy 0, policy_version 662269 (0.0013) [2024-06-15 20:02:49,920][1653645] Updated weights for policy 0, policy_version 662324 (0.0018) [2024-06-15 20:02:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.3, 300 sec: 44653.4). Total num frames: 1356464128. 
Throughput: 0: 11286.8. Samples: 339156480. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:02:55,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.7, 300 sec: 44431.1). Total num frames: 1356595200. Throughput: 0: 11252.5. Samples: 339224064. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:02:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:02:56,120][1653645] Updated weights for policy 0, policy_version 662402 (0.0015) [2024-06-15 20:02:56,269][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000662416_1356627968.pth... [2024-06-15 20:02:56,362][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000657216_1345978368.pth [2024-06-15 20:02:58,523][1653645] Updated weights for policy 0, policy_version 662512 (0.0030) [2024-06-15 20:03:00,460][1653645] Updated weights for policy 0, policy_version 662547 (0.0013) [2024-06-15 20:03:00,958][1648982] Fps is (10 sec: 45873.5, 60 sec: 44782.7, 300 sec: 44986.6). Total num frames: 1356922880. Throughput: 0: 11434.6. Samples: 339290624. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:03:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:03:01,388][1653645] Updated weights for policy 0, policy_version 662590 (0.0012) [2024-06-15 20:03:05,958][1648982] Fps is (10 sec: 42599.8, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1357021184. Throughput: 0: 11309.5. Samples: 339326464. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:03:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:03:06,141][1651596] Signal inference workers to stop experience collection... (34450 times) [2024-06-15 20:03:06,195][1653645] InferenceWorker_p0-w0: stopping experience collection (34450 times) [2024-06-15 20:03:06,374][1651596] Signal inference workers to resume experience collection... 
(34450 times) [2024-06-15 20:03:06,375][1653645] InferenceWorker_p0-w0: resuming experience collection (34450 times) [2024-06-15 20:03:06,430][1653645] Updated weights for policy 0, policy_version 662644 (0.0012) [2024-06-15 20:03:08,919][1653645] Updated weights for policy 0, policy_version 662722 (0.0095) [2024-06-15 20:03:10,105][1653645] Updated weights for policy 0, policy_version 662775 (0.0019) [2024-06-15 20:03:10,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 46421.1, 300 sec: 44875.5). Total num frames: 1357381632. Throughput: 0: 11389.1. Samples: 339390976. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:03:10,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:03:11,900][1653645] Updated weights for policy 0, policy_version 662816 (0.0011) [2024-06-15 20:03:15,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1357512704. Throughput: 0: 11343.6. Samples: 339460608. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:03:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:03:17,623][1653645] Updated weights for policy 0, policy_version 662851 (0.0063) [2024-06-15 20:03:19,325][1653645] Updated weights for policy 0, policy_version 662918 (0.0015) [2024-06-15 20:03:20,958][1648982] Fps is (10 sec: 42600.0, 60 sec: 46421.4, 300 sec: 44986.6). Total num frames: 1357807616. Throughput: 0: 11332.4. Samples: 339492352. Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:03:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:03:21,374][1653645] Updated weights for policy 0, policy_version 663008 (0.0202) [2024-06-15 20:03:24,205][1653645] Updated weights for policy 0, policy_version 663056 (0.0030) [2024-06-15 20:03:25,970][1648982] Fps is (10 sec: 52363.8, 60 sec: 45865.8, 300 sec: 44984.7). Total num frames: 1358036992. Throughput: 0: 11329.1. Samples: 339554304. 
Policy #0 lag: (min: 54.0, avg: 172.4, max: 310.0) [2024-06-15 20:03:25,971][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:03:30,229][1653645] Updated weights for policy 0, policy_version 663109 (0.0013) [2024-06-15 20:03:30,958][1648982] Fps is (10 sec: 29490.9, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 1358102528. Throughput: 0: 11116.1. Samples: 339621888. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:03:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:03:32,713][1653645] Updated weights for policy 0, policy_version 663216 (0.0106) [2024-06-15 20:03:34,297][1653645] Updated weights for policy 0, policy_version 663264 (0.0011) [2024-06-15 20:03:35,958][1648982] Fps is (10 sec: 39369.3, 60 sec: 43690.6, 300 sec: 44875.4). Total num frames: 1358430208. Throughput: 0: 10865.7. Samples: 339645440. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:03:35,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:03:37,124][1653645] Updated weights for policy 0, policy_version 663319 (0.0011) [2024-06-15 20:03:37,941][1653645] Updated weights for policy 0, policy_version 663359 (0.0012) [2024-06-15 20:03:40,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1358561280. Throughput: 0: 10854.5. Samples: 339712512. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:03:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:03:43,212][1653645] Updated weights for policy 0, policy_version 663410 (0.0112) [2024-06-15 20:03:45,057][1653645] Updated weights for policy 0, policy_version 663480 (0.0082) [2024-06-15 20:03:45,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 44236.7, 300 sec: 44653.3). Total num frames: 1358856192. Throughput: 0: 10808.9. Samples: 339777024. 
Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:03:45,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:03:46,437][1653645] Updated weights for policy 0, policy_version 663536 (0.0013) [2024-06-15 20:03:49,317][1653645] Updated weights for policy 0, policy_version 663574 (0.0012) [2024-06-15 20:03:49,575][1651596] Signal inference workers to stop experience collection... (34500 times) [2024-06-15 20:03:49,615][1653645] InferenceWorker_p0-w0: stopping experience collection (34500 times) [2024-06-15 20:03:49,811][1651596] Signal inference workers to resume experience collection... (34500 times) [2024-06-15 20:03:49,811][1653645] InferenceWorker_p0-w0: resuming experience collection (34500 times) [2024-06-15 20:03:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44653.4). Total num frames: 1359085568. Throughput: 0: 10774.7. Samples: 339811328. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:03:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:03:54,866][1653645] Updated weights for policy 0, policy_version 663648 (0.0012) [2024-06-15 20:03:55,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1359216640. Throughput: 0: 10899.9. Samples: 339881472. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:03:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:03:57,356][1653645] Updated weights for policy 0, policy_version 663734 (0.0119) [2024-06-15 20:03:59,011][1653645] Updated weights for policy 0, policy_version 663803 (0.0020) [2024-06-15 20:04:00,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 42598.5, 300 sec: 44764.4). Total num frames: 1359478784. Throughput: 0: 10478.9. Samples: 339932160. 
Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:04:03,025][1653645] Updated weights for policy 0, policy_version 663864 (0.0018) [2024-06-15 20:04:05,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1359609856. Throughput: 0: 10524.4. Samples: 339965952. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:04:07,628][1653645] Updated weights for policy 0, policy_version 663920 (0.0067) [2024-06-15 20:04:09,254][1653645] Updated weights for policy 0, policy_version 663984 (0.0029) [2024-06-15 20:04:10,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 42598.7, 300 sec: 44653.4). Total num frames: 1359937536. Throughput: 0: 10686.7. Samples: 340035072. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:04:11,253][1653645] Updated weights for policy 0, policy_version 664063 (0.0012) [2024-06-15 20:04:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44653.3). Total num frames: 1360134144. Throughput: 0: 10535.8. Samples: 340096000. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:04:19,071][1653645] Updated weights for policy 0, policy_version 664129 (0.0026) [2024-06-15 20:04:20,187][1653645] Updated weights for policy 0, policy_version 664178 (0.0011) [2024-06-15 20:04:20,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 41506.1, 300 sec: 44320.1). Total num frames: 1360297984. Throughput: 0: 10911.4. Samples: 340136448. 
Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:04:22,598][1653645] Updated weights for policy 0, policy_version 664272 (0.0012) [2024-06-15 20:04:25,822][1653645] Updated weights for policy 0, policy_version 664328 (0.0069) [2024-06-15 20:04:25,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 42061.0, 300 sec: 44653.3). Total num frames: 1360560128. Throughput: 0: 10695.1. Samples: 340193792. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:25,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:04:30,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 42598.3, 300 sec: 43986.8). Total num frames: 1360658432. Throughput: 0: 10979.5. Samples: 340271104. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:30,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:04:31,442][1653645] Updated weights for policy 0, policy_version 664400 (0.0015) [2024-06-15 20:04:33,004][1653645] Updated weights for policy 0, policy_version 664464 (0.0013) [2024-06-15 20:04:33,990][1651596] Signal inference workers to stop experience collection... (34550 times) [2024-06-15 20:04:34,041][1653645] InferenceWorker_p0-w0: stopping experience collection (34550 times) [2024-06-15 20:04:34,232][1651596] Signal inference workers to resume experience collection... (34550 times) [2024-06-15 20:04:34,232][1653645] InferenceWorker_p0-w0: resuming experience collection (34550 times) [2024-06-15 20:04:34,888][1653645] Updated weights for policy 0, policy_version 664544 (0.0010) [2024-06-15 20:04:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 1361051648. Throughput: 0: 10831.6. Samples: 340298752. 
Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:04:39,206][1653645] Updated weights for policy 0, policy_version 664636 (0.0016) [2024-06-15 20:04:40,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1361182720. Throughput: 0: 10570.0. Samples: 340357120. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:40,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 20:04:44,926][1653645] Updated weights for policy 0, policy_version 664724 (0.0103) [2024-06-15 20:04:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 1361444864. Throughput: 0: 11013.7. Samples: 340427776. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:04:46,440][1653645] Updated weights for policy 0, policy_version 664793 (0.0012) [2024-06-15 20:04:50,226][1653645] Updated weights for policy 0, policy_version 664837 (0.0013) [2024-06-15 20:04:50,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1361641472. Throughput: 0: 10979.6. Samples: 340460032. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:04:51,615][1653645] Updated weights for policy 0, policy_version 664892 (0.0012) [2024-06-15 20:04:55,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 1361805312. Throughput: 0: 11013.6. Samples: 340530688. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:04:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:04:56,298][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000664960_1361838080.pth... 
[2024-06-15 20:04:56,299][1653645] Updated weights for policy 0, policy_version 664960 (0.0013) [2024-06-15 20:04:56,419][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000659840_1351352320.pth [2024-06-15 20:04:57,934][1653645] Updated weights for policy 0, policy_version 665026 (0.0012) [2024-06-15 20:05:00,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 1362100224. Throughput: 0: 10934.0. Samples: 340588032. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:05:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:05:02,329][1653645] Updated weights for policy 0, policy_version 665089 (0.0012) [2024-06-15 20:05:03,448][1653645] Updated weights for policy 0, policy_version 665139 (0.0018) [2024-06-15 20:05:05,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1362231296. Throughput: 0: 10843.0. Samples: 340624384. Policy #0 lag: (min: 15.0, avg: 80.6, max: 271.0) [2024-06-15 20:05:05,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:05:07,004][1653645] Updated weights for policy 0, policy_version 665168 (0.0024) [2024-06-15 20:05:08,881][1653645] Updated weights for policy 0, policy_version 665250 (0.0016) [2024-06-15 20:05:10,480][1653645] Updated weights for policy 0, policy_version 665315 (0.0013) [2024-06-15 20:05:10,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44236.6, 300 sec: 44764.4). Total num frames: 1362591744. Throughput: 0: 11070.5. Samples: 340691968. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:10,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 20:05:15,590][1653645] Updated weights for policy 0, policy_version 665376 (0.0011) [2024-06-15 20:05:15,957][1648982] Fps is (10 sec: 45876.4, 60 sec: 42598.5, 300 sec: 44209.0). Total num frames: 1362690048. Throughput: 0: 10797.6. Samples: 340756992. 
Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:15,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:05:19,046][1653645] Updated weights for policy 0, policy_version 665414 (0.0013) [2024-06-15 20:05:19,347][1651596] Signal inference workers to stop experience collection... (34600 times) [2024-06-15 20:05:19,405][1653645] InferenceWorker_p0-w0: stopping experience collection (34600 times) [2024-06-15 20:05:19,581][1651596] Signal inference workers to resume experience collection... (34600 times) [2024-06-15 20:05:19,582][1653645] InferenceWorker_p0-w0: resuming experience collection (34600 times) [2024-06-15 20:05:20,958][1648982] Fps is (10 sec: 32768.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1362919424. Throughput: 0: 10888.5. Samples: 340788736. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:05:21,061][1653645] Updated weights for policy 0, policy_version 665504 (0.0012) [2024-06-15 20:05:23,081][1653645] Updated weights for policy 0, policy_version 665595 (0.0016) [2024-06-15 20:05:25,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1363148800. Throughput: 0: 10808.9. Samples: 340843520. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:05:28,332][1653645] Updated weights for policy 0, policy_version 665664 (0.0013) [2024-06-15 20:05:30,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 43690.7, 300 sec: 43986.8). Total num frames: 1363279872. Throughput: 0: 10820.2. Samples: 340914688. 
Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:05:32,563][1653645] Updated weights for policy 0, policy_version 665712 (0.0012) [2024-06-15 20:05:34,037][1653645] Updated weights for policy 0, policy_version 665768 (0.0017) [2024-06-15 20:05:35,407][1653645] Updated weights for policy 0, policy_version 665824 (0.0137) [2024-06-15 20:05:35,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1363640320. Throughput: 0: 10888.5. Samples: 340950016. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:05:40,367][1653645] Updated weights for policy 0, policy_version 665904 (0.0013) [2024-06-15 20:05:40,973][1648982] Fps is (10 sec: 52349.9, 60 sec: 43679.6, 300 sec: 43984.6). Total num frames: 1363804160. Throughput: 0: 10714.2. Samples: 341012992. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:40,974][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:05:44,415][1653645] Updated weights for policy 0, policy_version 665984 (0.0015) [2024-06-15 20:05:45,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43144.7, 300 sec: 44209.0). Total num frames: 1364033536. Throughput: 0: 10695.1. Samples: 341069312. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:05:47,164][1653645] Updated weights for policy 0, policy_version 666096 (0.0089) [2024-06-15 20:05:50,958][1648982] Fps is (10 sec: 39381.2, 60 sec: 42598.3, 300 sec: 43653.7). Total num frames: 1364197376. Throughput: 0: 10535.8. Samples: 341098496. 
Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:50,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:05:52,969][1653645] Updated weights for policy 0, policy_version 666129 (0.0014) [2024-06-15 20:05:55,958][1648982] Fps is (10 sec: 29491.0, 60 sec: 42052.3, 300 sec: 43653.6). Total num frames: 1364328448. Throughput: 0: 10547.2. Samples: 341166592. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:05:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:05:57,041][1653645] Updated weights for policy 0, policy_version 666225 (0.0099) [2024-06-15 20:05:59,164][1653645] Updated weights for policy 0, policy_version 666320 (0.0012) [2024-06-15 20:05:59,287][1651596] Signal inference workers to stop experience collection... (34650 times) [2024-06-15 20:05:59,338][1653645] InferenceWorker_p0-w0: stopping experience collection (34650 times) [2024-06-15 20:05:59,428][1651596] Signal inference workers to resume experience collection... (34650 times) [2024-06-15 20:05:59,429][1653645] InferenceWorker_p0-w0: resuming experience collection (34650 times) [2024-06-15 20:06:00,048][1653645] Updated weights for policy 0, policy_version 666367 (0.0075) [2024-06-15 20:06:00,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1364721664. Throughput: 0: 10308.2. Samples: 341220864. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:00,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:06:05,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 42598.6, 300 sec: 43764.7). Total num frames: 1364787200. Throughput: 0: 10456.2. Samples: 341259264. 
Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:06:06,458][1653645] Updated weights for policy 0, policy_version 666432 (0.0012) [2024-06-15 20:06:09,928][1653645] Updated weights for policy 0, policy_version 666484 (0.0012) [2024-06-15 20:06:10,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 40960.1, 300 sec: 43653.6). Total num frames: 1365049344. Throughput: 0: 10672.4. Samples: 341323776. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:06:11,219][1653645] Updated weights for policy 0, policy_version 666550 (0.0011) [2024-06-15 20:06:12,658][1653645] Updated weights for policy 0, policy_version 666614 (0.0013) [2024-06-15 20:06:15,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 42598.1, 300 sec: 43542.6). Total num frames: 1365245952. Throughput: 0: 10433.4. Samples: 341384192. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:06:18,235][1653645] Updated weights for policy 0, policy_version 666659 (0.0017) [2024-06-15 20:06:20,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 41506.1, 300 sec: 43209.3). Total num frames: 1365409792. Throughput: 0: 10524.4. Samples: 341423616. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:06:21,256][1653645] Updated weights for policy 0, policy_version 666736 (0.0014) [2024-06-15 20:06:22,860][1653645] Updated weights for policy 0, policy_version 666816 (0.0013) [2024-06-15 20:06:25,966][1648982] Fps is (10 sec: 52389.0, 60 sec: 43685.0, 300 sec: 43763.6). Total num frames: 1365770240. Throughput: 0: 10355.5. Samples: 341478912. 
Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:25,969][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:06:30,108][1653645] Updated weights for policy 0, policy_version 666912 (0.0016) [2024-06-15 20:06:30,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1365901312. Throughput: 0: 10752.0. Samples: 341553152. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:06:32,637][1653645] Updated weights for policy 0, policy_version 666964 (0.0012) [2024-06-15 20:06:34,209][1653645] Updated weights for policy 0, policy_version 667042 (0.0012) [2024-06-15 20:06:35,873][1653645] Updated weights for policy 0, policy_version 667123 (0.0015) [2024-06-15 20:06:35,958][1648982] Fps is (10 sec: 49190.4, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1366261760. Throughput: 0: 10820.3. Samples: 341585408. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:06:40,958][1648982] Fps is (10 sec: 39318.9, 60 sec: 41516.2, 300 sec: 43542.5). Total num frames: 1366294528. Throughput: 0: 10877.0. Samples: 341656064. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:06:42,421][1653645] Updated weights for policy 0, policy_version 667195 (0.0021) [2024-06-15 20:06:44,917][1651596] Signal inference workers to stop experience collection... (34700 times) [2024-06-15 20:06:44,962][1653645] InferenceWorker_p0-w0: stopping experience collection (34700 times) [2024-06-15 20:06:45,208][1651596] Signal inference workers to resume experience collection... 
(34700 times) [2024-06-15 20:06:45,209][1653645] InferenceWorker_p0-w0: resuming experience collection (34700 times) [2024-06-15 20:06:45,337][1653645] Updated weights for policy 0, policy_version 667266 (0.0013) [2024-06-15 20:06:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.5, 300 sec: 43764.7). Total num frames: 1366622208. Throughput: 0: 11013.7. Samples: 341716480. Policy #0 lag: (min: 69.0, avg: 135.4, max: 325.0) [2024-06-15 20:06:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:06:46,404][1653645] Updated weights for policy 0, policy_version 667325 (0.0015) [2024-06-15 20:06:48,101][1653645] Updated weights for policy 0, policy_version 667387 (0.0012) [2024-06-15 20:06:50,958][1648982] Fps is (10 sec: 52432.3, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1366818816. Throughput: 0: 10797.5. Samples: 341745152. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:06:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:06:55,434][1653645] Updated weights for policy 0, policy_version 667473 (0.0013) [2024-06-15 20:06:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44782.9, 300 sec: 43320.4). Total num frames: 1367015424. Throughput: 0: 11127.5. Samples: 341824512. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:06:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:06:56,271][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000667520_1367080960.pth... 
[2024-06-15 20:06:56,521][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000662416_1356627968.pth [2024-06-15 20:06:56,857][1653645] Updated weights for policy 0, policy_version 667536 (0.0013) [2024-06-15 20:06:57,897][1653645] Updated weights for policy 0, policy_version 667577 (0.0014) [2024-06-15 20:06:59,634][1653645] Updated weights for policy 0, policy_version 667648 (0.0022) [2024-06-15 20:07:00,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1367343104. Throughput: 0: 11093.4. Samples: 341883392. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:00,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:07:05,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.8, 300 sec: 43653.6). Total num frames: 1367474176. Throughput: 0: 11161.6. Samples: 341925888. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:07:06,893][1653645] Updated weights for policy 0, policy_version 667716 (0.0014) [2024-06-15 20:07:08,652][1653645] Updated weights for policy 0, policy_version 667792 (0.0101) [2024-06-15 20:07:10,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44783.0, 300 sec: 43542.6). Total num frames: 1367736320. Throughput: 0: 11186.3. Samples: 341982208. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:07:11,408][1653645] Updated weights for policy 0, policy_version 667859 (0.0013) [2024-06-15 20:07:15,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 1367867392. Throughput: 0: 11093.3. Samples: 342052352. 
Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:15,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:07:17,319][1653645] Updated weights for policy 0, policy_version 667920 (0.0011) [2024-06-15 20:07:18,546][1653645] Updated weights for policy 0, policy_version 667962 (0.0012) [2024-06-15 20:07:20,265][1653645] Updated weights for policy 0, policy_version 668025 (0.0082) [2024-06-15 20:07:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 43653.7). Total num frames: 1368162304. Throughput: 0: 11127.5. Samples: 342086144. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:07:21,778][1653645] Updated weights for policy 0, policy_version 668093 (0.0012) [2024-06-15 20:07:24,614][1653645] Updated weights for policy 0, policy_version 668153 (0.0084) [2024-06-15 20:07:25,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43696.3, 300 sec: 43875.8). Total num frames: 1368391680. Throughput: 0: 10752.2. Samples: 342139904. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:07:30,829][1653645] Updated weights for policy 0, policy_version 668208 (0.0012) [2024-06-15 20:07:30,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1368489984. Throughput: 0: 11127.5. Samples: 342217216. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:07:30,959][1651596] Signal inference workers to stop experience collection... (34750 times) [2024-06-15 20:07:31,009][1653645] InferenceWorker_p0-w0: stopping experience collection (34750 times) [2024-06-15 20:07:31,200][1651596] Signal inference workers to resume experience collection... 
(34750 times) [2024-06-15 20:07:31,201][1653645] InferenceWorker_p0-w0: resuming experience collection (34750 times) [2024-06-15 20:07:33,727][1653645] Updated weights for policy 0, policy_version 668336 (0.0118) [2024-06-15 20:07:35,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 42052.1, 300 sec: 43542.5). Total num frames: 1368784896. Throughput: 0: 10945.4. Samples: 342237696. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:35,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:07:36,955][1653645] Updated weights for policy 0, policy_version 668406 (0.0014) [2024-06-15 20:07:40,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43691.2, 300 sec: 43098.2). Total num frames: 1368915968. Throughput: 0: 10774.8. Samples: 342309376. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:40,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:07:42,994][1653645] Updated weights for policy 0, policy_version 668466 (0.0096) [2024-06-15 20:07:45,055][1653645] Updated weights for policy 0, policy_version 668548 (0.0103) [2024-06-15 20:07:45,962][1648982] Fps is (10 sec: 45876.1, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 1369243648. Throughput: 0: 10774.8. Samples: 342368256. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:45,962][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:07:48,276][1653645] Updated weights for policy 0, policy_version 668624 (0.0011) [2024-06-15 20:07:49,294][1653645] Updated weights for policy 0, policy_version 668672 (0.0012) [2024-06-15 20:07:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1369440256. Throughput: 0: 10547.2. Samples: 342400512. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:07:55,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1369604096. 
Throughput: 0: 10934.0. Samples: 342474240. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:07:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:07:56,165][1653645] Updated weights for policy 0, policy_version 668756 (0.0014) [2024-06-15 20:07:57,804][1653645] Updated weights for policy 0, policy_version 668817 (0.0013) [2024-06-15 20:07:59,940][1653645] Updated weights for policy 0, policy_version 668869 (0.0013) [2024-06-15 20:08:00,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 43653.6). Total num frames: 1369899008. Throughput: 0: 10649.7. Samples: 342531584. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:08:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:08:01,253][1653645] Updated weights for policy 0, policy_version 668926 (0.0012) [2024-06-15 20:08:05,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 41506.2, 300 sec: 42654.0). Total num frames: 1369964544. Throughput: 0: 10604.1. Samples: 342563328. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:08:05,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:08:07,558][1653645] Updated weights for policy 0, policy_version 668980 (0.0012) [2024-06-15 20:08:09,147][1653645] Updated weights for policy 0, policy_version 669049 (0.0012) [2024-06-15 20:08:10,479][1653645] Updated weights for policy 0, policy_version 669091 (0.0025) [2024-06-15 20:08:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 1370324992. Throughput: 0: 10820.3. Samples: 342626816. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:08:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:08:12,495][1651596] Signal inference workers to stop experience collection... 
(34800 times) [2024-06-15 20:08:12,536][1653645] InferenceWorker_p0-w0: stopping experience collection (34800 times) [2024-06-15 20:08:12,858][1651596] Signal inference workers to resume experience collection... (34800 times) [2024-06-15 20:08:12,859][1653645] InferenceWorker_p0-w0: resuming experience collection (34800 times) [2024-06-15 20:08:12,861][1653645] Updated weights for policy 0, policy_version 669136 (0.0028) [2024-06-15 20:08:15,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.9, 300 sec: 42987.2). Total num frames: 1370488832. Throughput: 0: 10456.2. Samples: 342687744. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:08:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:08:19,697][1653645] Updated weights for policy 0, policy_version 669232 (0.0012) [2024-06-15 20:08:20,463][1653645] Updated weights for policy 0, policy_version 669257 (0.0013) [2024-06-15 20:08:20,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 41506.0, 300 sec: 42766.8). Total num frames: 1370652672. Throughput: 0: 10865.8. Samples: 342726656. Policy #0 lag: (min: 81.0, avg: 180.2, max: 337.0) [2024-06-15 20:08:20,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 20:08:21,723][1653645] Updated weights for policy 0, policy_version 669307 (0.0011) [2024-06-15 20:08:23,427][1653645] Updated weights for policy 0, policy_version 669365 (0.0013) [2024-06-15 20:08:25,435][1653645] Updated weights for policy 0, policy_version 669395 (0.0013) [2024-06-15 20:08:25,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 43542.6). Total num frames: 1370947584. Throughput: 0: 10513.0. Samples: 342782464. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:08:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:08:30,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1371013120. Throughput: 0: 10752.0. Samples: 342852096. 
Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:08:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:08:32,516][1653645] Updated weights for policy 0, policy_version 669476 (0.0015) [2024-06-15 20:08:34,098][1653645] Updated weights for policy 0, policy_version 669527 (0.0082) [2024-06-15 20:08:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 43320.4). Total num frames: 1371340800. Throughput: 0: 10706.5. Samples: 342882304. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:08:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:08:35,980][1653645] Updated weights for policy 0, policy_version 669601 (0.0014) [2024-06-15 20:08:37,906][1653645] Updated weights for policy 0, policy_version 669664 (0.0123) [2024-06-15 20:08:38,510][1653645] Updated weights for policy 0, policy_version 669696 (0.0024) [2024-06-15 20:08:40,958][1648982] Fps is (10 sec: 52426.7, 60 sec: 43690.4, 300 sec: 42987.1). Total num frames: 1371537408. Throughput: 0: 10365.1. Samples: 342940672. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:08:40,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:08:45,942][1653645] Updated weights for policy 0, policy_version 669763 (0.0013) [2024-06-15 20:08:45,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 40413.9, 300 sec: 42653.9). Total num frames: 1371668480. Throughput: 0: 10615.5. Samples: 343009280. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:08:45,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:08:47,868][1653645] Updated weights for policy 0, policy_version 669841 (0.0113) [2024-06-15 20:08:50,419][1653645] Updated weights for policy 0, policy_version 669905 (0.0012) [2024-06-15 20:08:50,958][1648982] Fps is (10 sec: 45877.1, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 1371996160. Throughput: 0: 10456.2. Samples: 343033856. 
Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:08:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:08:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 42654.0). Total num frames: 1372061696. Throughput: 0: 10547.2. Samples: 343101440. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:08:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:08:55,967][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000669952_1372061696.pth... [2024-06-15 20:08:56,021][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000664960_1361838080.pth [2024-06-15 20:08:56,683][1653645] Updated weights for policy 0, policy_version 669968 (0.0012) [2024-06-15 20:08:59,110][1651596] Signal inference workers to stop experience collection... (34850 times) [2024-06-15 20:08:59,162][1653645] InferenceWorker_p0-w0: stopping experience collection (34850 times) [2024-06-15 20:08:59,176][1653645] Updated weights for policy 0, policy_version 670053 (0.0014) [2024-06-15 20:08:59,296][1651596] Signal inference workers to resume experience collection... (34850 times) [2024-06-15 20:08:59,297][1653645] InferenceWorker_p0-w0: resuming experience collection (34850 times) [2024-06-15 20:09:00,928][1653645] Updated weights for policy 0, policy_version 670131 (0.0012) [2024-06-15 20:09:00,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 42052.1, 300 sec: 43431.5). Total num frames: 1372422144. Throughput: 0: 10387.9. Samples: 343155200. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:09:03,506][1653645] Updated weights for policy 0, policy_version 670176 (0.0011) [2024-06-15 20:09:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 1372585984. Throughput: 0: 10410.7. Samples: 343195136. 
Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:05,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 20:09:09,459][1653645] Updated weights for policy 0, policy_version 670228 (0.0027) [2024-06-15 20:09:10,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 40413.7, 300 sec: 42765.0). Total num frames: 1372749824. Throughput: 0: 10672.3. Samples: 343262720. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:09:11,395][1653645] Updated weights for policy 0, policy_version 670304 (0.0052) [2024-06-15 20:09:13,063][1653645] Updated weights for policy 0, policy_version 670371 (0.0012) [2024-06-15 20:09:15,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 43098.3). Total num frames: 1373011968. Throughput: 0: 10467.6. Samples: 343323136. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:09:16,190][1653645] Updated weights for policy 0, policy_version 670432 (0.0012) [2024-06-15 20:09:20,958][1648982] Fps is (10 sec: 36045.7, 60 sec: 40960.1, 300 sec: 42542.9). Total num frames: 1373110272. Throughput: 0: 10456.2. Samples: 343352832. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:09:21,990][1653645] Updated weights for policy 0, policy_version 670483 (0.0013) [2024-06-15 20:09:23,316][1653645] Updated weights for policy 0, policy_version 670547 (0.0012) [2024-06-15 20:09:25,391][1653645] Updated weights for policy 0, policy_version 670626 (0.0121) [2024-06-15 20:09:25,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 1373470720. Throughput: 0: 10501.8. Samples: 343413248. 
Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:09:26,051][1653645] Updated weights for policy 0, policy_version 670655 (0.0013) [2024-06-15 20:09:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 42653.9). Total num frames: 1373634560. Throughput: 0: 10490.3. Samples: 343481344. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:09:33,329][1653645] Updated weights for policy 0, policy_version 670725 (0.0013) [2024-06-15 20:09:34,554][1653645] Updated weights for policy 0, policy_version 670784 (0.0019) [2024-06-15 20:09:35,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42987.2). Total num frames: 1373863936. Throughput: 0: 10797.5. Samples: 343519744. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:09:36,832][1653645] Updated weights for policy 0, policy_version 670865 (0.0012) [2024-06-15 20:09:37,903][1653645] Updated weights for policy 0, policy_version 670912 (0.0102) [2024-06-15 20:09:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 41506.4, 300 sec: 42654.0). Total num frames: 1374027776. Throughput: 0: 10592.7. Samples: 343578112. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:09:41,807][1653645] Updated weights for policy 0, policy_version 670960 (0.0045) [2024-06-15 20:09:44,904][1651596] Signal inference workers to stop experience collection... (34900 times) [2024-06-15 20:09:44,941][1653645] InferenceWorker_p0-w0: stopping experience collection (34900 times) [2024-06-15 20:09:45,273][1651596] Signal inference workers to resume experience collection... 
(34900 times) [2024-06-15 20:09:45,275][1653645] InferenceWorker_p0-w0: resuming experience collection (34900 times) [2024-06-15 20:09:45,848][1653645] Updated weights for policy 0, policy_version 671012 (0.0019) [2024-06-15 20:09:45,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1374224384. Throughput: 0: 10877.2. Samples: 343644672. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:09:47,481][1653645] Updated weights for policy 0, policy_version 671076 (0.0012) [2024-06-15 20:09:50,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 42598.4, 300 sec: 43209.3). Total num frames: 1374552064. Throughput: 0: 10649.6. Samples: 343674368. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:09:52,864][1653645] Updated weights for policy 0, policy_version 671169 (0.0013) [2024-06-15 20:09:55,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 1374683136. Throughput: 0: 10410.7. Samples: 343731200. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:09:55,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 20:09:57,412][1653645] Updated weights for policy 0, policy_version 671239 (0.0021) [2024-06-15 20:10:00,014][1653645] Updated weights for policy 0, policy_version 671299 (0.0015) [2024-06-15 20:10:00,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 41506.3, 300 sec: 42987.2). Total num frames: 1374912512. Throughput: 0: 10615.5. Samples: 343800832. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:10:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:10:02,044][1653645] Updated weights for policy 0, policy_version 671378 (0.0013) [2024-06-15 20:10:05,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 1375076352. 
Throughput: 0: 10478.9. Samples: 343824384. Policy #0 lag: (min: 15.0, avg: 155.3, max: 271.0) [2024-06-15 20:10:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:10:07,538][1653645] Updated weights for policy 0, policy_version 671482 (0.0131) [2024-06-15 20:10:10,771][1653645] Updated weights for policy 0, policy_version 671536 (0.0013) [2024-06-15 20:10:10,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1375305728. Throughput: 0: 10706.5. Samples: 343895040. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:10,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:10:13,024][1653645] Updated weights for policy 0, policy_version 671585 (0.0014) [2024-06-15 20:10:14,823][1653645] Updated weights for policy 0, policy_version 671664 (0.0015) [2024-06-15 20:10:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 1375600640. Throughput: 0: 10535.8. Samples: 343955456. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:10:19,290][1653645] Updated weights for policy 0, policy_version 671742 (0.0011) [2024-06-15 20:10:20,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.5, 300 sec: 42653.9). Total num frames: 1375731712. Throughput: 0: 10433.4. Samples: 343989248. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:10:21,946][1653645] Updated weights for policy 0, policy_version 671797 (0.0020) [2024-06-15 20:10:25,575][1653645] Updated weights for policy 0, policy_version 671858 (0.0015) [2024-06-15 20:10:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 43098.3). Total num frames: 1375993856. Throughput: 0: 10752.0. Samples: 344061952. 
Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:10:26,903][1653645] Updated weights for policy 0, policy_version 671920 (0.0019) [2024-06-15 20:10:30,696][1651596] Signal inference workers to stop experience collection... (34950 times) [2024-06-15 20:10:30,755][1653645] InferenceWorker_p0-w0: stopping experience collection (34950 times) [2024-06-15 20:10:30,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1376157696. Throughput: 0: 10729.2. Samples: 344127488. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:30,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:10:30,974][1651596] Signal inference workers to resume experience collection... (34950 times) [2024-06-15 20:10:30,975][1653645] InferenceWorker_p0-w0: resuming experience collection (34950 times) [2024-06-15 20:10:31,329][1653645] Updated weights for policy 0, policy_version 671984 (0.0012) [2024-06-15 20:10:32,826][1653645] Updated weights for policy 0, policy_version 672048 (0.0014) [2024-06-15 20:10:35,960][1648982] Fps is (10 sec: 39314.3, 60 sec: 42051.0, 300 sec: 42655.9). Total num frames: 1376387072. Throughput: 0: 10706.0. Samples: 344156160. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:35,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:10:37,723][1653645] Updated weights for policy 0, policy_version 672113 (0.0045) [2024-06-15 20:10:39,382][1653645] Updated weights for policy 0, policy_version 672185 (0.0128) [2024-06-15 20:10:40,958][1648982] Fps is (10 sec: 49153.2, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1376649216. Throughput: 0: 10888.5. Samples: 344221184. 
Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:10:43,475][1653645] Updated weights for policy 0, policy_version 672248 (0.0032) [2024-06-15 20:10:45,464][1653645] Updated weights for policy 0, policy_version 672313 (0.0012) [2024-06-15 20:10:45,958][1648982] Fps is (10 sec: 52438.0, 60 sec: 44782.9, 300 sec: 43098.3). Total num frames: 1376911360. Throughput: 0: 10763.3. Samples: 344285184. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:10:49,581][1653645] Updated weights for policy 0, policy_version 672355 (0.0027) [2024-06-15 20:10:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 43209.3). Total num frames: 1377075200. Throughput: 0: 11036.5. Samples: 344321024. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 20:10:51,084][1653645] Updated weights for policy 0, policy_version 672416 (0.0012) [2024-06-15 20:10:54,780][1653645] Updated weights for policy 0, policy_version 672464 (0.0012) [2024-06-15 20:10:55,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1377271808. Throughput: 0: 10945.4. Samples: 344387584. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:10:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 20:10:56,611][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000672528_1377337344.pth... 
[2024-06-15 20:10:56,626][1653645] Updated weights for policy 0, policy_version 672528 (0.0015) [2024-06-15 20:10:56,773][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000667520_1367080960.pth [2024-06-15 20:10:56,778][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000672528_1377337344.pth [2024-06-15 20:11:00,958][1648982] Fps is (10 sec: 36043.7, 60 sec: 42052.0, 300 sec: 42876.0). Total num frames: 1377435648. Throughput: 0: 10968.1. Samples: 344449024. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:11:01,328][1653645] Updated weights for policy 0, policy_version 672582 (0.0013) [2024-06-15 20:11:03,028][1653645] Updated weights for policy 0, policy_version 672656 (0.0012) [2024-06-15 20:11:05,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 1377697792. Throughput: 0: 10854.5. Samples: 344477696. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:05,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:11:06,897][1653645] Updated weights for policy 0, policy_version 672705 (0.0011) [2024-06-15 20:11:08,249][1653645] Updated weights for policy 0, policy_version 672753 (0.0034) [2024-06-15 20:11:09,836][1653645] Updated weights for policy 0, policy_version 672831 (0.0014) [2024-06-15 20:11:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44236.7, 300 sec: 43098.2). Total num frames: 1377959936. Throughput: 0: 10683.7. Samples: 344542720. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:10,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:11:13,941][1653645] Updated weights for policy 0, policy_version 672886 (0.0010) [2024-06-15 20:11:14,172][1651596] Signal inference workers to stop experience collection... 
(35000 times) [2024-06-15 20:11:14,207][1653645] InferenceWorker_p0-w0: stopping experience collection (35000 times) [2024-06-15 20:11:14,207][1651596] Signal inference workers to resume experience collection... (35000 times) [2024-06-15 20:11:14,230][1653645] InferenceWorker_p0-w0: resuming experience collection (35000 times) [2024-06-15 20:11:15,215][1653645] Updated weights for policy 0, policy_version 672937 (0.0011) [2024-06-15 20:11:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 1378222080. Throughput: 0: 10729.3. Samples: 344610304. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:11:18,763][1653645] Updated weights for policy 0, policy_version 672977 (0.0027) [2024-06-15 20:11:20,916][1653645] Updated weights for policy 0, policy_version 673086 (0.0015) [2024-06-15 20:11:20,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 45875.5, 300 sec: 43099.4). Total num frames: 1378484224. Throughput: 0: 11036.9. Samples: 344652800. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:20,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:11:25,153][1653645] Updated weights for policy 0, policy_version 673151 (0.0019) [2024-06-15 20:11:25,957][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 1378615296. Throughput: 0: 10934.1. Samples: 344713216. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:11:27,815][1653645] Updated weights for policy 0, policy_version 673208 (0.0020) [2024-06-15 20:11:30,958][1648982] Fps is (10 sec: 29490.3, 60 sec: 43690.6, 300 sec: 42431.7). Total num frames: 1378779136. Throughput: 0: 11002.3. Samples: 344780288. 
Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:30,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 20:11:32,219][1653645] Updated weights for policy 0, policy_version 673282 (0.0036) [2024-06-15 20:11:33,602][1653645] Updated weights for policy 0, policy_version 673341 (0.0013) [2024-06-15 20:11:35,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43692.1, 300 sec: 43098.4). Total num frames: 1379008512. Throughput: 0: 10706.5. Samples: 344802816. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:11:38,408][1653645] Updated weights for policy 0, policy_version 673424 (0.0135) [2024-06-15 20:11:40,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 43690.6, 300 sec: 42876.1). Total num frames: 1379270656. Throughput: 0: 10649.6. Samples: 344866816. Policy #0 lag: (min: 15.0, avg: 129.5, max: 271.0) [2024-06-15 20:11:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:11:44,271][1653645] Updated weights for policy 0, policy_version 673504 (0.0012) [2024-06-15 20:11:45,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 1379467264. Throughput: 0: 10820.4. Samples: 344935936. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:11:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:11:46,123][1653645] Updated weights for policy 0, policy_version 673569 (0.0014) [2024-06-15 20:11:48,219][1653645] Updated weights for policy 0, policy_version 673602 (0.0011) [2024-06-15 20:11:50,676][1653645] Updated weights for policy 0, policy_version 673665 (0.0012) [2024-06-15 20:11:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 1379696640. Throughput: 0: 10945.4. Samples: 344970240. 
Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:11:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:11:51,956][1653645] Updated weights for policy 0, policy_version 673723 (0.0011) [2024-06-15 20:11:55,958][1648982] Fps is (10 sec: 32765.9, 60 sec: 42052.0, 300 sec: 42209.6). Total num frames: 1379794944. Throughput: 0: 10945.4. Samples: 345035264. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:11:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:11:57,618][1653645] Updated weights for policy 0, policy_version 673795 (0.0091) [2024-06-15 20:11:58,912][1653645] Updated weights for policy 0, policy_version 673851 (0.0014) [2024-06-15 20:12:00,755][1651596] Signal inference workers to stop experience collection... (35050 times) [2024-06-15 20:12:00,833][1653645] InferenceWorker_p0-w0: stopping experience collection (35050 times) [2024-06-15 20:12:00,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44237.1, 300 sec: 42765.0). Total num frames: 1380089856. Throughput: 0: 10786.1. Samples: 345095680. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:12:01,102][1651596] Signal inference workers to resume experience collection... (35050 times) [2024-06-15 20:12:01,102][1653645] InferenceWorker_p0-w0: resuming experience collection (35050 times) [2024-06-15 20:12:03,689][1653645] Updated weights for policy 0, policy_version 673922 (0.0124) [2024-06-15 20:12:04,681][1653645] Updated weights for policy 0, policy_version 673983 (0.0078) [2024-06-15 20:12:05,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.3, 300 sec: 42653.9). Total num frames: 1380319232. Throughput: 0: 10501.6. Samples: 345125376. 
Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:05,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:12:10,375][1653645] Updated weights for policy 0, policy_version 674064 (0.0015) [2024-06-15 20:12:10,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 42598.6, 300 sec: 42876.1). Total num frames: 1380515840. Throughput: 0: 10672.3. Samples: 345193472. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:12:13,315][1653645] Updated weights for policy 0, policy_version 674129 (0.0013) [2024-06-15 20:12:14,402][1653645] Updated weights for policy 0, policy_version 674176 (0.0029) [2024-06-15 20:12:15,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 41506.0, 300 sec: 42542.8). Total num frames: 1380712448. Throughput: 0: 10490.3. Samples: 345252352. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:15,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:12:16,910][1653645] Updated weights for policy 0, policy_version 674238 (0.0012) [2024-06-15 20:12:20,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 39867.7, 300 sec: 42320.7). Total num frames: 1380876288. Throughput: 0: 10831.6. Samples: 345290240. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:12:22,398][1653645] Updated weights for policy 0, policy_version 674320 (0.0025) [2024-06-15 20:12:24,797][1653645] Updated weights for policy 0, policy_version 674374 (0.0012) [2024-06-15 20:12:25,898][1653645] Updated weights for policy 0, policy_version 674424 (0.0011) [2024-06-15 20:12:25,958][1648982] Fps is (10 sec: 49153.1, 60 sec: 43144.4, 300 sec: 43098.3). Total num frames: 1381203968. Throughput: 0: 10729.3. Samples: 345349632. 
Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:25,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:12:29,769][1653645] Updated weights for policy 0, policy_version 674486 (0.0013) [2024-06-15 20:12:30,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 1381367808. Throughput: 0: 10626.8. Samples: 345414144. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:12:32,411][1653645] Updated weights for policy 0, policy_version 674513 (0.0010) [2024-06-15 20:12:34,040][1653645] Updated weights for policy 0, policy_version 674576 (0.0012) [2024-06-15 20:12:35,002][1653645] Updated weights for policy 0, policy_version 674616 (0.0014) [2024-06-15 20:12:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 1381629952. Throughput: 0: 10626.8. Samples: 345448448. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:12:37,565][1653645] Updated weights for policy 0, policy_version 674675 (0.0108) [2024-06-15 20:12:40,958][1648982] Fps is (10 sec: 42596.6, 60 sec: 42052.0, 300 sec: 42542.8). Total num frames: 1381793792. Throughput: 0: 10695.1. Samples: 345516544. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:40,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:12:41,624][1653645] Updated weights for policy 0, policy_version 674736 (0.0024) [2024-06-15 20:12:44,680][1653645] Updated weights for policy 0, policy_version 674785 (0.0013) [2024-06-15 20:12:45,549][1653645] Updated weights for policy 0, policy_version 674821 (0.0022) [2024-06-15 20:12:45,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1382055936. Throughput: 0: 10763.3. Samples: 345580032. 
Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:12:46,185][1651596] Signal inference workers to stop experience collection... (35100 times) [2024-06-15 20:12:46,242][1653645] InferenceWorker_p0-w0: stopping experience collection (35100 times) [2024-06-15 20:12:46,482][1651596] Signal inference workers to resume experience collection... (35100 times) [2024-06-15 20:12:46,483][1653645] InferenceWorker_p0-w0: resuming experience collection (35100 times) [2024-06-15 20:12:46,756][1653645] Updated weights for policy 0, policy_version 674879 (0.0013) [2024-06-15 20:12:50,011][1653645] Updated weights for policy 0, policy_version 674940 (0.0015) [2024-06-15 20:12:50,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1382285312. Throughput: 0: 10877.2. Samples: 345614848. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:12:53,504][1653645] Updated weights for policy 0, policy_version 674992 (0.0011) [2024-06-15 20:12:55,802][1653645] Updated weights for policy 0, policy_version 675033 (0.0123) [2024-06-15 20:12:55,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 44783.1, 300 sec: 42653.9). Total num frames: 1382481920. Throughput: 0: 10934.0. Samples: 345685504. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:12:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:12:56,291][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000675056_1382514688.pth... 
[2024-06-15 20:12:56,447][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000669952_1372061696.pth [2024-06-15 20:12:57,108][1653645] Updated weights for policy 0, policy_version 675088 (0.0028) [2024-06-15 20:13:00,876][1653645] Updated weights for policy 0, policy_version 675152 (0.0013) [2024-06-15 20:13:00,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 1382711296. Throughput: 0: 11093.4. Samples: 345751552. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:13:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:13:04,220][1653645] Updated weights for policy 0, policy_version 675216 (0.0017) [2024-06-15 20:13:05,322][1653645] Updated weights for policy 0, policy_version 675264 (0.0020) [2024-06-15 20:13:05,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 43691.0, 300 sec: 42765.0). Total num frames: 1382940672. Throughput: 0: 10922.7. Samples: 345781760. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:13:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:13:08,573][1653645] Updated weights for policy 0, policy_version 675323 (0.0013) [2024-06-15 20:13:10,021][1653645] Updated weights for policy 0, policy_version 675387 (0.0012) [2024-06-15 20:13:10,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 44782.9, 300 sec: 43098.2). Total num frames: 1383202816. Throughput: 0: 11036.4. Samples: 345846272. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:13:10,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:13:14,035][1653645] Updated weights for policy 0, policy_version 675446 (0.0013) [2024-06-15 20:13:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.9, 300 sec: 42987.2). Total num frames: 1383333888. Throughput: 0: 11207.1. Samples: 345918464. 
Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:13:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:13:17,040][1653645] Updated weights for policy 0, policy_version 675504 (0.0012) [2024-06-15 20:13:19,945][1653645] Updated weights for policy 0, policy_version 675560 (0.0015) [2024-06-15 20:13:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 45875.2, 300 sec: 42987.2). Total num frames: 1383628800. Throughput: 0: 11150.2. Samples: 345950208. Policy #0 lag: (min: 68.0, avg: 151.6, max: 324.0) [2024-06-15 20:13:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:13:21,698][1653645] Updated weights for policy 0, policy_version 675639 (0.0019) [2024-06-15 20:13:25,654][1653645] Updated weights for policy 0, policy_version 675696 (0.0020) [2024-06-15 20:13:25,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 1383825408. Throughput: 0: 11013.8. Samples: 346012160. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:13:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:13:28,013][1653645] Updated weights for policy 0, policy_version 675728 (0.0011) [2024-06-15 20:13:30,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 43690.5, 300 sec: 42876.1). Total num frames: 1383989248. Throughput: 0: 11207.1. Samples: 346084352. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:13:30,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:13:31,342][1653645] Updated weights for policy 0, policy_version 675779 (0.0012) [2024-06-15 20:13:32,731][1653645] Updated weights for policy 0, policy_version 675840 (0.0012) [2024-06-15 20:13:32,871][1651596] Signal inference workers to stop experience collection... (35150 times) [2024-06-15 20:13:32,904][1653645] InferenceWorker_p0-w0: stopping experience collection (35150 times) [2024-06-15 20:13:33,062][1651596] Signal inference workers to resume experience collection... 
(35150 times) [2024-06-15 20:13:33,063][1653645] InferenceWorker_p0-w0: resuming experience collection (35150 times) [2024-06-15 20:13:33,898][1653645] Updated weights for policy 0, policy_version 675894 (0.0012) [2024-06-15 20:13:35,957][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 43098.3). Total num frames: 1384251392. Throughput: 0: 11127.5. Samples: 346115584. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:13:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:13:36,879][1653645] Updated weights for policy 0, policy_version 675952 (0.0013) [2024-06-15 20:13:38,751][1653645] Updated weights for policy 0, policy_version 676000 (0.0011) [2024-06-15 20:13:40,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 45329.4, 300 sec: 43542.6). Total num frames: 1384513536. Throughput: 0: 11116.2. Samples: 346185728. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:13:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:13:43,854][1653645] Updated weights for policy 0, policy_version 676080 (0.0121) [2024-06-15 20:13:45,364][1653645] Updated weights for policy 0, policy_version 676148 (0.0013) [2024-06-15 20:13:45,958][1648982] Fps is (10 sec: 52426.7, 60 sec: 45328.9, 300 sec: 43320.4). Total num frames: 1384775680. Throughput: 0: 11184.3. Samples: 346254848. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:13:45,959][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 20:13:47,591][1653645] Updated weights for policy 0, policy_version 676179 (0.0011) [2024-06-15 20:13:49,116][1653645] Updated weights for policy 0, policy_version 676225 (0.0017) [2024-06-15 20:13:50,555][1653645] Updated weights for policy 0, policy_version 676288 (0.0101) [2024-06-15 20:13:50,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45875.3, 300 sec: 43986.9). Total num frames: 1385037824. Throughput: 0: 11286.7. Samples: 346289664. 
Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:13:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:13:55,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 44236.9, 300 sec: 43098.3). Total num frames: 1385136128. Throughput: 0: 11366.4. Samples: 346357760. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:13:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:13:56,445][1653645] Updated weights for policy 0, policy_version 676356 (0.0062) [2024-06-15 20:13:57,731][1653645] Updated weights for policy 0, policy_version 676413 (0.0012) [2024-06-15 20:14:00,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 1385365504. Throughput: 0: 11127.4. Samples: 346419200. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:14:01,407][1653645] Updated weights for policy 0, policy_version 676478 (0.0013) [2024-06-15 20:14:02,859][1653645] Updated weights for policy 0, policy_version 676539 (0.0013) [2024-06-15 20:14:05,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.5, 300 sec: 43431.5). Total num frames: 1385562112. Throughput: 0: 11002.3. Samples: 346445312. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:05,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:14:08,340][1653645] Updated weights for policy 0, policy_version 676594 (0.0015) [2024-06-15 20:14:10,191][1653645] Updated weights for policy 0, policy_version 676663 (0.0013) [2024-06-15 20:14:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 43431.5). Total num frames: 1385824256. Throughput: 0: 11093.3. Samples: 346511360. 
Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:14:13,861][1653645] Updated weights for policy 0, policy_version 676722 (0.0132) [2024-06-15 20:14:14,920][1653645] Updated weights for policy 0, policy_version 676772 (0.0045) [2024-06-15 20:14:15,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 1386086400. Throughput: 0: 10934.1. Samples: 346576384. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:14:18,614][1651596] Signal inference workers to stop experience collection... (35200 times) [2024-06-15 20:14:18,689][1653645] InferenceWorker_p0-w0: stopping experience collection (35200 times) [2024-06-15 20:14:18,893][1651596] Signal inference workers to resume experience collection... (35200 times) [2024-06-15 20:14:18,894][1653645] InferenceWorker_p0-w0: resuming experience collection (35200 times) [2024-06-15 20:14:19,552][1653645] Updated weights for policy 0, policy_version 676834 (0.0019) [2024-06-15 20:14:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 1386250240. Throughput: 0: 11161.6. Samples: 346617856. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:14:21,257][1653645] Updated weights for policy 0, policy_version 676897 (0.0012) [2024-06-15 20:14:24,543][1653645] Updated weights for policy 0, policy_version 676929 (0.0014) [2024-06-15 20:14:25,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 44236.5, 300 sec: 43542.5). Total num frames: 1386479616. Throughput: 0: 11036.3. Samples: 346682368. 
Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:25,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:14:26,580][1653645] Updated weights for policy 0, policy_version 677024 (0.0014) [2024-06-15 20:14:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.9, 300 sec: 43320.4). Total num frames: 1386643456. Throughput: 0: 11013.8. Samples: 346750464. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:14:31,349][1653645] Updated weights for policy 0, policy_version 677090 (0.0137) [2024-06-15 20:14:33,143][1653645] Updated weights for policy 0, policy_version 677168 (0.0012) [2024-06-15 20:14:35,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43690.5, 300 sec: 43542.6). Total num frames: 1386872832. Throughput: 0: 10729.2. Samples: 346772480. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:35,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:14:37,611][1653645] Updated weights for policy 0, policy_version 677220 (0.0047) [2024-06-15 20:14:38,849][1653645] Updated weights for policy 0, policy_version 677284 (0.0013) [2024-06-15 20:14:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 1387134976. Throughput: 0: 10763.4. Samples: 346842112. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:14:43,336][1653645] Updated weights for policy 0, policy_version 677345 (0.0021) [2024-06-15 20:14:44,670][1653645] Updated weights for policy 0, policy_version 677397 (0.0012) [2024-06-15 20:14:45,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1387397120. Throughput: 0: 10865.7. Samples: 346908160. 
Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:45,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:14:49,514][1653645] Updated weights for policy 0, policy_version 677501 (0.0013) [2024-06-15 20:14:50,958][1648982] Fps is (10 sec: 49149.9, 60 sec: 43144.2, 300 sec: 43875.7). Total num frames: 1387626496. Throughput: 0: 11093.3. Samples: 346944512. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:50,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:14:54,860][1653645] Updated weights for policy 0, policy_version 677574 (0.0119) [2024-06-15 20:14:55,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1387757568. Throughput: 0: 11081.9. Samples: 347010048. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:14:55,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:14:56,580][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000677648_1387823104.pth... [2024-06-15 20:14:56,695][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000672528_1377337344.pth [2024-06-15 20:14:56,993][1653645] Updated weights for policy 0, policy_version 677664 (0.0017) [2024-06-15 20:15:00,958][1648982] Fps is (10 sec: 29492.4, 60 sec: 42598.4, 300 sec: 43542.6). Total num frames: 1387921408. Throughput: 0: 11047.8. Samples: 347073536. Policy #0 lag: (min: 5.0, avg: 120.8, max: 261.0) [2024-06-15 20:15:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:15:01,173][1653645] Updated weights for policy 0, policy_version 677712 (0.0013) [2024-06-15 20:15:01,286][1651596] Signal inference workers to stop experience collection... (35250 times) [2024-06-15 20:15:01,325][1653645] InferenceWorker_p0-w0: stopping experience collection (35250 times) [2024-06-15 20:15:01,453][1651596] Signal inference workers to resume experience collection... 
(35250 times) [2024-06-15 20:15:01,454][1653645] InferenceWorker_p0-w0: resuming experience collection (35250 times) [2024-06-15 20:15:02,752][1653645] Updated weights for policy 0, policy_version 677776 (0.0025) [2024-06-15 20:15:03,889][1653645] Updated weights for policy 0, policy_version 677824 (0.0011) [2024-06-15 20:15:05,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 1388183552. Throughput: 0: 10843.0. Samples: 347105792. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:05,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:15:08,526][1653645] Updated weights for policy 0, policy_version 677904 (0.0144) [2024-06-15 20:15:09,562][1653645] Updated weights for policy 0, policy_version 677952 (0.0013) [2024-06-15 20:15:10,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1388445696. Throughput: 0: 10763.5. Samples: 347166720. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:15:13,377][1653645] Updated weights for policy 0, policy_version 678002 (0.0013) [2024-06-15 20:15:15,124][1653645] Updated weights for policy 0, policy_version 678070 (0.0012) [2024-06-15 20:15:15,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1388707840. Throughput: 0: 10706.5. Samples: 347232256. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:15:19,887][1653645] Updated weights for policy 0, policy_version 678144 (0.0107) [2024-06-15 20:15:20,962][1648982] Fps is (10 sec: 49129.8, 60 sec: 44779.6, 300 sec: 43875.1). Total num frames: 1388937216. Throughput: 0: 11228.8. Samples: 347277824. 
Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:20,963][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:15:21,102][1653645] Updated weights for policy 0, policy_version 678205 (0.0024) [2024-06-15 20:15:25,569][1653645] Updated weights for policy 0, policy_version 678271 (0.0014) [2024-06-15 20:15:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 1389101056. Throughput: 0: 11025.1. Samples: 347338240. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:15:27,363][1653645] Updated weights for policy 0, policy_version 678336 (0.0012) [2024-06-15 20:15:30,958][1648982] Fps is (10 sec: 29504.3, 60 sec: 43144.5, 300 sec: 43542.8). Total num frames: 1389232128. Throughput: 0: 10968.2. Samples: 347401728. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:15:32,632][1653645] Updated weights for policy 0, policy_version 678418 (0.0015) [2024-06-15 20:15:33,396][1653645] Updated weights for policy 0, policy_version 678460 (0.0015) [2024-06-15 20:15:35,957][1648982] Fps is (10 sec: 39322.5, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1389494272. Throughput: 0: 10854.6. Samples: 347432960. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:15:37,544][1653645] Updated weights for policy 0, policy_version 678512 (0.0012) [2024-06-15 20:15:40,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1389756416. Throughput: 0: 10797.5. Samples: 347495936. 
Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:40,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:15:42,410][1653645] Updated weights for policy 0, policy_version 678593 (0.0013) [2024-06-15 20:15:43,391][1651596] Signal inference workers to stop experience collection... (35300 times) [2024-06-15 20:15:43,456][1653645] InferenceWorker_p0-w0: stopping experience collection (35300 times) [2024-06-15 20:15:43,541][1651596] Signal inference workers to resume experience collection... (35300 times) [2024-06-15 20:15:43,549][1653645] InferenceWorker_p0-w0: resuming experience collection (35300 times) [2024-06-15 20:15:43,890][1653645] Updated weights for policy 0, policy_version 678672 (0.0072) [2024-06-15 20:15:44,829][1653645] Updated weights for policy 0, policy_version 678712 (0.0021) [2024-06-15 20:15:45,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 1390018560. Throughput: 0: 11002.3. Samples: 347568640. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:15:48,106][1653645] Updated weights for policy 0, policy_version 678752 (0.0012) [2024-06-15 20:15:50,507][1653645] Updated weights for policy 0, policy_version 678832 (0.0012) [2024-06-15 20:15:50,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 44237.0, 300 sec: 44097.9). Total num frames: 1390280704. Throughput: 0: 11025.1. Samples: 347601920. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:15:55,135][1653645] Updated weights for policy 0, policy_version 678899 (0.0013) [2024-06-15 20:15:55,958][1648982] Fps is (10 sec: 42596.0, 60 sec: 44782.5, 300 sec: 44097.9). Total num frames: 1390444544. Throughput: 0: 11206.9. Samples: 347671040. 
Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:15:55,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:15:56,517][1653645] Updated weights for policy 0, policy_version 678960 (0.0012) [2024-06-15 20:16:00,318][1653645] Updated weights for policy 0, policy_version 679013 (0.0013) [2024-06-15 20:16:00,957][1648982] Fps is (10 sec: 39323.0, 60 sec: 45875.4, 300 sec: 43986.9). Total num frames: 1390673920. Throughput: 0: 11161.6. Samples: 347734528. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:16:01,506][1653645] Updated weights for policy 0, policy_version 679062 (0.0020) [2024-06-15 20:16:05,777][1653645] Updated weights for policy 0, policy_version 679120 (0.0012) [2024-06-15 20:16:05,958][1648982] Fps is (10 sec: 39323.9, 60 sec: 44236.9, 300 sec: 43653.7). Total num frames: 1390837760. Throughput: 0: 10957.9. Samples: 347770880. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:05,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 20:16:07,399][1653645] Updated weights for policy 0, policy_version 679188 (0.0213) [2024-06-15 20:16:08,395][1653645] Updated weights for policy 0, policy_version 679229 (0.0013) [2024-06-15 20:16:10,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1391067136. Throughput: 0: 11104.7. Samples: 347837952. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:16:12,167][1653645] Updated weights for policy 0, policy_version 679284 (0.0017) [2024-06-15 20:16:12,905][1653645] Updated weights for policy 0, policy_version 679317 (0.0013) [2024-06-15 20:16:13,647][1653645] Updated weights for policy 0, policy_version 679360 (0.0026) [2024-06-15 20:16:15,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1391329280. 
Throughput: 0: 11207.1. Samples: 347906048. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:15,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:16:17,939][1653645] Updated weights for policy 0, policy_version 679420 (0.0017) [2024-06-15 20:16:20,347][1653645] Updated weights for policy 0, policy_version 679475 (0.0013) [2024-06-15 20:16:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44240.1, 300 sec: 43986.9). Total num frames: 1391591424. Throughput: 0: 11309.5. Samples: 347941888. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:16:22,177][1653645] Updated weights for policy 0, policy_version 679504 (0.0012) [2024-06-15 20:16:24,261][1653645] Updated weights for policy 0, policy_version 679559 (0.0013) [2024-06-15 20:16:25,577][1653645] Updated weights for policy 0, policy_version 679615 (0.0146) [2024-06-15 20:16:25,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 45875.3, 300 sec: 44320.2). Total num frames: 1391853568. Throughput: 0: 11332.4. Samples: 348005888. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:16:28,523][1651596] Signal inference workers to stop experience collection... (35350 times) [2024-06-15 20:16:28,565][1653645] InferenceWorker_p0-w0: stopping experience collection (35350 times) [2024-06-15 20:16:28,838][1651596] Signal inference workers to resume experience collection... (35350 times) [2024-06-15 20:16:28,839][1653645] InferenceWorker_p0-w0: resuming experience collection (35350 times) [2024-06-15 20:16:29,372][1653645] Updated weights for policy 0, policy_version 679670 (0.0013) [2024-06-15 20:16:30,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 45875.1, 300 sec: 43986.8). Total num frames: 1391984640. Throughput: 0: 11320.9. Samples: 348078080. 
Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:30,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:16:31,704][1653645] Updated weights for policy 0, policy_version 679712 (0.0013) [2024-06-15 20:16:33,279][1653645] Updated weights for policy 0, policy_version 679760 (0.0026) [2024-06-15 20:16:35,913][1653645] Updated weights for policy 0, policy_version 679810 (0.0146) [2024-06-15 20:16:35,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.0, 300 sec: 43986.9). Total num frames: 1392246784. Throughput: 0: 11332.3. Samples: 348111872. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:16:37,246][1653645] Updated weights for policy 0, policy_version 679872 (0.0014) [2024-06-15 20:16:40,960][1648982] Fps is (10 sec: 45875.8, 60 sec: 44783.1, 300 sec: 43986.8). Total num frames: 1392443392. Throughput: 0: 11298.3. Samples: 348179456. Policy #0 lag: (min: 111.0, avg: 193.2, max: 305.0) [2024-06-15 20:16:40,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:16:41,655][1653645] Updated weights for policy 0, policy_version 679934 (0.0018) [2024-06-15 20:16:44,294][1653645] Updated weights for policy 0, policy_version 679996 (0.0013) [2024-06-15 20:16:45,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.3, 300 sec: 44320.1). Total num frames: 1392771072. Throughput: 0: 11218.4. Samples: 348239360. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:16:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:16:47,957][1653645] Updated weights for policy 0, policy_version 680080 (0.0016) [2024-06-15 20:16:50,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 44431.3). Total num frames: 1392902144. Throughput: 0: 11093.3. Samples: 348270080. 
Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:16:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:16:53,214][1653645] Updated weights for policy 0, policy_version 680160 (0.0014) [2024-06-15 20:16:54,579][1653645] Updated weights for policy 0, policy_version 680208 (0.0031) [2024-06-15 20:16:55,518][1653645] Updated weights for policy 0, policy_version 680252 (0.0012) [2024-06-15 20:16:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.5, 300 sec: 44320.1). Total num frames: 1393164288. Throughput: 0: 11173.0. Samples: 348340736. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:16:55,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:16:55,962][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000680256_1393164288.pth... [2024-06-15 20:16:56,018][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000675056_1382514688.pth [2024-06-15 20:16:57,906][1653645] Updated weights for policy 0, policy_version 680310 (0.0023) [2024-06-15 20:17:00,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45328.8, 300 sec: 44320.2). Total num frames: 1393393664. Throughput: 0: 11161.6. Samples: 348408320. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:00,959][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 20:17:01,137][1653645] Updated weights for policy 0, policy_version 680376 (0.0109) [2024-06-15 20:17:05,457][1653645] Updated weights for policy 0, policy_version 680433 (0.0034) [2024-06-15 20:17:05,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 1393557504. Throughput: 0: 11184.4. Samples: 348445184. 
Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:05,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:17:07,020][1653645] Updated weights for policy 0, policy_version 680511 (0.0015) [2024-06-15 20:17:09,711][1653645] Updated weights for policy 0, policy_version 680575 (0.0018) [2024-06-15 20:17:10,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1393819648. Throughput: 0: 11229.8. Samples: 348511232. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:17:12,588][1653645] Updated weights for policy 0, policy_version 680638 (0.0017) [2024-06-15 20:17:15,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 1393950720. Throughput: 0: 11195.8. Samples: 348581888. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:17:16,917][1651596] Signal inference workers to stop experience collection... (35400 times) [2024-06-15 20:17:16,983][1653645] InferenceWorker_p0-w0: stopping experience collection (35400 times) [2024-06-15 20:17:17,164][1651596] Signal inference workers to resume experience collection... (35400 times) [2024-06-15 20:17:17,165][1653645] InferenceWorker_p0-w0: resuming experience collection (35400 times) [2024-06-15 20:17:17,385][1653645] Updated weights for policy 0, policy_version 680692 (0.0028) [2024-06-15 20:17:19,168][1653645] Updated weights for policy 0, policy_version 680765 (0.0014) [2024-06-15 20:17:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1394278400. Throughput: 0: 11127.5. Samples: 348612608. 
Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:17:21,285][1653645] Updated weights for policy 0, policy_version 680821 (0.0013) [2024-06-15 20:17:24,407][1653645] Updated weights for policy 0, policy_version 680887 (0.0022) [2024-06-15 20:17:25,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1394475008. Throughput: 0: 11093.3. Samples: 348678656. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:25,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:17:29,493][1653645] Updated weights for policy 0, policy_version 680982 (0.0014) [2024-06-15 20:17:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1394737152. Throughput: 0: 11229.9. Samples: 348744704. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:17:32,792][1653645] Updated weights for policy 0, policy_version 681061 (0.0169) [2024-06-15 20:17:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1394900992. Throughput: 0: 11264.0. Samples: 348776960. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:17:36,270][1653645] Updated weights for policy 0, policy_version 681123 (0.0013) [2024-06-15 20:17:40,973][1648982] Fps is (10 sec: 35990.7, 60 sec: 44225.8, 300 sec: 44206.8). Total num frames: 1395097600. Throughput: 0: 11237.5. Samples: 348846592. 
Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:40,973][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:17:41,030][1653645] Updated weights for policy 0, policy_version 681206 (0.0014) [2024-06-15 20:17:42,349][1653645] Updated weights for policy 0, policy_version 681265 (0.0015) [2024-06-15 20:17:45,296][1653645] Updated weights for policy 0, policy_version 681328 (0.0014) [2024-06-15 20:17:45,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1395392512. Throughput: 0: 11138.9. Samples: 348909568. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:17:48,043][1653645] Updated weights for policy 0, policy_version 681404 (0.0012) [2024-06-15 20:17:50,958][1648982] Fps is (10 sec: 42661.7, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1395523584. Throughput: 0: 11081.9. Samples: 348943872. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:17:52,465][1653645] Updated weights for policy 0, policy_version 681458 (0.0014) [2024-06-15 20:17:54,049][1653645] Updated weights for policy 0, policy_version 681528 (0.0013) [2024-06-15 20:17:55,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1395785728. Throughput: 0: 11059.2. Samples: 349008896. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:17:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:17:57,027][1653645] Updated weights for policy 0, policy_version 681584 (0.0012) [2024-06-15 20:17:59,085][1653645] Updated weights for policy 0, policy_version 681632 (0.0156) [2024-06-15 20:18:00,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1396047872. Throughput: 0: 11025.1. Samples: 349078016. 
Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:18:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:18:03,567][1651596] Signal inference workers to stop experience collection... (35450 times) [2024-06-15 20:18:03,598][1653645] InferenceWorker_p0-w0: stopping experience collection (35450 times) [2024-06-15 20:18:03,808][1651596] Signal inference workers to resume experience collection... (35450 times) [2024-06-15 20:18:03,809][1653645] InferenceWorker_p0-w0: resuming experience collection (35450 times) [2024-06-15 20:18:04,426][1653645] Updated weights for policy 0, policy_version 681697 (0.0014) [2024-06-15 20:18:05,693][1653645] Updated weights for policy 0, policy_version 681750 (0.0012) [2024-06-15 20:18:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44209.1). Total num frames: 1396244480. Throughput: 0: 11116.1. Samples: 349112832. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:18:05,958][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 20:18:08,840][1653645] Updated weights for policy 0, policy_version 681840 (0.0014) [2024-06-15 20:18:10,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1396441088. Throughput: 0: 10945.4. Samples: 349171200. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:18:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:18:12,039][1653645] Updated weights for policy 0, policy_version 681904 (0.0012) [2024-06-15 20:18:15,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1396572160. Throughput: 0: 11093.3. Samples: 349243904. 
Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:18:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:18:16,213][1653645] Updated weights for policy 0, policy_version 681941 (0.0011) [2024-06-15 20:18:18,162][1653645] Updated weights for policy 0, policy_version 682016 (0.0182) [2024-06-15 20:18:18,938][1653645] Updated weights for policy 0, policy_version 682046 (0.0014) [2024-06-15 20:18:20,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 1396899840. Throughput: 0: 10888.5. Samples: 349266944. Policy #0 lag: (min: 15.0, avg: 119.0, max: 271.0) [2024-06-15 20:18:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:18:21,319][1653645] Updated weights for policy 0, policy_version 682112 (0.0064) [2024-06-15 20:18:23,756][1653645] Updated weights for policy 0, policy_version 682170 (0.0018) [2024-06-15 20:18:25,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1397096448. Throughput: 0: 10994.6. Samples: 349341184. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:18:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:18:29,764][1653645] Updated weights for policy 0, policy_version 682257 (0.0114) [2024-06-15 20:18:30,958][1648982] Fps is (10 sec: 45876.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1397358592. Throughput: 0: 10945.4. Samples: 349402112. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:18:30,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:18:32,575][1653645] Updated weights for policy 0, policy_version 682326 (0.0110) [2024-06-15 20:18:33,487][1653645] Updated weights for policy 0, policy_version 682368 (0.0012) [2024-06-15 20:18:35,831][1653645] Updated weights for policy 0, policy_version 682426 (0.0013) [2024-06-15 20:18:35,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1397620736. 
Throughput: 0: 10854.5. Samples: 349432320. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:18:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:18:40,960][1648982] Fps is (10 sec: 39312.1, 60 sec: 44246.2, 300 sec: 43986.6). Total num frames: 1397751808. Throughput: 0: 11172.4. Samples: 349511680. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:18:40,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:18:41,309][1653645] Updated weights for policy 0, policy_version 682513 (0.0179) [2024-06-15 20:18:44,674][1653645] Updated weights for policy 0, policy_version 682592 (0.0016) [2024-06-15 20:18:45,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1398013952. Throughput: 0: 10820.2. Samples: 349564928. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:18:45,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:18:46,657][1651596] Signal inference workers to stop experience collection... (35500 times) [2024-06-15 20:18:46,698][1653645] InferenceWorker_p0-w0: stopping experience collection (35500 times) [2024-06-15 20:18:46,989][1651596] Signal inference workers to resume experience collection... (35500 times) [2024-06-15 20:18:46,990][1653645] InferenceWorker_p0-w0: resuming experience collection (35500 times) [2024-06-15 20:18:47,465][1653645] Updated weights for policy 0, policy_version 682681 (0.0013) [2024-06-15 20:18:50,958][1648982] Fps is (10 sec: 39330.7, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1398145024. Throughput: 0: 10843.0. Samples: 349600768. 
Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:18:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:18:52,191][1653645] Updated weights for policy 0, policy_version 682724 (0.0013) [2024-06-15 20:18:54,093][1653645] Updated weights for policy 0, policy_version 682809 (0.0014) [2024-06-15 20:18:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1398407168. Throughput: 0: 10945.4. Samples: 349663744. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:18:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:18:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000682816_1398407168.pth... [2024-06-15 20:18:56,101][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000677648_1387823104.pth [2024-06-15 20:18:57,462][1653645] Updated weights for policy 0, policy_version 682879 (0.0012) [2024-06-15 20:18:59,199][1653645] Updated weights for policy 0, policy_version 682935 (0.0016) [2024-06-15 20:19:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1398669312. Throughput: 0: 10854.4. Samples: 349732352. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:19:04,517][1653645] Updated weights for policy 0, policy_version 682979 (0.0012) [2024-06-15 20:19:05,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 44097.9). Total num frames: 1398833152. Throughput: 0: 11173.0. Samples: 349769728. 
Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:19:06,269][1653645] Updated weights for policy 0, policy_version 683056 (0.0013) [2024-06-15 20:19:08,749][1653645] Updated weights for policy 0, policy_version 683104 (0.0012) [2024-06-15 20:19:10,645][1653645] Updated weights for policy 0, policy_version 683158 (0.0012) [2024-06-15 20:19:10,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1399128064. Throughput: 0: 10831.6. Samples: 349828608. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:19:11,608][1653645] Updated weights for policy 0, policy_version 683198 (0.0011) [2024-06-15 20:19:15,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 44783.0, 300 sec: 44097.9). Total num frames: 1399259136. Throughput: 0: 11127.4. Samples: 349902848. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:15,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 20:19:18,097][1653645] Updated weights for policy 0, policy_version 683312 (0.0012) [2024-06-15 20:19:20,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1399521280. Throughput: 0: 10888.5. Samples: 349922304. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:20,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:19:21,369][1653645] Updated weights for policy 0, policy_version 683387 (0.0016) [2024-06-15 20:19:23,760][1653645] Updated weights for policy 0, policy_version 683450 (0.0018) [2024-06-15 20:19:25,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1399717888. Throughput: 0: 10684.3. Samples: 349992448. 
Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:25,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:19:28,431][1653645] Updated weights for policy 0, policy_version 683504 (0.0017) [2024-06-15 20:19:29,505][1653645] Updated weights for policy 0, policy_version 683552 (0.0035) [2024-06-15 20:19:30,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1399980032. Throughput: 0: 10888.5. Samples: 350054912. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:19:32,659][1653645] Updated weights for policy 0, policy_version 683600 (0.0017) [2024-06-15 20:19:33,220][1651596] Signal inference workers to stop experience collection... (35550 times) [2024-06-15 20:19:33,271][1653645] InferenceWorker_p0-w0: stopping experience collection (35550 times) [2024-06-15 20:19:33,611][1651596] Signal inference workers to resume experience collection... (35550 times) [2024-06-15 20:19:33,612][1653645] InferenceWorker_p0-w0: resuming experience collection (35550 times) [2024-06-15 20:19:33,976][1653645] Updated weights for policy 0, policy_version 683648 (0.0013) [2024-06-15 20:19:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 44098.0). Total num frames: 1400143872. Throughput: 0: 10808.9. Samples: 350087168. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:35,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 20:19:40,453][1653645] Updated weights for policy 0, policy_version 683744 (0.0015) [2024-06-15 20:19:40,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43146.2, 300 sec: 43875.8). Total num frames: 1400340480. Throughput: 0: 10820.3. Samples: 350150656. 
Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:19:42,089][1653645] Updated weights for policy 0, policy_version 683797 (0.0013) [2024-06-15 20:19:43,004][1653645] Updated weights for policy 0, policy_version 683840 (0.0017) [2024-06-15 20:19:45,979][1648982] Fps is (10 sec: 39239.7, 60 sec: 42037.7, 300 sec: 43761.7). Total num frames: 1400537088. Throughput: 0: 10747.0. Samples: 350216192. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:45,980][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:19:46,733][1653645] Updated weights for policy 0, policy_version 683899 (0.0017) [2024-06-15 20:19:48,433][1653645] Updated weights for policy 0, policy_version 683941 (0.0014) [2024-06-15 20:19:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1400766464. Throughput: 0: 10570.0. Samples: 350245376. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:19:52,712][1653645] Updated weights for policy 0, policy_version 684016 (0.0012) [2024-06-15 20:19:54,157][1653645] Updated weights for policy 0, policy_version 684080 (0.0013) [2024-06-15 20:19:55,958][1648982] Fps is (10 sec: 49254.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1401028608. Throughput: 0: 10752.0. Samples: 350312448. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:19:55,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:19:57,785][1653645] Updated weights for policy 0, policy_version 684115 (0.0014) [2024-06-15 20:19:59,394][1653645] Updated weights for policy 0, policy_version 684161 (0.0037) [2024-06-15 20:20:00,945][1653645] Updated weights for policy 0, policy_version 684224 (0.0097) [2024-06-15 20:20:00,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1401290752. 
Throughput: 0: 10581.3. Samples: 350379008. Policy #0 lag: (min: 9.0, avg: 137.4, max: 265.0) [2024-06-15 20:20:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:20:05,335][1653645] Updated weights for policy 0, policy_version 684308 (0.0100) [2024-06-15 20:20:05,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44783.1, 300 sec: 44320.1). Total num frames: 1401520128. Throughput: 0: 10991.0. Samples: 350416896. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:20:09,897][1653645] Updated weights for policy 0, policy_version 684368 (0.0013) [2024-06-15 20:20:10,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 42052.2, 300 sec: 43875.7). Total num frames: 1401651200. Throughput: 0: 10808.8. Samples: 350478848. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:10,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:20:12,804][1653645] Updated weights for policy 0, policy_version 684419 (0.0013) [2024-06-15 20:20:15,916][1653645] Updated weights for policy 0, policy_version 684485 (0.0013) [2024-06-15 20:20:15,958][1648982] Fps is (10 sec: 29491.3, 60 sec: 42598.4, 300 sec: 43654.3). Total num frames: 1401815040. Throughput: 0: 10808.9. Samples: 350541312. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:20:17,353][1653645] Updated weights for policy 0, policy_version 684547 (0.0015) [2024-06-15 20:20:18,500][1653645] Updated weights for policy 0, policy_version 684602 (0.0011) [2024-06-15 20:20:20,958][1648982] Fps is (10 sec: 42599.7, 60 sec: 42598.6, 300 sec: 43986.9). Total num frames: 1402077184. Throughput: 0: 10729.2. Samples: 350569984. 
Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:20:21,568][1651596] Signal inference workers to stop experience collection... (35600 times) [2024-06-15 20:20:21,613][1653645] InferenceWorker_p0-w0: stopping experience collection (35600 times) [2024-06-15 20:20:21,785][1651596] Signal inference workers to resume experience collection... (35600 times) [2024-06-15 20:20:21,786][1653645] InferenceWorker_p0-w0: resuming experience collection (35600 times) [2024-06-15 20:20:22,517][1653645] Updated weights for policy 0, policy_version 684672 (0.0014) [2024-06-15 20:20:25,958][1648982] Fps is (10 sec: 49150.8, 60 sec: 43144.4, 300 sec: 44320.1). Total num frames: 1402306560. Throughput: 0: 10934.0. Samples: 350642688. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:25,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:20:28,102][1653645] Updated weights for policy 0, policy_version 684756 (0.0117) [2024-06-15 20:20:29,492][1653645] Updated weights for policy 0, policy_version 684816 (0.0013) [2024-06-15 20:20:30,596][1653645] Updated weights for policy 0, policy_version 684863 (0.0012) [2024-06-15 20:20:30,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1402601472. Throughput: 0: 10677.3. Samples: 350696448. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:20:33,813][1653645] Updated weights for policy 0, policy_version 684912 (0.0013) [2024-06-15 20:20:35,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 1402732544. Throughput: 0: 11070.6. Samples: 350743552. 
Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:20:37,060][1653645] Updated weights for policy 0, policy_version 684960 (0.0015) [2024-06-15 20:20:39,621][1653645] Updated weights for policy 0, policy_version 685009 (0.0014) [2024-06-15 20:20:40,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 1402994688. Throughput: 0: 11127.5. Samples: 350813184. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:20:41,058][1653645] Updated weights for policy 0, policy_version 685059 (0.0011) [2024-06-15 20:20:42,273][1653645] Updated weights for policy 0, policy_version 685115 (0.0024) [2024-06-15 20:20:44,926][1653645] Updated weights for policy 0, policy_version 685168 (0.0013) [2024-06-15 20:20:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45344.8, 300 sec: 43986.9). Total num frames: 1403256832. Throughput: 0: 11093.3. Samples: 350878208. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:20:48,240][1653645] Updated weights for policy 0, policy_version 685200 (0.0012) [2024-06-15 20:20:49,337][1653645] Updated weights for policy 0, policy_version 685245 (0.0011) [2024-06-15 20:20:50,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.5, 300 sec: 43875.9). Total num frames: 1403387904. Throughput: 0: 11081.9. Samples: 350915584. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:50,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:20:52,286][1653645] Updated weights for policy 0, policy_version 685312 (0.0014) [2024-06-15 20:20:53,649][1653645] Updated weights for policy 0, policy_version 685367 (0.0012) [2024-06-15 20:20:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 1403650048. 
Throughput: 0: 11036.5. Samples: 350975488. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:20:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:20:56,580][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000685408_1403715584.pth... [2024-06-15 20:20:56,581][1653645] Updated weights for policy 0, policy_version 685408 (0.0016) [2024-06-15 20:20:56,719][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000680256_1393164288.pth [2024-06-15 20:21:00,305][1653645] Updated weights for policy 0, policy_version 685472 (0.0013) [2024-06-15 20:21:00,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43144.4, 300 sec: 44209.0). Total num frames: 1403879424. Throughput: 0: 11184.3. Samples: 351044608. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:21:03,243][1653645] Updated weights for policy 0, policy_version 685524 (0.0015) [2024-06-15 20:21:04,729][1653645] Updated weights for policy 0, policy_version 685586 (0.0010) [2024-06-15 20:21:05,169][1651596] Signal inference workers to stop experience collection... (35650 times) [2024-06-15 20:21:05,203][1653645] InferenceWorker_p0-w0: stopping experience collection (35650 times) [2024-06-15 20:21:05,306][1651596] Signal inference workers to resume experience collection... (35650 times) [2024-06-15 20:21:05,307][1653645] InferenceWorker_p0-w0: resuming experience collection (35650 times) [2024-06-15 20:21:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1404174336. Throughput: 0: 11298.1. Samples: 351078400. 
Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:21:07,696][1653645] Updated weights for policy 0, policy_version 685648 (0.0012) [2024-06-15 20:21:10,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 1404305408. Throughput: 0: 11104.7. Samples: 351142400. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:21:12,928][1653645] Updated weights for policy 0, policy_version 685754 (0.0134) [2024-06-15 20:21:15,847][1653645] Updated weights for policy 0, policy_version 685796 (0.0011) [2024-06-15 20:21:15,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 44782.8, 300 sec: 43764.7). Total num frames: 1404502016. Throughput: 0: 11355.0. Samples: 351207424. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:15,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:21:17,409][1653645] Updated weights for policy 0, policy_version 685859 (0.0011) [2024-06-15 20:21:19,925][1653645] Updated weights for policy 0, policy_version 685904 (0.0030) [2024-06-15 20:21:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1404829696. Throughput: 0: 10990.9. Samples: 351238144. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:20,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:21:24,317][1653645] Updated weights for policy 0, policy_version 685974 (0.0012) [2024-06-15 20:21:25,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 44237.0, 300 sec: 43986.9). Total num frames: 1404960768. Throughput: 0: 10990.9. Samples: 351307776. 
Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:25,958][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 20:21:26,057][1653645] Updated weights for policy 0, policy_version 686021 (0.0023) [2024-06-15 20:21:28,491][1653645] Updated weights for policy 0, policy_version 686098 (0.0012) [2024-06-15 20:21:29,290][1653645] Updated weights for policy 0, policy_version 686142 (0.0012) [2024-06-15 20:21:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1405222912. Throughput: 0: 11025.0. Samples: 351374336. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:21:31,966][1653645] Updated weights for policy 0, policy_version 686200 (0.0011) [2024-06-15 20:21:35,960][1648982] Fps is (10 sec: 45874.6, 60 sec: 44782.8, 300 sec: 43986.9). Total num frames: 1405419520. Throughput: 0: 10922.7. Samples: 351407104. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:35,961][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 20:21:36,155][1653645] Updated weights for policy 0, policy_version 686258 (0.0013) [2024-06-15 20:21:38,696][1653645] Updated weights for policy 0, policy_version 686304 (0.0011) [2024-06-15 20:21:40,364][1653645] Updated weights for policy 0, policy_version 686368 (0.0015) [2024-06-15 20:21:40,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 1405714432. Throughput: 0: 11195.7. Samples: 351479296. Policy #0 lag: (min: 15.0, avg: 113.7, max: 271.0) [2024-06-15 20:21:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:21:42,195][1653645] Updated weights for policy 0, policy_version 686406 (0.0012) [2024-06-15 20:21:45,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1405878272. Throughput: 0: 11173.0. Samples: 351547392. 
Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:21:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:21:47,152][1653645] Updated weights for policy 0, policy_version 686467 (0.0012) [2024-06-15 20:21:48,308][1653645] Updated weights for policy 0, policy_version 686528 (0.0012) [2024-06-15 20:21:50,937][1653645] Updated weights for policy 0, policy_version 686592 (0.0013) [2024-06-15 20:21:50,964][1648982] Fps is (10 sec: 42570.9, 60 sec: 45870.4, 300 sec: 43985.9). Total num frames: 1406140416. Throughput: 0: 11171.4. Samples: 351581184. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:21:50,965][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:21:52,427][1651596] Signal inference workers to stop experience collection... (35700 times) [2024-06-15 20:21:52,520][1653645] InferenceWorker_p0-w0: stopping experience collection (35700 times) [2024-06-15 20:21:52,673][1651596] Signal inference workers to resume experience collection... (35700 times) [2024-06-15 20:21:52,674][1653645] InferenceWorker_p0-w0: resuming experience collection (35700 times) [2024-06-15 20:21:52,947][1653645] Updated weights for policy 0, policy_version 686656 (0.0013) [2024-06-15 20:21:54,941][1653645] Updated weights for policy 0, policy_version 686716 (0.0088) [2024-06-15 20:21:55,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 45875.0, 300 sec: 44097.9). Total num frames: 1406402560. Throughput: 0: 11070.5. Samples: 351640576. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:21:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:22:00,425][1653645] Updated weights for policy 0, policy_version 686781 (0.0013) [2024-06-15 20:22:00,958][1648982] Fps is (10 sec: 39347.1, 60 sec: 44237.0, 300 sec: 43986.9). Total num frames: 1406533632. Throughput: 0: 11138.9. Samples: 351708672. 
Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:22:02,872][1653645] Updated weights for policy 0, policy_version 686848 (0.0014) [2024-06-15 20:22:05,026][1653645] Updated weights for policy 0, policy_version 686905 (0.0014) [2024-06-15 20:22:05,958][1648982] Fps is (10 sec: 42599.7, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1406828544. Throughput: 0: 11195.7. Samples: 351741952. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:22:07,131][1653645] Updated weights for policy 0, policy_version 686972 (0.0044) [2024-06-15 20:22:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1406926848. Throughput: 0: 11093.3. Samples: 351806976. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:22:11,929][1653645] Updated weights for policy 0, policy_version 687025 (0.0012) [2024-06-15 20:22:14,314][1653645] Updated weights for policy 0, policy_version 687074 (0.0011) [2024-06-15 20:22:15,942][1653645] Updated weights for policy 0, policy_version 687105 (0.0010) [2024-06-15 20:22:15,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 1407188992. Throughput: 0: 11070.6. Samples: 351872512. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:22:17,250][1653645] Updated weights for policy 0, policy_version 687166 (0.0013) [2024-06-15 20:22:19,083][1653645] Updated weights for policy 0, policy_version 687232 (0.0016) [2024-06-15 20:22:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1407451136. Throughput: 0: 11047.8. Samples: 351904256. 
Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:20,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:22:25,738][1653645] Updated weights for policy 0, policy_version 687312 (0.0014) [2024-06-15 20:22:25,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 1407614976. Throughput: 0: 11002.3. Samples: 351974400. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:22:26,659][1653645] Updated weights for policy 0, policy_version 687360 (0.0013) [2024-06-15 20:22:28,858][1653645] Updated weights for policy 0, policy_version 687418 (0.0036) [2024-06-15 20:22:30,358][1653645] Updated weights for policy 0, policy_version 687480 (0.0011) [2024-06-15 20:22:30,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.3, 300 sec: 44320.1). Total num frames: 1407975424. Throughput: 0: 10911.3. Samples: 352038400. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:22:35,298][1653645] Updated weights for policy 0, policy_version 687536 (0.0011) [2024-06-15 20:22:35,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 44782.9, 300 sec: 44100.2). Total num frames: 1408106496. Throughput: 0: 11072.1. Samples: 352079360. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:35,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 20:22:37,573][1653645] Updated weights for policy 0, policy_version 687587 (0.0011) [2024-06-15 20:22:39,511][1653645] Updated weights for policy 0, policy_version 687648 (0.0012) [2024-06-15 20:22:40,864][1651596] Signal inference workers to stop experience collection... (35750 times) [2024-06-15 20:22:40,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44236.7, 300 sec: 43986.8). Total num frames: 1408368640. Throughput: 0: 11229.9. Samples: 352145920. 
Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:40,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:22:40,965][1653645] InferenceWorker_p0-w0: stopping experience collection (35750 times) [2024-06-15 20:22:40,994][1653645] Updated weights for policy 0, policy_version 687686 (0.0036) [2024-06-15 20:22:41,152][1651596] Signal inference workers to resume experience collection... (35750 times) [2024-06-15 20:22:41,156][1653645] InferenceWorker_p0-w0: resuming experience collection (35750 times) [2024-06-15 20:22:45,627][1653645] Updated weights for policy 0, policy_version 687747 (0.0016) [2024-06-15 20:22:45,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1408532480. Throughput: 0: 11355.0. Samples: 352219648. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:22:46,834][1653645] Updated weights for policy 0, policy_version 687806 (0.0013) [2024-06-15 20:22:49,278][1653645] Updated weights for policy 0, policy_version 687870 (0.0115) [2024-06-15 20:22:50,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 44241.6, 300 sec: 44098.0). Total num frames: 1408794624. Throughput: 0: 11355.0. Samples: 352252928. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:22:51,886][1653645] Updated weights for policy 0, policy_version 687936 (0.0020) [2024-06-15 20:22:53,382][1653645] Updated weights for policy 0, policy_version 688000 (0.0141) [2024-06-15 20:22:55,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1409024000. Throughput: 0: 11275.4. Samples: 352314368. 
Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:22:55,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 20:22:55,988][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000688000_1409024000.pth... [2024-06-15 20:22:56,027][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000682816_1398407168.pth [2024-06-15 20:22:58,242][1653645] Updated weights for policy 0, policy_version 688063 (0.0014) [2024-06-15 20:23:00,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 44782.7, 300 sec: 43986.8). Total num frames: 1409220608. Throughput: 0: 11457.4. Samples: 352388096. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:23:00,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:23:01,409][1653645] Updated weights for policy 0, policy_version 688120 (0.0142) [2024-06-15 20:23:03,197][1653645] Updated weights for policy 0, policy_version 688185 (0.0012) [2024-06-15 20:23:04,940][1653645] Updated weights for policy 0, policy_version 688250 (0.0012) [2024-06-15 20:23:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1409548288. Throughput: 0: 11343.7. Samples: 352414720. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:23:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:23:09,798][1653645] Updated weights for policy 0, policy_version 688316 (0.0022) [2024-06-15 20:23:10,957][1648982] Fps is (10 sec: 45877.2, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1409679360. Throughput: 0: 11332.3. Samples: 352484352. 
Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:23:10,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:23:13,122][1653645] Updated weights for policy 0, policy_version 688378 (0.0113) [2024-06-15 20:23:15,058][1653645] Updated weights for policy 0, policy_version 688444 (0.0013) [2024-06-15 20:23:15,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 44209.1). Total num frames: 1409941504. Throughput: 0: 11320.9. Samples: 352547840. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:23:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:23:17,354][1653645] Updated weights for policy 0, policy_version 688506 (0.0012) [2024-06-15 20:23:20,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1410105344. Throughput: 0: 11150.3. Samples: 352581120. Policy #0 lag: (min: 45.0, avg: 181.8, max: 301.0) [2024-06-15 20:23:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:23:21,074][1653645] Updated weights for policy 0, policy_version 688544 (0.0013) [2024-06-15 20:23:25,276][1653645] Updated weights for policy 0, policy_version 688624 (0.0015) [2024-06-15 20:23:25,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 45328.9, 300 sec: 43986.8). Total num frames: 1410334720. Throughput: 0: 11173.0. Samples: 352648704. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:23:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:23:26,314][1653645] Updated weights for policy 0, policy_version 688644 (0.0054) [2024-06-15 20:23:28,672][1651596] Signal inference workers to stop experience collection... (35800 times) [2024-06-15 20:23:28,755][1653645] InferenceWorker_p0-w0: stopping experience collection (35800 times) [2024-06-15 20:23:28,949][1651596] Signal inference workers to resume experience collection... 
(35800 times) [2024-06-15 20:23:28,966][1653645] InferenceWorker_p0-w0: resuming experience collection (35800 times) [2024-06-15 20:23:28,968][1653645] Updated weights for policy 0, policy_version 688736 (0.0014) [2024-06-15 20:23:30,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1410596864. Throughput: 0: 10808.9. Samples: 352706048. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:23:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:23:34,158][1653645] Updated weights for policy 0, policy_version 688831 (0.0013) [2024-06-15 20:23:35,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 43690.8, 300 sec: 43987.2). Total num frames: 1410727936. Throughput: 0: 10854.4. Samples: 352741376. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:23:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:23:37,968][1653645] Updated weights for policy 0, policy_version 688889 (0.0019) [2024-06-15 20:23:40,282][1653645] Updated weights for policy 0, policy_version 688958 (0.0036) [2024-06-15 20:23:40,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1411022848. Throughput: 0: 10911.3. Samples: 352805376. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:23:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:23:41,845][1653645] Updated weights for policy 0, policy_version 689013 (0.0013) [2024-06-15 20:23:45,516][1653645] Updated weights for policy 0, policy_version 689056 (0.0012) [2024-06-15 20:23:45,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44782.8, 300 sec: 44320.1). Total num frames: 1411219456. Throughput: 0: 10740.7. Samples: 352871424. 
Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:23:45,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:23:49,447][1653645] Updated weights for policy 0, policy_version 689120 (0.0014) [2024-06-15 20:23:50,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 1411383296. Throughput: 0: 10911.3. Samples: 352905728. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:23:50,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:23:51,907][1653645] Updated weights for policy 0, policy_version 689186 (0.0014) [2024-06-15 20:23:53,443][1653645] Updated weights for policy 0, policy_version 689252 (0.0014) [2024-06-15 20:23:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1411645440. Throughput: 0: 10752.0. Samples: 352968192. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:23:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:23:57,836][1653645] Updated weights for policy 0, policy_version 689328 (0.0164) [2024-06-15 20:24:00,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1411842048. Throughput: 0: 10854.4. Samples: 353036288. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:24:01,509][1653645] Updated weights for policy 0, policy_version 689405 (0.0016) [2024-06-15 20:24:04,142][1653645] Updated weights for policy 0, policy_version 689456 (0.0012) [2024-06-15 20:24:05,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 1412136960. Throughput: 0: 10854.4. Samples: 353069568. 
Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:24:06,149][1653645] Updated weights for policy 0, policy_version 689533 (0.0013) [2024-06-15 20:24:10,386][1653645] Updated weights for policy 0, policy_version 689584 (0.0014) [2024-06-15 20:24:10,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1412300800. Throughput: 0: 10740.7. Samples: 353132032. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:24:13,363][1653645] Updated weights for policy 0, policy_version 689664 (0.0014) [2024-06-15 20:24:15,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 1412497408. Throughput: 0: 10990.9. Samples: 353200640. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:24:16,969][1651596] Signal inference workers to stop experience collection... (35850 times) [2024-06-15 20:24:17,041][1653645] InferenceWorker_p0-w0: stopping experience collection (35850 times) [2024-06-15 20:24:17,360][1651596] Signal inference workers to resume experience collection... (35850 times) [2024-06-15 20:24:17,360][1653645] InferenceWorker_p0-w0: resuming experience collection (35850 times) [2024-06-15 20:24:17,655][1653645] Updated weights for policy 0, policy_version 689760 (0.0106) [2024-06-15 20:24:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1412694016. Throughput: 0: 10740.6. Samples: 353224704. 
Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:24:21,535][1653645] Updated weights for policy 0, policy_version 689808 (0.0013) [2024-06-15 20:24:22,816][1653645] Updated weights for policy 0, policy_version 689856 (0.0020) [2024-06-15 20:24:25,295][1653645] Updated weights for policy 0, policy_version 689912 (0.0013) [2024-06-15 20:24:25,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 1412956160. Throughput: 0: 10877.1. Samples: 353294848. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:24:28,743][1653645] Updated weights for policy 0, policy_version 689968 (0.0023) [2024-06-15 20:24:30,153][1653645] Updated weights for policy 0, policy_version 690018 (0.0014) [2024-06-15 20:24:30,719][1653645] Updated weights for policy 0, policy_version 690048 (0.0010) [2024-06-15 20:24:30,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1413218304. Throughput: 0: 10854.4. Samples: 353359872. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:24:35,958][1648982] Fps is (10 sec: 39322.9, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1413349376. Throughput: 0: 10990.9. Samples: 353400320. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:35,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:24:36,320][1653645] Updated weights for policy 0, policy_version 690115 (0.0016) [2024-06-15 20:24:39,409][1653645] Updated weights for policy 0, policy_version 690197 (0.0014) [2024-06-15 20:24:40,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 44434.3). Total num frames: 1413644288. Throughput: 0: 10956.8. Samples: 353461248. 
Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:24:41,370][1653645] Updated weights for policy 0, policy_version 690272 (0.0011) [2024-06-15 20:24:45,868][1653645] Updated weights for policy 0, policy_version 690322 (0.0018) [2024-06-15 20:24:45,966][1648982] Fps is (10 sec: 42564.5, 60 sec: 42592.8, 300 sec: 44096.8). Total num frames: 1413775360. Throughput: 0: 11034.5. Samples: 353532928. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:45,969][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:24:48,084][1653645] Updated weights for policy 0, policy_version 690400 (0.0014) [2024-06-15 20:24:50,868][1653645] Updated weights for policy 0, policy_version 690464 (0.0013) [2024-06-15 20:24:50,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1414070272. Throughput: 0: 10956.8. Samples: 353562624. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:24:52,103][1653645] Updated weights for policy 0, policy_version 690497 (0.0019) [2024-06-15 20:24:53,334][1653645] Updated weights for policy 0, policy_version 690556 (0.0017) [2024-06-15 20:24:55,958][1648982] Fps is (10 sec: 49189.6, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1414266880. Throughput: 0: 11025.0. Samples: 353628160. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:24:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:24:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000690560_1414266880.pth... 
[2024-06-15 20:24:56,038][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000685408_1403715584.pth [2024-06-15 20:24:58,800][1653645] Updated weights for policy 0, policy_version 690622 (0.0014) [2024-06-15 20:25:00,716][1653645] Updated weights for policy 0, policy_version 690681 (0.0013) [2024-06-15 20:25:00,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44098.0). Total num frames: 1414529024. Throughput: 0: 10899.9. Samples: 353691136. Policy #0 lag: (min: 7.0, avg: 94.5, max: 263.0) [2024-06-15 20:25:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:25:03,678][1653645] Updated weights for policy 0, policy_version 690736 (0.0013) [2024-06-15 20:25:03,824][1651596] Signal inference workers to stop experience collection... (35900 times) [2024-06-15 20:25:03,891][1653645] InferenceWorker_p0-w0: stopping experience collection (35900 times) [2024-06-15 20:25:04,094][1651596] Signal inference workers to resume experience collection... (35900 times) [2024-06-15 20:25:04,106][1653645] InferenceWorker_p0-w0: resuming experience collection (35900 times) [2024-06-15 20:25:05,552][1653645] Updated weights for policy 0, policy_version 690813 (0.0014) [2024-06-15 20:25:05,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 1414791168. Throughput: 0: 11104.7. Samples: 353724416. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:05,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:25:10,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1414856704. Throughput: 0: 11093.4. Samples: 353794048. 
Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:25:11,246][1653645] Updated weights for policy 0, policy_version 690864 (0.0015) [2024-06-15 20:25:12,690][1653645] Updated weights for policy 0, policy_version 690914 (0.0015) [2024-06-15 20:25:15,372][1653645] Updated weights for policy 0, policy_version 690977 (0.0012) [2024-06-15 20:25:15,960][1648982] Fps is (10 sec: 39322.1, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1415184384. Throughput: 0: 10945.4. Samples: 353852416. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:15,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:25:16,864][1653645] Updated weights for policy 0, policy_version 691040 (0.0012) [2024-06-15 20:25:20,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1415315456. Throughput: 0: 10752.0. Samples: 353884160. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:25:22,541][1653645] Updated weights for policy 0, policy_version 691092 (0.0035) [2024-06-15 20:25:24,503][1653645] Updated weights for policy 0, policy_version 691153 (0.0013) [2024-06-15 20:25:25,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1415577600. Throughput: 0: 10843.0. Samples: 353949184. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:25:27,586][1653645] Updated weights for policy 0, policy_version 691220 (0.0017) [2024-06-15 20:25:29,485][1653645] Updated weights for policy 0, policy_version 691312 (0.0013) [2024-06-15 20:25:30,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1415839744. Throughput: 0: 10651.5. Samples: 354012160. 
Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:25:35,783][1653645] Updated weights for policy 0, policy_version 691385 (0.0022) [2024-06-15 20:25:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1415970816. Throughput: 0: 10808.9. Samples: 354049024. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:25:37,584][1653645] Updated weights for policy 0, policy_version 691447 (0.0053) [2024-06-15 20:25:40,959][1648982] Fps is (10 sec: 39318.6, 60 sec: 43143.9, 300 sec: 43986.8). Total num frames: 1416232960. Throughput: 0: 10854.3. Samples: 354116608. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:40,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:25:41,023][1653645] Updated weights for policy 0, policy_version 691536 (0.0015) [2024-06-15 20:25:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43150.3, 300 sec: 43986.9). Total num frames: 1416364032. Throughput: 0: 10877.1. Samples: 354180608. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:25:46,748][1653645] Updated weights for policy 0, policy_version 691616 (0.0126) [2024-06-15 20:25:49,605][1653645] Updated weights for policy 0, policy_version 691681 (0.0021) [2024-06-15 20:25:50,237][1653645] Updated weights for policy 0, policy_version 691709 (0.0010) [2024-06-15 20:25:50,958][1648982] Fps is (10 sec: 39324.8, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 1416626176. Throughput: 0: 10774.8. Samples: 354209280. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:25:51,606][1651596] Signal inference workers to stop experience collection... 
(35950 times) [2024-06-15 20:25:51,652][1653645] InferenceWorker_p0-w0: stopping experience collection (35950 times) [2024-06-15 20:25:51,790][1651596] Signal inference workers to resume experience collection... (35950 times) [2024-06-15 20:25:51,791][1653645] InferenceWorker_p0-w0: resuming experience collection (35950 times) [2024-06-15 20:25:52,838][1653645] Updated weights for policy 0, policy_version 691792 (0.0012) [2024-06-15 20:25:55,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.9, 300 sec: 44098.0). Total num frames: 1416888320. Throughput: 0: 10649.6. Samples: 354273280. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:25:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:25:58,122][1653645] Updated weights for policy 0, policy_version 691843 (0.0134) [2024-06-15 20:26:00,658][1653645] Updated weights for policy 0, policy_version 691905 (0.0013) [2024-06-15 20:26:00,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 43653.6). Total num frames: 1417052160. Throughput: 0: 10979.6. Samples: 354346496. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:26:02,758][1653645] Updated weights for policy 0, policy_version 691971 (0.0012) [2024-06-15 20:26:04,825][1653645] Updated weights for policy 0, policy_version 692065 (0.0015) [2024-06-15 20:26:05,958][1648982] Fps is (10 sec: 52427.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1417412608. Throughput: 0: 10956.7. Samples: 354377216. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:05,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:26:09,673][1653645] Updated weights for policy 0, policy_version 692104 (0.0016) [2024-06-15 20:26:10,946][1653645] Updated weights for policy 0, policy_version 692152 (0.0012) [2024-06-15 20:26:10,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.8, 300 sec: 44098.0). 
Total num frames: 1417510912. Throughput: 0: 11138.8. Samples: 354450432. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:26:12,836][1653645] Updated weights for policy 0, policy_version 692208 (0.0013) [2024-06-15 20:26:15,130][1653645] Updated weights for policy 0, policy_version 692285 (0.0016) [2024-06-15 20:26:15,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1417838592. Throughput: 0: 11161.6. Samples: 354514432. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:26:16,625][1653645] Updated weights for policy 0, policy_version 692352 (0.0015) [2024-06-15 20:26:20,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 1417969664. Throughput: 0: 11138.8. Samples: 354550272. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:26:21,857][1653645] Updated weights for policy 0, policy_version 692400 (0.0012) [2024-06-15 20:26:23,941][1653645] Updated weights for policy 0, policy_version 692448 (0.0011) [2024-06-15 20:26:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1418231808. Throughput: 0: 11184.6. Samples: 354619904. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:25,961][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 20:26:26,141][1653645] Updated weights for policy 0, policy_version 692514 (0.0013) [2024-06-15 20:26:28,146][1653645] Updated weights for policy 0, policy_version 692598 (0.0115) [2024-06-15 20:26:30,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1418461184. Throughput: 0: 11264.0. Samples: 354687488. 
Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:26:32,869][1653645] Updated weights for policy 0, policy_version 692640 (0.0014) [2024-06-15 20:26:34,890][1653645] Updated weights for policy 0, policy_version 692688 (0.0012) [2024-06-15 20:26:35,290][1651596] Signal inference workers to stop experience collection... (36000 times) [2024-06-15 20:26:35,369][1653645] InferenceWorker_p0-w0: stopping experience collection (36000 times) [2024-06-15 20:26:35,483][1651596] Signal inference workers to resume experience collection... (36000 times) [2024-06-15 20:26:35,484][1653645] InferenceWorker_p0-w0: resuming experience collection (36000 times) [2024-06-15 20:26:35,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45875.3, 300 sec: 44098.0). Total num frames: 1418723328. Throughput: 0: 11309.5. Samples: 354718208. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:35,960][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:26:37,747][1653645] Updated weights for policy 0, policy_version 692739 (0.0123) [2024-06-15 20:26:39,380][1653645] Updated weights for policy 0, policy_version 692799 (0.0012) [2024-06-15 20:26:40,917][1653645] Updated weights for policy 0, policy_version 692864 (0.0012) [2024-06-15 20:26:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.8, 300 sec: 44431.2). Total num frames: 1418985472. Throughput: 0: 11309.5. Samples: 354782208. Policy #0 lag: (min: 57.0, avg: 171.7, max: 313.0) [2024-06-15 20:26:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:26:45,448][1653645] Updated weights for policy 0, policy_version 692921 (0.0098) [2024-06-15 20:26:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45875.2, 300 sec: 43987.8). Total num frames: 1419116544. Throughput: 0: 11150.2. Samples: 354848256. 
Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:26:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:26:48,402][1653645] Updated weights for policy 0, policy_version 692992 (0.0014) [2024-06-15 20:26:50,792][1653645] Updated weights for policy 0, policy_version 693056 (0.0013) [2024-06-15 20:26:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1419378688. Throughput: 0: 11127.5. Samples: 354877952. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:26:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:26:53,124][1653645] Updated weights for policy 0, policy_version 693120 (0.0014) [2024-06-15 20:26:55,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1419509760. Throughput: 0: 11047.8. Samples: 354947584. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:26:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:26:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000693120_1419509760.pth... [2024-06-15 20:26:56,190][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000688000_1409024000.pth [2024-06-15 20:26:57,515][1653645] Updated weights for policy 0, policy_version 693184 (0.0014) [2024-06-15 20:26:59,561][1653645] Updated weights for policy 0, policy_version 693241 (0.0015) [2024-06-15 20:27:00,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 45329.1, 300 sec: 43875.8). Total num frames: 1419771904. Throughput: 0: 11229.9. Samples: 355019776. 
Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:27:01,676][1653645] Updated weights for policy 0, policy_version 693283 (0.0012) [2024-06-15 20:27:03,917][1653645] Updated weights for policy 0, policy_version 693344 (0.0017) [2024-06-15 20:27:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1420034048. Throughput: 0: 11161.6. Samples: 355052544. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:27:07,812][1653645] Updated weights for policy 0, policy_version 693392 (0.0013) [2024-06-15 20:27:09,020][1653645] Updated weights for policy 0, policy_version 693456 (0.0014) [2024-06-15 20:27:10,035][1653645] Updated weights for policy 0, policy_version 693504 (0.0014) [2024-06-15 20:27:10,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 46421.2, 300 sec: 44431.2). Total num frames: 1420296192. Throughput: 0: 11241.2. Samples: 355125760. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:27:13,499][1653645] Updated weights for policy 0, policy_version 693563 (0.0033) [2024-06-15 20:27:15,772][1653645] Updated weights for policy 0, policy_version 693624 (0.0018) [2024-06-15 20:27:15,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1420558336. Throughput: 0: 11241.2. Samples: 355193344. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:27:20,351][1651596] Signal inference workers to stop experience collection... (36050 times) [2024-06-15 20:27:20,401][1653645] InferenceWorker_p0-w0: stopping experience collection (36050 times) [2024-06-15 20:27:20,600][1651596] Signal inference workers to resume experience collection... 
(36050 times) [2024-06-15 20:27:20,600][1653645] InferenceWorker_p0-w0: resuming experience collection (36050 times) [2024-06-15 20:27:20,602][1653645] Updated weights for policy 0, policy_version 693728 (0.0106) [2024-06-15 20:27:20,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 46421.2, 300 sec: 44542.2). Total num frames: 1420754944. Throughput: 0: 11377.7. Samples: 355230208. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:27:24,292][1653645] Updated weights for policy 0, policy_version 693776 (0.0013) [2024-06-15 20:27:25,959][1648982] Fps is (10 sec: 39318.2, 60 sec: 45328.4, 300 sec: 43986.7). Total num frames: 1420951552. Throughput: 0: 11423.1. Samples: 355296256. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:25,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:27:26,763][1653645] Updated weights for policy 0, policy_version 693832 (0.0119) [2024-06-15 20:27:30,833][1653645] Updated weights for policy 0, policy_version 693904 (0.0077) [2024-06-15 20:27:30,958][1648982] Fps is (10 sec: 36045.3, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 1421115392. Throughput: 0: 11411.9. Samples: 355361792. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:30,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:27:32,491][1653645] Updated weights for policy 0, policy_version 693953 (0.0020) [2024-06-15 20:27:33,958][1653645] Updated weights for policy 0, policy_version 694015 (0.0012) [2024-06-15 20:27:35,958][1648982] Fps is (10 sec: 42602.9, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1421377536. Throughput: 0: 11423.3. Samples: 355392000. 
Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:27:38,932][1653645] Updated weights for policy 0, policy_version 694096 (0.0013) [2024-06-15 20:27:39,950][1653645] Updated weights for policy 0, policy_version 694139 (0.0012) [2024-06-15 20:27:40,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1421606912. Throughput: 0: 11377.8. Samples: 355459584. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:27:43,484][1653645] Updated weights for policy 0, policy_version 694200 (0.0014) [2024-06-15 20:27:45,258][1653645] Updated weights for policy 0, policy_version 694246 (0.0012) [2024-06-15 20:27:45,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 1421869056. Throughput: 0: 11298.1. Samples: 355528192. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:27:47,117][1653645] Updated weights for policy 0, policy_version 694288 (0.0015) [2024-06-15 20:27:49,994][1653645] Updated weights for policy 0, policy_version 694352 (0.0013) [2024-06-15 20:27:50,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 1422098432. Throughput: 0: 11275.4. Samples: 355559936. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:50,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:27:51,035][1653645] Updated weights for policy 0, policy_version 694400 (0.0012) [2024-06-15 20:27:55,676][1653645] Updated weights for policy 0, policy_version 694459 (0.0120) [2024-06-15 20:27:55,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 45875.2, 300 sec: 44209.1). Total num frames: 1422262272. Throughput: 0: 11264.0. Samples: 355632640. 
Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:27:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:27:57,017][1653645] Updated weights for policy 0, policy_version 694498 (0.0012) [2024-06-15 20:28:00,733][1653645] Updated weights for policy 0, policy_version 694576 (0.0016) [2024-06-15 20:28:00,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45329.0, 300 sec: 43875.8). Total num frames: 1422491648. Throughput: 0: 11036.5. Samples: 355689984. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:28:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:28:03,260][1653645] Updated weights for policy 0, policy_version 694647 (0.0013) [2024-06-15 20:28:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1422655488. Throughput: 0: 10911.3. Samples: 355721216. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:28:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:28:08,150][1653645] Updated weights for policy 0, policy_version 694709 (0.0012) [2024-06-15 20:28:08,898][1651596] Signal inference workers to stop experience collection... (36100 times) [2024-06-15 20:28:08,971][1653645] InferenceWorker_p0-w0: stopping experience collection (36100 times) [2024-06-15 20:28:09,047][1651596] Signal inference workers to resume experience collection... (36100 times) [2024-06-15 20:28:09,047][1653645] InferenceWorker_p0-w0: resuming experience collection (36100 times) [2024-06-15 20:28:09,468][1653645] Updated weights for policy 0, policy_version 694776 (0.0016) [2024-06-15 20:28:10,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1422917632. Throughput: 0: 10797.7. Samples: 355782144. 
Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:28:10,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:28:12,451][1653645] Updated weights for policy 0, policy_version 694818 (0.0012) [2024-06-15 20:28:15,290][1653645] Updated weights for policy 0, policy_version 694870 (0.0012) [2024-06-15 20:28:15,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43144.6, 300 sec: 44209.0). Total num frames: 1423147008. Throughput: 0: 10945.4. Samples: 355854336. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:28:15,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:28:18,854][1653645] Updated weights for policy 0, policy_version 694921 (0.0012) [2024-06-15 20:28:20,261][1653645] Updated weights for policy 0, policy_version 694992 (0.0014) [2024-06-15 20:28:20,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1423376384. Throughput: 0: 11070.5. Samples: 355890176. Policy #0 lag: (min: 15.0, avg: 108.8, max: 271.0) [2024-06-15 20:28:20,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:28:23,746][1653645] Updated weights for policy 0, policy_version 695072 (0.0015) [2024-06-15 20:28:25,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43691.3, 300 sec: 43986.9). Total num frames: 1423572992. Throughput: 0: 10877.2. Samples: 355949056. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:28:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:28:27,394][1653645] Updated weights for policy 0, policy_version 695136 (0.0015) [2024-06-15 20:28:30,660][1653645] Updated weights for policy 0, policy_version 695184 (0.0013) [2024-06-15 20:28:30,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1423769600. Throughput: 0: 10956.8. Samples: 356021248. 
Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:28:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:28:32,287][1653645] Updated weights for policy 0, policy_version 695248 (0.0013) [2024-06-15 20:28:33,706][1653645] Updated weights for policy 0, policy_version 695296 (0.0023) [2024-06-15 20:28:35,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1423998976. Throughput: 0: 10797.5. Samples: 356045824. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:28:35,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:28:39,129][1653645] Updated weights for policy 0, policy_version 695365 (0.0014) [2024-06-15 20:28:40,577][1653645] Updated weights for policy 0, policy_version 695424 (0.0140) [2024-06-15 20:28:40,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 1424228352. Throughput: 0: 10706.5. Samples: 356114432. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:28:40,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 20:28:43,990][1653645] Updated weights for policy 0, policy_version 695485 (0.0013) [2024-06-15 20:28:45,815][1653645] Updated weights for policy 0, policy_version 695550 (0.0128) [2024-06-15 20:28:45,958][1648982] Fps is (10 sec: 49153.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1424490496. Throughput: 0: 10774.8. Samples: 356174848. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:28:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:28:49,567][1653645] Updated weights for policy 0, policy_version 695616 (0.0013) [2024-06-15 20:28:50,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 43986.9). Total num frames: 1424621568. Throughput: 0: 10843.0. Samples: 356209152. 
Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:28:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:28:55,340][1653645] Updated weights for policy 0, policy_version 695696 (0.0013) [2024-06-15 20:28:55,958][1648982] Fps is (10 sec: 32766.7, 60 sec: 42598.2, 300 sec: 43986.8). Total num frames: 1424818176. Throughput: 0: 11082.0. Samples: 356280832. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:28:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:28:56,549][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000695744_1424883712.pth... [2024-06-15 20:28:56,550][1653645] Updated weights for policy 0, policy_version 695744 (0.0115) [2024-06-15 20:28:56,707][1651596] Signal inference workers to stop experience collection... (36150 times) [2024-06-15 20:28:56,744][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000690560_1414266880.pth [2024-06-15 20:28:56,771][1653645] InferenceWorker_p0-w0: stopping experience collection (36150 times) [2024-06-15 20:28:57,050][1651596] Signal inference workers to resume experience collection... (36150 times) [2024-06-15 20:28:57,052][1653645] InferenceWorker_p0-w0: resuming experience collection (36150 times) [2024-06-15 20:28:58,060][1653645] Updated weights for policy 0, policy_version 695804 (0.0014) [2024-06-15 20:29:00,947][1653645] Updated weights for policy 0, policy_version 695871 (0.0015) [2024-06-15 20:29:00,957][1648982] Fps is (10 sec: 52429.5, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1425145856. Throughput: 0: 10808.9. Samples: 356340736. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:29:03,971][1653645] Updated weights for policy 0, policy_version 695928 (0.0014) [2024-06-15 20:29:05,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 43690.6, 300 sec: 43986.9). 
Total num frames: 1425276928. Throughput: 0: 10843.0. Samples: 356378112. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:29:07,251][1653645] Updated weights for policy 0, policy_version 695971 (0.0011) [2024-06-15 20:29:09,060][1653645] Updated weights for policy 0, policy_version 696052 (0.0012) [2024-06-15 20:29:10,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 43690.9, 300 sec: 44209.0). Total num frames: 1425539072. Throughput: 0: 11025.1. Samples: 356445184. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:29:11,690][1653645] Updated weights for policy 0, policy_version 696096 (0.0012) [2024-06-15 20:29:14,714][1653645] Updated weights for policy 0, policy_version 696147 (0.0013) [2024-06-15 20:29:15,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1425801216. Throughput: 0: 10797.5. Samples: 356507136. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:29:18,895][1653645] Updated weights for policy 0, policy_version 696224 (0.0012) [2024-06-15 20:29:20,523][1653645] Updated weights for policy 0, policy_version 696278 (0.0012) [2024-06-15 20:29:20,986][1648982] Fps is (10 sec: 45745.0, 60 sec: 43670.1, 300 sec: 44204.8). Total num frames: 1425997824. Throughput: 0: 11165.9. Samples: 356548608. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:20,987][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:29:23,807][1653645] Updated weights for policy 0, policy_version 696346 (0.0013) [2024-06-15 20:29:25,959][1648982] Fps is (10 sec: 39317.8, 60 sec: 43690.0, 300 sec: 43986.7). Total num frames: 1426194432. Throughput: 0: 10968.0. Samples: 356608000. 
Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:29:26,501][1653645] Updated weights for policy 0, policy_version 696400 (0.0014) [2024-06-15 20:29:29,715][1653645] Updated weights for policy 0, policy_version 696449 (0.0010) [2024-06-15 20:29:30,958][1648982] Fps is (10 sec: 42717.8, 60 sec: 44236.4, 300 sec: 44320.0). Total num frames: 1426423808. Throughput: 0: 11195.6. Samples: 356678656. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:29:31,209][1653645] Updated weights for policy 0, policy_version 696510 (0.0011) [2024-06-15 20:29:33,680][1653645] Updated weights for policy 0, policy_version 696572 (0.0077) [2024-06-15 20:29:35,958][1648982] Fps is (10 sec: 42602.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1426620416. Throughput: 0: 10968.2. Samples: 356702720. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:29:36,672][1653645] Updated weights for policy 0, policy_version 696630 (0.0024) [2024-06-15 20:29:38,651][1653645] Updated weights for policy 0, policy_version 696672 (0.0011) [2024-06-15 20:29:40,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 43690.6, 300 sec: 44321.3). Total num frames: 1426849792. Throughput: 0: 10968.2. Samples: 356774400. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:29:42,081][1653645] Updated weights for policy 0, policy_version 696723 (0.0014) [2024-06-15 20:29:44,155][1651596] Signal inference workers to stop experience collection... 
(36200 times) [2024-06-15 20:29:44,192][1653645] Updated weights for policy 0, policy_version 696770 (0.0013) [2024-06-15 20:29:44,213][1653645] InferenceWorker_p0-w0: stopping experience collection (36200 times) [2024-06-15 20:29:44,356][1651596] Signal inference workers to resume experience collection... (36200 times) [2024-06-15 20:29:44,357][1653645] InferenceWorker_p0-w0: resuming experience collection (36200 times) [2024-06-15 20:29:45,339][1653645] Updated weights for policy 0, policy_version 696831 (0.0127) [2024-06-15 20:29:45,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1427111936. Throughput: 0: 11104.7. Samples: 356840448. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:29:47,839][1653645] Updated weights for policy 0, policy_version 696890 (0.0041) [2024-06-15 20:29:50,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 44782.9, 300 sec: 44209.1). Total num frames: 1427308544. Throughput: 0: 11059.2. Samples: 356875776. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:29:51,258][1653645] Updated weights for policy 0, policy_version 696947 (0.0012) [2024-06-15 20:29:53,786][1653645] Updated weights for policy 0, policy_version 696981 (0.0013) [2024-06-15 20:29:54,542][1653645] Updated weights for policy 0, policy_version 697024 (0.0013) [2024-06-15 20:29:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45875.4, 300 sec: 44209.0). Total num frames: 1427570688. Throughput: 0: 11081.9. Samples: 356943872. 
Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:29:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:29:58,483][1653645] Updated weights for policy 0, policy_version 697105 (0.0101) [2024-06-15 20:29:59,442][1653645] Updated weights for policy 0, policy_version 697152 (0.0055) [2024-06-15 20:30:00,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1427767296. Throughput: 0: 11275.4. Samples: 357014528. Policy #0 lag: (min: 63.0, avg: 194.0, max: 319.0) [2024-06-15 20:30:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:30:02,795][1653645] Updated weights for policy 0, policy_version 697211 (0.0013) [2024-06-15 20:30:05,958][1648982] Fps is (10 sec: 42597.5, 60 sec: 45329.0, 300 sec: 44542.2). Total num frames: 1427996672. Throughput: 0: 11111.7. Samples: 357048320. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:05,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:30:06,130][1653645] Updated weights for policy 0, policy_version 697276 (0.0111) [2024-06-15 20:30:07,847][1653645] Updated weights for policy 0, policy_version 697328 (0.0014) [2024-06-15 20:30:09,617][1653645] Updated weights for policy 0, policy_version 697360 (0.0033) [2024-06-15 20:30:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1428291584. Throughput: 0: 11366.6. Samples: 357119488. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:30:13,086][1653645] Updated weights for policy 0, policy_version 697409 (0.0014) [2024-06-15 20:30:15,970][1648982] Fps is (10 sec: 42547.9, 60 sec: 43681.8, 300 sec: 44429.3). Total num frames: 1428422656. Throughput: 0: 11215.6. Samples: 357183488. 
Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:15,971][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:30:16,836][1653645] Updated weights for policy 0, policy_version 697488 (0.0036) [2024-06-15 20:30:18,659][1653645] Updated weights for policy 0, policy_version 697540 (0.0013) [2024-06-15 20:30:19,955][1653645] Updated weights for policy 0, policy_version 697600 (0.0015) [2024-06-15 20:30:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44804.2, 300 sec: 44431.2). Total num frames: 1428684800. Throughput: 0: 11377.8. Samples: 357214720. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:30:22,417][1653645] Updated weights for policy 0, policy_version 697664 (0.0014) [2024-06-15 20:30:25,958][1648982] Fps is (10 sec: 45929.5, 60 sec: 44783.4, 300 sec: 44209.0). Total num frames: 1428881408. Throughput: 0: 11366.4. Samples: 357285888. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:25,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:30:28,551][1653645] Updated weights for policy 0, policy_version 697729 (0.0026) [2024-06-15 20:30:29,473][1653645] Updated weights for policy 0, policy_version 697783 (0.0016) [2024-06-15 20:30:30,799][1651596] Signal inference workers to stop experience collection... (36250 times) [2024-06-15 20:30:30,857][1653645] InferenceWorker_p0-w0: stopping experience collection (36250 times) [2024-06-15 20:30:30,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45329.5, 300 sec: 44653.3). Total num frames: 1429143552. Throughput: 0: 11389.2. Samples: 357352960. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:30:30,997][1651596] Signal inference workers to resume experience collection... 
(36250 times) [2024-06-15 20:30:30,998][1653645] InferenceWorker_p0-w0: resuming experience collection (36250 times) [2024-06-15 20:30:31,078][1653645] Updated weights for policy 0, policy_version 697843 (0.0011) [2024-06-15 20:30:32,958][1653645] Updated weights for policy 0, policy_version 697892 (0.0146) [2024-06-15 20:30:35,958][1648982] Fps is (10 sec: 45876.5, 60 sec: 45329.1, 300 sec: 44431.3). Total num frames: 1429340160. Throughput: 0: 11400.5. Samples: 357388800. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:30:36,363][1653645] Updated weights for policy 0, policy_version 697936 (0.0012) [2024-06-15 20:30:39,247][1653645] Updated weights for policy 0, policy_version 697990 (0.0014) [2024-06-15 20:30:40,356][1653645] Updated weights for policy 0, policy_version 698046 (0.0033) [2024-06-15 20:30:40,958][1648982] Fps is (10 sec: 45872.5, 60 sec: 45875.0, 300 sec: 44875.4). Total num frames: 1429602304. Throughput: 0: 11491.4. Samples: 357460992. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:30:42,147][1653645] Updated weights for policy 0, policy_version 698106 (0.0012) [2024-06-15 20:30:44,900][1653645] Updated weights for policy 0, policy_version 698164 (0.0013) [2024-06-15 20:30:45,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 45874.9, 300 sec: 44875.4). Total num frames: 1429864448. Throughput: 0: 11525.6. Samples: 357533184. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:45,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:30:47,770][1653645] Updated weights for policy 0, policy_version 698210 (0.0012) [2024-06-15 20:30:50,958][1648982] Fps is (10 sec: 49154.6, 60 sec: 46421.4, 300 sec: 44764.4). Total num frames: 1430093824. Throughput: 0: 11605.4. Samples: 357570560. 
Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:50,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 20:30:51,250][1653645] Updated weights for policy 0, policy_version 698302 (0.0017) [2024-06-15 20:30:53,536][1653645] Updated weights for policy 0, policy_version 698358 (0.0124) [2024-06-15 20:30:55,925][1653645] Updated weights for policy 0, policy_version 698416 (0.0012) [2024-06-15 20:30:55,959][1648982] Fps is (10 sec: 49153.4, 60 sec: 46421.3, 300 sec: 45097.6). Total num frames: 1430355968. Throughput: 0: 11502.9. Samples: 357637120. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:30:55,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:30:56,210][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000698432_1430388736.pth... [2024-06-15 20:30:56,299][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000693120_1419509760.pth [2024-06-15 20:30:56,304][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000698432_1430388736.pth [2024-06-15 20:30:58,855][1653645] Updated weights for policy 0, policy_version 698480 (0.0013) [2024-06-15 20:31:00,531][1653645] Updated weights for policy 0, policy_version 698497 (0.0013) [2024-06-15 20:31:00,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 46421.4, 300 sec: 44542.3). Total num frames: 1430552576. Throughput: 0: 11779.2. Samples: 357713408. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:31:01,561][1653645] Updated weights for policy 0, policy_version 698551 (0.0012) [2024-06-15 20:31:04,153][1653645] Updated weights for policy 0, policy_version 698608 (0.0093) [2024-06-15 20:31:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 46967.7, 300 sec: 45097.6). Total num frames: 1430814720. Throughput: 0: 11889.8. Samples: 357749760. 
Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:31:06,471][1653645] Updated weights for policy 0, policy_version 698677 (0.0012) [2024-06-15 20:31:10,392][1653645] Updated weights for policy 0, policy_version 698745 (0.0016) [2024-06-15 20:31:10,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1431044096. Throughput: 0: 11889.8. Samples: 357820928. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:31:12,664][1653645] Updated weights for policy 0, policy_version 698800 (0.0013) [2024-06-15 20:31:15,463][1653645] Updated weights for policy 0, policy_version 698849 (0.0012) [2024-06-15 20:31:15,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 47523.3, 300 sec: 45097.7). Total num frames: 1431273472. Throughput: 0: 11855.6. Samples: 357886464. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:31:17,217][1653645] Updated weights for policy 0, policy_version 698896 (0.0013) [2024-06-15 20:31:17,241][1651596] Signal inference workers to stop experience collection... (36300 times) [2024-06-15 20:31:17,309][1653645] InferenceWorker_p0-w0: stopping experience collection (36300 times) [2024-06-15 20:31:17,409][1651596] Signal inference workers to resume experience collection... (36300 times) [2024-06-15 20:31:17,409][1653645] InferenceWorker_p0-w0: resuming experience collection (36300 times) [2024-06-15 20:31:18,312][1653645] Updated weights for policy 0, policy_version 698944 (0.0013) [2024-06-15 20:31:20,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1431437312. Throughput: 0: 11776.0. Samples: 357918720. 
Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:31:22,991][1653645] Updated weights for policy 0, policy_version 699024 (0.0016) [2024-06-15 20:31:25,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 46967.7, 300 sec: 44875.5). Total num frames: 1431699456. Throughput: 0: 11651.0. Samples: 357985280. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:25,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 20:31:26,973][1653645] Updated weights for policy 0, policy_version 699088 (0.0013) [2024-06-15 20:31:28,901][1653645] Updated weights for policy 0, policy_version 699152 (0.0013) [2024-06-15 20:31:30,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 46967.3, 300 sec: 44875.5). Total num frames: 1431961600. Throughput: 0: 11434.7. Samples: 358047744. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:31:33,249][1653645] Updated weights for policy 0, policy_version 699216 (0.0095) [2024-06-15 20:31:34,034][1653645] Updated weights for policy 0, policy_version 699264 (0.0014) [2024-06-15 20:31:35,930][1653645] Updated weights for policy 0, policy_version 699323 (0.0012) [2024-06-15 20:31:35,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 47513.6, 300 sec: 44764.4). Total num frames: 1432190976. Throughput: 0: 11514.3. Samples: 358088704. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:31:39,234][1653645] Updated weights for policy 0, policy_version 699362 (0.0012) [2024-06-15 20:31:40,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 46967.6, 300 sec: 45097.6). Total num frames: 1432420352. Throughput: 0: 11582.5. Samples: 358158336. 
Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:40,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:31:41,286][1653645] Updated weights for policy 0, policy_version 699440 (0.0011) [2024-06-15 20:31:44,832][1653645] Updated weights for policy 0, policy_version 699510 (0.0024) [2024-06-15 20:31:45,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45875.5, 300 sec: 44875.5). Total num frames: 1432616960. Throughput: 0: 11411.9. Samples: 358226944. Policy #0 lag: (min: 4.0, avg: 103.2, max: 260.0) [2024-06-15 20:31:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:31:46,546][1653645] Updated weights for policy 0, policy_version 699542 (0.0053) [2024-06-15 20:31:50,240][1653645] Updated weights for policy 0, policy_version 699586 (0.0012) [2024-06-15 20:31:50,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 45329.1, 300 sec: 45097.7). Total num frames: 1432813568. Throughput: 0: 11389.2. Samples: 358262272. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:31:50,967][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:31:52,346][1653645] Updated weights for policy 0, policy_version 699680 (0.0077) [2024-06-15 20:31:53,010][1653645] Updated weights for policy 0, policy_version 699712 (0.0013) [2024-06-15 20:31:55,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 45097.6). Total num frames: 1433075712. Throughput: 0: 11389.2. Samples: 358333440. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:31:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:31:56,285][1653645] Updated weights for policy 0, policy_version 699771 (0.0015) [2024-06-15 20:31:58,259][1653645] Updated weights for policy 0, policy_version 699840 (0.0012) [2024-06-15 20:32:00,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45875.2, 300 sec: 44986.6). Total num frames: 1433305088. Throughput: 0: 11491.6. Samples: 358403584. 
Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:32:02,014][1653645] Updated weights for policy 0, policy_version 699897 (0.0012) [2024-06-15 20:32:03,039][1651596] Signal inference workers to stop experience collection... (36350 times) [2024-06-15 20:32:03,092][1653645] InferenceWorker_p0-w0: stopping experience collection (36350 times) [2024-06-15 20:32:03,169][1651596] Signal inference workers to resume experience collection... (36350 times) [2024-06-15 20:32:03,170][1653645] InferenceWorker_p0-w0: resuming experience collection (36350 times) [2024-06-15 20:32:03,388][1653645] Updated weights for policy 0, policy_version 699944 (0.0023) [2024-06-15 20:32:05,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 1433534464. Throughput: 0: 11423.3. Samples: 358432768. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:32:07,570][1653645] Updated weights for policy 0, policy_version 700016 (0.0012) [2024-06-15 20:32:08,635][1653645] Updated weights for policy 0, policy_version 700048 (0.0011) [2024-06-15 20:32:10,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1433796608. Throughput: 0: 11491.6. Samples: 358502400. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:32:12,403][1653645] Updated weights for policy 0, policy_version 700098 (0.0020) [2024-06-15 20:32:14,174][1653645] Updated weights for policy 0, policy_version 700176 (0.0018) [2024-06-15 20:32:15,004][1653645] Updated weights for policy 0, policy_version 700218 (0.0028) [2024-06-15 20:32:15,963][1648982] Fps is (10 sec: 52403.7, 60 sec: 46417.5, 300 sec: 45096.9). Total num frames: 1434058752. Throughput: 0: 11638.2. Samples: 358571520. 
Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:15,964][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:32:19,221][1653645] Updated weights for policy 0, policy_version 700284 (0.0013) [2024-06-15 20:32:20,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 47513.6, 300 sec: 45208.9). Total num frames: 1434288128. Throughput: 0: 11605.3. Samples: 358610944. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:32:21,161][1653645] Updated weights for policy 0, policy_version 700345 (0.0012) [2024-06-15 20:32:24,279][1653645] Updated weights for policy 0, policy_version 700401 (0.0014) [2024-06-15 20:32:25,960][1648982] Fps is (10 sec: 45887.2, 60 sec: 46965.7, 300 sec: 45430.6). Total num frames: 1434517504. Throughput: 0: 11570.7. Samples: 358679040. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:25,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:32:26,283][1653645] Updated weights for policy 0, policy_version 700474 (0.0099) [2024-06-15 20:32:30,222][1653645] Updated weights for policy 0, policy_version 700532 (0.0030) [2024-06-15 20:32:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.3, 300 sec: 45208.7). Total num frames: 1434714112. Throughput: 0: 11594.0. Samples: 358748672. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:32:32,316][1653645] Updated weights for policy 0, policy_version 700592 (0.0013) [2024-06-15 20:32:35,495][1653645] Updated weights for policy 0, policy_version 700643 (0.0014) [2024-06-15 20:32:35,958][1648982] Fps is (10 sec: 45885.7, 60 sec: 46421.3, 300 sec: 45319.8). Total num frames: 1434976256. Throughput: 0: 11559.8. Samples: 358782464. 
Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:35,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:32:38,036][1653645] Updated weights for policy 0, policy_version 700720 (0.0016) [2024-06-15 20:32:40,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44783.2, 300 sec: 44875.5). Total num frames: 1435107328. Throughput: 0: 11411.9. Samples: 358846976. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:32:42,580][1653645] Updated weights for policy 0, policy_version 700793 (0.0015) [2024-06-15 20:32:44,229][1653645] Updated weights for policy 0, policy_version 700854 (0.0012) [2024-06-15 20:32:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45875.1, 300 sec: 44986.6). Total num frames: 1435369472. Throughput: 0: 11218.5. Samples: 358908416. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:32:47,751][1653645] Updated weights for policy 0, policy_version 700899 (0.0014) [2024-06-15 20:32:50,204][1653645] Updated weights for policy 0, policy_version 700951 (0.0015) [2024-06-15 20:32:50,425][1651596] Signal inference workers to stop experience collection... (36400 times) [2024-06-15 20:32:50,468][1653645] InferenceWorker_p0-w0: stopping experience collection (36400 times) [2024-06-15 20:32:50,742][1651596] Signal inference workers to resume experience collection... (36400 times) [2024-06-15 20:32:50,742][1653645] InferenceWorker_p0-w0: resuming experience collection (36400 times) [2024-06-15 20:32:50,958][1648982] Fps is (10 sec: 49150.4, 60 sec: 46421.1, 300 sec: 45208.7). Total num frames: 1435598848. Throughput: 0: 11275.3. Samples: 358940160. 
Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:50,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:32:53,479][1653645] Updated weights for policy 0, policy_version 701011 (0.0010) [2024-06-15 20:32:55,154][1653645] Updated weights for policy 0, policy_version 701072 (0.0012) [2024-06-15 20:32:55,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 46421.2, 300 sec: 45319.8). Total num frames: 1435860992. Throughput: 0: 11252.6. Samples: 359008768. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:32:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:32:56,312][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000701120_1435893760.pth... [2024-06-15 20:32:56,367][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000695744_1424883712.pth [2024-06-15 20:32:56,420][1653645] Updated weights for policy 0, policy_version 701120 (0.0010) [2024-06-15 20:32:59,253][1653645] Updated weights for policy 0, policy_version 701155 (0.0013) [2024-06-15 20:33:00,970][1648982] Fps is (10 sec: 42547.2, 60 sec: 45319.8, 300 sec: 45317.9). Total num frames: 1436024832. Throughput: 0: 11182.5. Samples: 359074816. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:33:00,971][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:33:03,314][1653645] Updated weights for policy 0, policy_version 701232 (0.0015) [2024-06-15 20:33:05,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1436188672. Throughput: 0: 10956.8. Samples: 359104000. 
Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:33:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:33:06,623][1653645] Updated weights for policy 0, policy_version 701296 (0.0012) [2024-06-15 20:33:07,995][1653645] Updated weights for policy 0, policy_version 701344 (0.0109) [2024-06-15 20:33:10,908][1653645] Updated weights for policy 0, policy_version 701377 (0.0013) [2024-06-15 20:33:10,958][1648982] Fps is (10 sec: 39370.2, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1436418048. Throughput: 0: 10741.2. Samples: 359162368. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:33:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:33:14,767][1653645] Updated weights for policy 0, policy_version 701456 (0.0011) [2024-06-15 20:33:15,958][1648982] Fps is (10 sec: 49150.6, 60 sec: 43694.0, 300 sec: 45097.6). Total num frames: 1436680192. Throughput: 0: 10877.1. Samples: 359238144. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:33:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:33:17,505][1653645] Updated weights for policy 0, policy_version 701505 (0.0015) [2024-06-15 20:33:18,908][1653645] Updated weights for policy 0, policy_version 701568 (0.0023) [2024-06-15 20:33:20,387][1653645] Updated weights for policy 0, policy_version 701629 (0.0014) [2024-06-15 20:33:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 44236.9, 300 sec: 45319.8). Total num frames: 1436942336. Throughput: 0: 10911.3. Samples: 359273472. Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:33:20,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 20:33:23,401][1653645] Updated weights for policy 0, policy_version 701694 (0.0014) [2024-06-15 20:33:25,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 43146.1, 300 sec: 45208.7). Total num frames: 1437106176. Throughput: 0: 10979.5. Samples: 359341056. 
Policy #0 lag: (min: 15.0, avg: 139.5, max: 271.0) [2024-06-15 20:33:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:33:26,635][1653645] Updated weights for policy 0, policy_version 701744 (0.0012) [2024-06-15 20:33:29,601][1653645] Updated weights for policy 0, policy_version 701794 (0.0013) [2024-06-15 20:33:30,912][1653645] Updated weights for policy 0, policy_version 701856 (0.0013) [2024-06-15 20:33:30,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 44782.9, 300 sec: 45430.9). Total num frames: 1437401088. Throughput: 0: 11127.5. Samples: 359409152. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:33:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:33:34,894][1653645] Updated weights for policy 0, policy_version 701920 (0.0013) [2024-06-15 20:33:35,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1437597696. Throughput: 0: 11207.2. Samples: 359444480. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:33:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:33:37,058][1653645] Updated weights for policy 0, policy_version 701955 (0.0012) [2024-06-15 20:33:37,371][1651596] Signal inference workers to stop experience collection... (36450 times) [2024-06-15 20:33:37,432][1653645] InferenceWorker_p0-w0: stopping experience collection (36450 times) [2024-06-15 20:33:37,640][1651596] Signal inference workers to resume experience collection... (36450 times) [2024-06-15 20:33:37,641][1653645] InferenceWorker_p0-w0: resuming experience collection (36450 times) [2024-06-15 20:33:38,505][1653645] Updated weights for policy 0, policy_version 702016 (0.0010) [2024-06-15 20:33:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 1437761536. Throughput: 0: 11195.8. Samples: 359512576. 
Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:33:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:33:42,467][1653645] Updated weights for policy 0, policy_version 702098 (0.0014) [2024-06-15 20:33:43,401][1653645] Updated weights for policy 0, policy_version 702144 (0.0013) [2024-06-15 20:33:45,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.6, 300 sec: 45319.8). Total num frames: 1437990912. Throughput: 0: 11119.1. Samples: 359575040. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:33:45,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:33:47,862][1653645] Updated weights for policy 0, policy_version 702205 (0.0017) [2024-06-15 20:33:50,349][1653645] Updated weights for policy 0, policy_version 702272 (0.0081) [2024-06-15 20:33:50,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44237.0, 300 sec: 45542.0). Total num frames: 1438253056. Throughput: 0: 11252.6. Samples: 359610368. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:33:50,958][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 20:33:54,841][1653645] Updated weights for policy 0, policy_version 702355 (0.0170) [2024-06-15 20:33:55,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 44237.0, 300 sec: 45319.8). Total num frames: 1438515200. Throughput: 0: 11332.3. Samples: 359672320. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:33:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:33:58,642][1653645] Updated weights for policy 0, policy_version 702433 (0.0012) [2024-06-15 20:34:00,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43699.4, 300 sec: 45319.8). Total num frames: 1438646272. Throughput: 0: 11298.1. Samples: 359746560. 
Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:34:01,188][1653645] Updated weights for policy 0, policy_version 702471 (0.0011) [2024-06-15 20:34:02,181][1653645] Updated weights for policy 0, policy_version 702518 (0.0014) [2024-06-15 20:34:04,257][1653645] Updated weights for policy 0, policy_version 702587 (0.0013) [2024-06-15 20:34:05,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 46421.3, 300 sec: 45542.0). Total num frames: 1438973952. Throughput: 0: 11252.6. Samples: 359779840. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:34:06,400][1653645] Updated weights for policy 0, policy_version 702649 (0.0014) [2024-06-15 20:34:09,694][1653645] Updated weights for policy 0, policy_version 702718 (0.0013) [2024-06-15 20:34:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.0, 300 sec: 45319.8). Total num frames: 1439170560. Throughput: 0: 11286.7. Samples: 359848960. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:10,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:34:13,740][1653645] Updated weights for policy 0, policy_version 702784 (0.0017) [2024-06-15 20:34:15,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 45329.3, 300 sec: 45435.3). Total num frames: 1439399936. Throughput: 0: 11218.5. Samples: 359913984. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:34:17,739][1653645] Updated weights for policy 0, policy_version 702864 (0.0012) [2024-06-15 20:34:20,914][1653645] Updated weights for policy 0, policy_version 702928 (0.0014) [2024-06-15 20:34:20,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 44236.7, 300 sec: 45431.0). Total num frames: 1439596544. Throughput: 0: 11104.7. Samples: 359944192. 
Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:34:23,812][1653645] Updated weights for policy 0, policy_version 702978 (0.0013) [2024-06-15 20:34:24,619][1651596] Signal inference workers to stop experience collection... (36500 times) [2024-06-15 20:34:24,681][1653645] InferenceWorker_p0-w0: stopping experience collection (36500 times) [2024-06-15 20:34:25,035][1651596] Signal inference workers to resume experience collection... (36500 times) [2024-06-15 20:34:25,056][1653645] InferenceWorker_p0-w0: resuming experience collection (36500 times) [2024-06-15 20:34:25,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.1, 300 sec: 45431.0). Total num frames: 1439825920. Throughput: 0: 11309.5. Samples: 360021504. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:34:26,919][1653645] Updated weights for policy 0, policy_version 703056 (0.0013) [2024-06-15 20:34:29,074][1653645] Updated weights for policy 0, policy_version 703105 (0.0013) [2024-06-15 20:34:30,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 44783.0, 300 sec: 45653.0). Total num frames: 1440088064. Throughput: 0: 11059.2. Samples: 360072704. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:34:33,098][1653645] Updated weights for policy 0, policy_version 703189 (0.0017) [2024-06-15 20:34:35,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.5, 300 sec: 45319.8). Total num frames: 1440219136. Throughput: 0: 11138.8. Samples: 360111616. 
Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:35,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 20:34:36,590][1653645] Updated weights for policy 0, policy_version 703249 (0.0012) [2024-06-15 20:34:38,430][1653645] Updated weights for policy 0, policy_version 703312 (0.0019) [2024-06-15 20:34:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1440481280. Throughput: 0: 11207.1. Samples: 360176640. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:40,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:34:41,196][1653645] Updated weights for policy 0, policy_version 703376 (0.0014) [2024-06-15 20:34:45,064][1653645] Updated weights for policy 0, policy_version 703443 (0.0050) [2024-06-15 20:34:45,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 45329.0, 300 sec: 45430.9). Total num frames: 1440710656. Throughput: 0: 11036.4. Samples: 360243200. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:45,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:34:49,028][1653645] Updated weights for policy 0, policy_version 703489 (0.0012) [2024-06-15 20:34:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1440907264. Throughput: 0: 11047.8. Samples: 360276992. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:34:51,074][1653645] Updated weights for policy 0, policy_version 703572 (0.0015) [2024-06-15 20:34:53,460][1653645] Updated weights for policy 0, policy_version 703624 (0.0045) [2024-06-15 20:34:54,476][1653645] Updated weights for policy 0, policy_version 703673 (0.0014) [2024-06-15 20:34:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.4, 300 sec: 45319.8). Total num frames: 1441136640. Throughput: 0: 10877.1. Samples: 360338432. 
Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:34:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:34:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000703680_1441136640.pth... [2024-06-15 20:34:56,008][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000698432_1430388736.pth [2024-06-15 20:34:57,327][1653645] Updated weights for policy 0, policy_version 703713 (0.0011) [2024-06-15 20:35:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.9, 300 sec: 44986.6). Total num frames: 1441267712. Throughput: 0: 11127.5. Samples: 360414720. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:35:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:35:01,844][1653645] Updated weights for policy 0, policy_version 703792 (0.0014) [2024-06-15 20:35:03,518][1653645] Updated weights for policy 0, policy_version 703866 (0.0027) [2024-06-15 20:35:05,651][1653645] Updated weights for policy 0, policy_version 703920 (0.0028) [2024-06-15 20:35:05,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1441660928. Throughput: 0: 11036.5. Samples: 360440832. Policy #0 lag: (min: 33.0, avg: 130.8, max: 289.0) [2024-06-15 20:35:05,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:35:08,563][1653645] Updated weights for policy 0, policy_version 703968 (0.0023) [2024-06-15 20:35:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.8, 300 sec: 45321.7). Total num frames: 1441792000. Throughput: 0: 10899.9. Samples: 360512000. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:35:12,351][1653645] Updated weights for policy 0, policy_version 704016 (0.0085) [2024-06-15 20:35:12,918][1651596] Signal inference workers to stop experience collection... 
(36550 times) [2024-06-15 20:35:12,938][1653645] InferenceWorker_p0-w0: stopping experience collection (36550 times) [2024-06-15 20:35:13,074][1651596] Signal inference workers to resume experience collection... (36550 times) [2024-06-15 20:35:13,077][1653645] InferenceWorker_p0-w0: resuming experience collection (36550 times) [2024-06-15 20:35:14,076][1653645] Updated weights for policy 0, policy_version 704096 (0.0012) [2024-06-15 20:35:15,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.9, 300 sec: 45319.8). Total num frames: 1442054144. Throughput: 0: 11252.6. Samples: 360579072. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:35:16,879][1653645] Updated weights for policy 0, policy_version 704146 (0.0072) [2024-06-15 20:35:17,920][1653645] Updated weights for policy 0, policy_version 704192 (0.0010) [2024-06-15 20:35:20,825][1653645] Updated weights for policy 0, policy_version 704253 (0.0012) [2024-06-15 20:35:20,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 45328.8, 300 sec: 45542.0). Total num frames: 1442316288. Throughput: 0: 11138.8. Samples: 360612864. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:20,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:35:25,626][1653645] Updated weights for policy 0, policy_version 704336 (0.0013) [2024-06-15 20:35:25,959][1648982] Fps is (10 sec: 42591.0, 60 sec: 44235.6, 300 sec: 45208.5). Total num frames: 1442480128. Throughput: 0: 11309.1. Samples: 360685568. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:25,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:35:26,797][1653645] Updated weights for policy 0, policy_version 704384 (0.0012) [2024-06-15 20:35:29,168][1653645] Updated weights for policy 0, policy_version 704439 (0.0119) [2024-06-15 20:35:30,959][1648982] Fps is (10 sec: 39319.4, 60 sec: 43690.0, 300 sec: 45319.7). 
Total num frames: 1442709504. Throughput: 0: 11252.5. Samples: 360749568. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:35:32,365][1653645] Updated weights for policy 0, policy_version 704496 (0.0013) [2024-06-15 20:35:35,958][1648982] Fps is (10 sec: 36051.2, 60 sec: 43690.8, 300 sec: 44875.6). Total num frames: 1442840576. Throughput: 0: 11207.1. Samples: 360781312. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:35:37,184][1653645] Updated weights for policy 0, policy_version 704546 (0.0014) [2024-06-15 20:35:39,155][1653645] Updated weights for policy 0, policy_version 704630 (0.0270) [2024-06-15 20:35:40,316][1653645] Updated weights for policy 0, policy_version 704659 (0.0012) [2024-06-15 20:35:40,958][1648982] Fps is (10 sec: 45878.5, 60 sec: 44782.8, 300 sec: 45097.7). Total num frames: 1443168256. Throughput: 0: 11252.6. Samples: 360844800. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:35:41,306][1653645] Updated weights for policy 0, policy_version 704702 (0.0012) [2024-06-15 20:35:44,349][1653645] Updated weights for policy 0, policy_version 704768 (0.0015) [2024-06-15 20:35:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44237.0, 300 sec: 44986.6). Total num frames: 1443364864. Throughput: 0: 11025.1. Samples: 360910848. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:35:49,840][1653645] Updated weights for policy 0, policy_version 704848 (0.0179) [2024-06-15 20:35:50,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 45329.1, 300 sec: 44986.6). Total num frames: 1443627008. Throughput: 0: 11377.8. Samples: 360952832. 
Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:35:52,532][1653645] Updated weights for policy 0, policy_version 704944 (0.0012) [2024-06-15 20:35:55,153][1653645] Updated weights for policy 0, policy_version 704992 (0.0012) [2024-06-15 20:35:55,958][1648982] Fps is (10 sec: 52424.6, 60 sec: 45874.9, 300 sec: 45208.6). Total num frames: 1443889152. Throughput: 0: 11104.5. Samples: 361011712. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:35:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:35:59,964][1651596] Signal inference workers to stop experience collection... (36600 times) [2024-06-15 20:36:00,021][1653645] InferenceWorker_p0-w0: stopping experience collection (36600 times) [2024-06-15 20:36:00,221][1651596] Signal inference workers to resume experience collection... (36600 times) [2024-06-15 20:36:00,221][1653645] InferenceWorker_p0-w0: resuming experience collection (36600 times) [2024-06-15 20:36:00,412][1653645] Updated weights for policy 0, policy_version 705045 (0.0016) [2024-06-15 20:36:00,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 1443954688. Throughput: 0: 11264.0. Samples: 361085952. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:36:01,894][1653645] Updated weights for policy 0, policy_version 705109 (0.0014) [2024-06-15 20:36:03,212][1653645] Updated weights for policy 0, policy_version 705154 (0.0012) [2024-06-15 20:36:04,553][1653645] Updated weights for policy 0, policy_version 705216 (0.0013) [2024-06-15 20:36:05,958][1648982] Fps is (10 sec: 39324.3, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1444282368. Throughput: 0: 11082.0. Samples: 361111552. 
Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:36:08,051][1653645] Updated weights for policy 0, policy_version 705280 (0.0014) [2024-06-15 20:36:10,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1444413440. Throughput: 0: 10934.5. Samples: 361177600. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 20:36:13,515][1653645] Updated weights for policy 0, policy_version 705360 (0.0013) [2024-06-15 20:36:14,884][1653645] Updated weights for policy 0, policy_version 705408 (0.0012) [2024-06-15 20:36:15,960][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1444675584. Throughput: 0: 10911.5. Samples: 361240576. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:15,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:36:16,967][1653645] Updated weights for policy 0, policy_version 705471 (0.0012) [2024-06-15 20:36:20,676][1653645] Updated weights for policy 0, policy_version 705532 (0.0019) [2024-06-15 20:36:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 1444937728. Throughput: 0: 10945.4. Samples: 361273856. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:36:24,218][1653645] Updated weights for policy 0, policy_version 705584 (0.0040) [2024-06-15 20:36:25,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 44784.1, 300 sec: 44764.4). Total num frames: 1445167104. Throughput: 0: 11082.0. Samples: 361343488. 
Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:36:26,061][1653645] Updated weights for policy 0, policy_version 705662 (0.0012) [2024-06-15 20:36:29,790][1653645] Updated weights for policy 0, policy_version 705728 (0.0015) [2024-06-15 20:36:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43691.4, 300 sec: 44542.3). Total num frames: 1445330944. Throughput: 0: 10934.0. Samples: 361402880. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:36:33,541][1653645] Updated weights for policy 0, policy_version 705783 (0.0012) [2024-06-15 20:36:35,958][1648982] Fps is (10 sec: 29491.5, 60 sec: 43690.6, 300 sec: 44209.1). Total num frames: 1445462016. Throughput: 0: 10604.1. Samples: 361430016. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:36:37,510][1653645] Updated weights for policy 0, policy_version 705856 (0.0105) [2024-06-15 20:36:38,757][1653645] Updated weights for policy 0, policy_version 705920 (0.0013) [2024-06-15 20:36:40,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 42598.4, 300 sec: 44431.2). Total num frames: 1445724160. Throughput: 0: 10627.0. Samples: 361489920. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:36:42,834][1653645] Updated weights for policy 0, policy_version 705984 (0.0038) [2024-06-15 20:36:44,876][1651596] Signal inference workers to stop experience collection... (36650 times) [2024-06-15 20:36:44,914][1653645] InferenceWorker_p0-w0: stopping experience collection (36650 times) [2024-06-15 20:36:45,195][1651596] Signal inference workers to resume experience collection... 
(36650 times) [2024-06-15 20:36:45,195][1653645] InferenceWorker_p0-w0: resuming experience collection (36650 times) [2024-06-15 20:36:45,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 44431.2). Total num frames: 1445920768. Throughput: 0: 10478.9. Samples: 361557504. Policy #0 lag: (min: 15.0, avg: 136.1, max: 271.0) [2024-06-15 20:36:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:36:46,228][1653645] Updated weights for policy 0, policy_version 706042 (0.0013) [2024-06-15 20:36:49,165][1653645] Updated weights for policy 0, policy_version 706105 (0.0014) [2024-06-15 20:36:50,864][1653645] Updated weights for policy 0, policy_version 706175 (0.0012) [2024-06-15 20:36:50,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 1446248448. Throughput: 0: 10649.6. Samples: 361590784. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:36:50,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 20:36:55,510][1653645] Updated weights for policy 0, policy_version 706240 (0.0012) [2024-06-15 20:36:55,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 41506.5, 300 sec: 44320.1). Total num frames: 1446379520. Throughput: 0: 10604.1. Samples: 361654784. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:36:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:36:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000706240_1446379520.pth... [2024-06-15 20:36:56,033][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000701120_1435893760.pth [2024-06-15 20:37:00,623][1653645] Updated weights for policy 0, policy_version 706323 (0.0096) [2024-06-15 20:37:00,960][1648982] Fps is (10 sec: 32767.9, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1446576128. Throughput: 0: 10626.8. Samples: 361718784. 
Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:00,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:37:02,741][1653645] Updated weights for policy 0, policy_version 706400 (0.0014) [2024-06-15 20:37:05,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 41506.2, 300 sec: 43986.9). Total num frames: 1446772736. Throughput: 0: 10456.2. Samples: 361744384. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:37:06,631][1653645] Updated weights for policy 0, policy_version 706450 (0.0012) [2024-06-15 20:37:08,468][1653645] Updated weights for policy 0, policy_version 706499 (0.0013) [2024-06-15 20:37:09,681][1653645] Updated weights for policy 0, policy_version 706555 (0.0040) [2024-06-15 20:37:10,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 43987.6). Total num frames: 1447034880. Throughput: 0: 10615.5. Samples: 361821184. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:37:12,700][1653645] Updated weights for policy 0, policy_version 706612 (0.0014) [2024-06-15 20:37:14,313][1653645] Updated weights for policy 0, policy_version 706661 (0.0012) [2024-06-15 20:37:15,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 1447297024. Throughput: 0: 10797.4. Samples: 361888768. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:37:18,049][1653645] Updated weights for policy 0, policy_version 706708 (0.0012) [2024-06-15 20:37:20,761][1653645] Updated weights for policy 0, policy_version 706784 (0.0013) [2024-06-15 20:37:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 43987.2). Total num frames: 1447493632. Throughput: 0: 10956.8. Samples: 361923072. 
Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:37:25,307][1653645] Updated weights for policy 0, policy_version 706873 (0.0014) [2024-06-15 20:37:25,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 42598.1, 300 sec: 44097.9). Total num frames: 1447723008. Throughput: 0: 10979.5. Samples: 361984000. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:37:27,204][1653645] Updated weights for policy 0, policy_version 706939 (0.0013) [2024-06-15 20:37:30,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 41506.1, 300 sec: 43542.6). Total num frames: 1447821312. Throughput: 0: 10808.9. Samples: 362043904. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:37:32,201][1653645] Updated weights for policy 0, policy_version 706981 (0.0013) [2024-06-15 20:37:33,379][1651596] Signal inference workers to stop experience collection... (36700 times) [2024-06-15 20:37:33,402][1653645] InferenceWorker_p0-w0: stopping experience collection (36700 times) [2024-06-15 20:37:33,612][1651596] Signal inference workers to resume experience collection... (36700 times) [2024-06-15 20:37:33,613][1653645] InferenceWorker_p0-w0: resuming experience collection (36700 times) [2024-06-15 20:37:33,615][1653645] Updated weights for policy 0, policy_version 707040 (0.0013) [2024-06-15 20:37:35,960][1648982] Fps is (10 sec: 36038.3, 60 sec: 43689.0, 300 sec: 43986.5). Total num frames: 1448083456. Throughput: 0: 10671.8. Samples: 362071040. 
Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:35,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:37:37,393][1653645] Updated weights for policy 0, policy_version 707092 (0.0013) [2024-06-15 20:37:40,052][1653645] Updated weights for policy 0, policy_version 707193 (0.0116) [2024-06-15 20:37:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1448345600. Throughput: 0: 10524.5. Samples: 362128384. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:37:44,083][1653645] Updated weights for policy 0, policy_version 707232 (0.0012) [2024-06-15 20:37:45,251][1653645] Updated weights for policy 0, policy_version 707281 (0.0018) [2024-06-15 20:37:45,958][1648982] Fps is (10 sec: 49162.3, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 1448574976. Throughput: 0: 10752.0. Samples: 362202624. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:37:50,901][1653645] Updated weights for policy 0, policy_version 707376 (0.0104) [2024-06-15 20:37:50,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 40959.8, 300 sec: 43542.5). Total num frames: 1448706048. Throughput: 0: 10820.2. Samples: 362231296. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:50,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 20:37:52,683][1653645] Updated weights for policy 0, policy_version 707444 (0.0012) [2024-06-15 20:37:55,958][1648982] Fps is (10 sec: 29491.3, 60 sec: 41506.2, 300 sec: 43544.4). Total num frames: 1448869888. Throughput: 0: 10387.9. Samples: 362288640. 
Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:37:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:37:56,873][1653645] Updated weights for policy 0, policy_version 707488 (0.0030) [2024-06-15 20:38:00,208][1653645] Updated weights for policy 0, policy_version 707571 (0.0107) [2024-06-15 20:38:00,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 43875.8). Total num frames: 1449132032. Throughput: 0: 10080.7. Samples: 362342400. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:38:00,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 20:38:05,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 40959.9, 300 sec: 43431.5). Total num frames: 1449230336. Throughput: 0: 10012.4. Samples: 362373632. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:38:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 20:38:06,092][1653645] Updated weights for policy 0, policy_version 707635 (0.0013) [2024-06-15 20:38:07,492][1653645] Updated weights for policy 0, policy_version 707698 (0.0016) [2024-06-15 20:38:10,958][1648982] Fps is (10 sec: 29491.8, 60 sec: 39867.7, 300 sec: 43209.4). Total num frames: 1449426944. Throughput: 0: 9671.2. Samples: 362419200. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:38:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:38:14,393][1653645] Updated weights for policy 0, policy_version 707777 (0.0014) [2024-06-15 20:38:15,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 38775.5, 300 sec: 42987.1). Total num frames: 1449623552. Throughput: 0: 9500.4. Samples: 362471424. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:38:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:38:19,506][1653645] Updated weights for policy 0, policy_version 707843 (0.0041) [2024-06-15 20:38:20,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 38229.3, 300 sec: 42987.2). Total num frames: 1449787392. 
Throughput: 0: 9546.4. Samples: 362500608. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:38:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:38:21,453][1653645] Updated weights for policy 0, policy_version 707920 (0.0012) [2024-06-15 20:38:25,278][1653645] Updated weights for policy 0, policy_version 707970 (0.0013) [2024-06-15 20:38:25,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 37683.4, 300 sec: 42653.9). Total num frames: 1449984000. Throughput: 0: 9375.3. Samples: 362550272. Policy #0 lag: (min: 38.0, avg: 137.4, max: 294.0) [2024-06-15 20:38:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:38:26,048][1651596] Signal inference workers to stop experience collection... (36750 times) [2024-06-15 20:38:26,144][1653645] InferenceWorker_p0-w0: stopping experience collection (36750 times) [2024-06-15 20:38:26,313][1651596] Signal inference workers to resume experience collection... (36750 times) [2024-06-15 20:38:26,314][1653645] InferenceWorker_p0-w0: resuming experience collection (36750 times) [2024-06-15 20:38:29,225][1653645] Updated weights for policy 0, policy_version 708033 (0.0227) [2024-06-15 20:38:30,714][1653645] Updated weights for policy 0, policy_version 708090 (0.0014) [2024-06-15 20:38:30,960][1648982] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 42653.9). Total num frames: 1450180608. Throughput: 0: 8988.5. Samples: 362607104. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:38:30,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:38:35,452][1653645] Updated weights for policy 0, policy_version 708160 (0.0178) [2024-06-15 20:38:35,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 37684.6, 300 sec: 42654.0). Total num frames: 1450344448. Throughput: 0: 9090.9. Samples: 362640384. 
Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:38:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:38:40,133][1653645] Updated weights for policy 0, policy_version 708243 (0.0014) [2024-06-15 20:38:40,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 36044.8, 300 sec: 42431.8). Total num frames: 1450508288. Throughput: 0: 9011.2. Samples: 362694144. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:38:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:38:41,342][1653645] Updated weights for policy 0, policy_version 708288 (0.0011) [2024-06-15 20:38:44,309][1653645] Updated weights for policy 0, policy_version 708345 (0.0017) [2024-06-15 20:38:45,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 35498.7, 300 sec: 42209.6). Total num frames: 1450704896. Throughput: 0: 9056.7. Samples: 362749952. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:38:45,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:38:48,653][1653645] Updated weights for policy 0, policy_version 708413 (0.0077) [2024-06-15 20:38:50,291][1653645] Updated weights for policy 0, policy_version 708478 (0.0013) [2024-06-15 20:38:50,967][1648982] Fps is (10 sec: 45834.3, 60 sec: 37677.8, 300 sec: 42208.3). Total num frames: 1450967040. Throughput: 0: 8929.8. Samples: 362775552. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:38:50,967][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:38:55,449][1653645] Updated weights for policy 0, policy_version 708528 (0.0011) [2024-06-15 20:38:55,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 37137.1, 300 sec: 42209.7). Total num frames: 1451098112. Throughput: 0: 9284.3. Samples: 362836992. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:38:55,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:38:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000708544_1451098112.pth... 
[2024-06-15 20:38:56,032][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000703680_1441136640.pth [2024-06-15 20:38:58,234][1653645] Updated weights for policy 0, policy_version 708570 (0.0013) [2024-06-15 20:38:59,166][1653645] Updated weights for policy 0, policy_version 708601 (0.0014) [2024-06-15 20:39:00,958][1648982] Fps is (10 sec: 32796.8, 60 sec: 36044.8, 300 sec: 41765.3). Total num frames: 1451294720. Throughput: 0: 9352.5. Samples: 362892288. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:39:01,481][1653645] Updated weights for policy 0, policy_version 708672 (0.0014) [2024-06-15 20:39:04,205][1653645] Updated weights for policy 0, policy_version 708735 (0.0013) [2024-06-15 20:39:05,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 37683.2, 300 sec: 41765.4). Total num frames: 1451491328. Throughput: 0: 9238.8. Samples: 362916352. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:39:10,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 36590.9, 300 sec: 41432.1). Total num frames: 1451622400. Throughput: 0: 9443.6. Samples: 362975232. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:10,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:39:11,500][1653645] Updated weights for policy 0, policy_version 708805 (0.0014) [2024-06-15 20:39:13,637][1653645] Updated weights for policy 0, policy_version 708896 (0.0012) [2024-06-15 20:39:15,706][1653645] Updated weights for policy 0, policy_version 708944 (0.0011) [2024-06-15 20:39:15,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 38229.5, 300 sec: 41765.3). Total num frames: 1451917312. Throughput: 0: 9773.5. Samples: 363046912. 
Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:15,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 20:39:16,764][1653645] Updated weights for policy 0, policy_version 708991 (0.0012) [2024-06-15 20:39:20,156][1651596] Signal inference workers to stop experience collection... (36800 times) [2024-06-15 20:39:20,237][1653645] InferenceWorker_p0-w0: stopping experience collection (36800 times) [2024-06-15 20:39:20,452][1651596] Signal inference workers to resume experience collection... (36800 times) [2024-06-15 20:39:20,460][1653645] InferenceWorker_p0-w0: resuming experience collection (36800 times) [2024-06-15 20:39:20,765][1653645] Updated weights for policy 0, policy_version 709053 (0.0031) [2024-06-15 20:39:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 39321.6, 300 sec: 41765.3). Total num frames: 1452146688. Throughput: 0: 9784.9. Samples: 363080704. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:20,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:39:23,918][1653645] Updated weights for policy 0, policy_version 709117 (0.0108) [2024-06-15 20:39:25,615][1653645] Updated weights for policy 0, policy_version 709184 (0.0069) [2024-06-15 20:39:25,958][1648982] Fps is (10 sec: 49150.5, 60 sec: 40413.7, 300 sec: 41765.3). Total num frames: 1452408832. Throughput: 0: 10035.1. Samples: 363145728. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:39:28,242][1653645] Updated weights for policy 0, policy_version 709241 (0.0045) [2024-06-15 20:39:30,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 40413.9, 300 sec: 41987.5). Total num frames: 1452605440. Throughput: 0: 10547.2. Samples: 363224576. 
Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:39:31,442][1653645] Updated weights for policy 0, policy_version 709304 (0.0014) [2024-06-15 20:39:34,107][1653645] Updated weights for policy 0, policy_version 709333 (0.0012) [2024-06-15 20:39:35,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 1452834816. Throughput: 0: 10697.2. Samples: 363256832. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:39:36,345][1653645] Updated weights for policy 0, policy_version 709408 (0.0232) [2024-06-15 20:39:39,069][1653645] Updated weights for policy 0, policy_version 709476 (0.0012) [2024-06-15 20:39:40,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 1453064192. Throughput: 0: 10740.6. Samples: 363320320. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:39:41,867][1653645] Updated weights for policy 0, policy_version 709520 (0.0112) [2024-06-15 20:39:43,178][1653645] Updated weights for policy 0, policy_version 709568 (0.0012) [2024-06-15 20:39:45,433][1653645] Updated weights for policy 0, policy_version 709625 (0.0013) [2024-06-15 20:39:45,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.7, 300 sec: 42098.5). Total num frames: 1453326336. Throughput: 0: 11184.4. Samples: 363395584. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:39:47,951][1653645] Updated weights for policy 0, policy_version 709671 (0.0014) [2024-06-15 20:39:49,780][1653645] Updated weights for policy 0, policy_version 709728 (0.0013) [2024-06-15 20:39:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43697.2, 300 sec: 42209.7). Total num frames: 1453588480. 
Throughput: 0: 11377.8. Samples: 363428352. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:39:54,029][1653645] Updated weights for policy 0, policy_version 709808 (0.0099) [2024-06-15 20:39:55,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 42209.6). Total num frames: 1453719552. Throughput: 0: 11628.1. Samples: 363498496. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:39:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:39:56,967][1653645] Updated weights for policy 0, policy_version 709865 (0.0103) [2024-06-15 20:39:59,581][1653645] Updated weights for policy 0, policy_version 709936 (0.0011) [2024-06-15 20:40:00,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.2, 300 sec: 41876.4). Total num frames: 1454014464. Throughput: 0: 11537.1. Samples: 363566080. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:40:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:40:01,354][1653645] Updated weights for policy 0, policy_version 709993 (0.0020) [2024-06-15 20:40:05,082][1653645] Updated weights for policy 0, policy_version 710048 (0.0013) [2024-06-15 20:40:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 42209.6). Total num frames: 1454243840. Throughput: 0: 11571.2. Samples: 363601408. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:40:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:40:08,143][1653645] Updated weights for policy 0, policy_version 710086 (0.0012) [2024-06-15 20:40:08,348][1651596] Signal inference workers to stop experience collection... (36850 times) [2024-06-15 20:40:08,409][1653645] InferenceWorker_p0-w0: stopping experience collection (36850 times) [2024-06-15 20:40:08,508][1651596] Signal inference workers to resume experience collection... 
(36850 times) [2024-06-15 20:40:08,511][1653645] InferenceWorker_p0-w0: resuming experience collection (36850 times) [2024-06-15 20:40:08,936][1653645] Updated weights for policy 0, policy_version 710141 (0.0165) [2024-06-15 20:40:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 46421.4, 300 sec: 41876.4). Total num frames: 1454407680. Throughput: 0: 11650.9. Samples: 363670016. Policy #0 lag: (min: 31.0, avg: 137.5, max: 287.0) [2024-06-15 20:40:10,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:40:11,288][1653645] Updated weights for policy 0, policy_version 710181 (0.0012) [2024-06-15 20:40:12,316][1653645] Updated weights for policy 0, policy_version 710224 (0.0090) [2024-06-15 20:40:15,647][1653645] Updated weights for policy 0, policy_version 710288 (0.0014) [2024-06-15 20:40:15,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45875.1, 300 sec: 41876.4). Total num frames: 1454669824. Throughput: 0: 11537.0. Samples: 363743744. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:15,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:40:16,641][1653645] Updated weights for policy 0, policy_version 710333 (0.0012) [2024-06-15 20:40:20,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 45875.2, 300 sec: 42098.8). Total num frames: 1454899200. Throughput: 0: 11605.4. Samples: 363779072. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:40:21,683][1653645] Updated weights for policy 0, policy_version 710404 (0.0021) [2024-06-15 20:40:23,354][1653645] Updated weights for policy 0, policy_version 710480 (0.0030) [2024-06-15 20:40:24,431][1653645] Updated weights for policy 0, policy_version 710528 (0.0016) [2024-06-15 20:40:25,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45875.4, 300 sec: 42209.8). Total num frames: 1455161344. Throughput: 0: 11616.7. Samples: 363843072. 
Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:40:27,911][1653645] Updated weights for policy 0, policy_version 710592 (0.0013) [2024-06-15 20:40:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44782.9, 300 sec: 42209.6). Total num frames: 1455292416. Throughput: 0: 11571.2. Samples: 363916288. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:30,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:40:33,444][1653645] Updated weights for policy 0, policy_version 710672 (0.0043) [2024-06-15 20:40:34,559][1653645] Updated weights for policy 0, policy_version 710720 (0.0019) [2024-06-15 20:40:35,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 46967.5, 300 sec: 42320.7). Total num frames: 1455652864. Throughput: 0: 11582.6. Samples: 363949568. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:40:36,204][1653645] Updated weights for policy 0, policy_version 710781 (0.0013) [2024-06-15 20:40:40,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.2, 300 sec: 42209.6). Total num frames: 1455816704. Throughput: 0: 11332.3. Samples: 364008448. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:40:42,605][1653645] Updated weights for policy 0, policy_version 710849 (0.0051) [2024-06-15 20:40:43,987][1653645] Updated weights for policy 0, policy_version 710909 (0.0013) [2024-06-15 20:40:45,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 44782.8, 300 sec: 41987.4). Total num frames: 1456013312. Throughput: 0: 11434.6. Samples: 364080640. 
Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:40:46,275][1653645] Updated weights for policy 0, policy_version 710971 (0.0014) [2024-06-15 20:40:48,109][1653645] Updated weights for policy 0, policy_version 711037 (0.0013) [2024-06-15 20:40:50,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 42098.7). Total num frames: 1456308224. Throughput: 0: 11252.6. Samples: 364107776. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:40:54,908][1653645] Updated weights for policy 0, policy_version 711120 (0.0014) [2024-06-15 20:40:55,072][1651596] Signal inference workers to stop experience collection... (36900 times) [2024-06-15 20:40:55,161][1653645] InferenceWorker_p0-w0: stopping experience collection (36900 times) [2024-06-15 20:40:55,266][1651596] Signal inference workers to resume experience collection... (36900 times) [2024-06-15 20:40:55,278][1653645] InferenceWorker_p0-w0: resuming experience collection (36900 times) [2024-06-15 20:40:55,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45875.1, 300 sec: 42431.8). Total num frames: 1456472064. Throughput: 0: 11400.5. Samples: 364183040. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:40:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:40:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000711168_1456472064.pth... 
[2024-06-15 20:40:56,014][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000706240_1446379520.pth [2024-06-15 20:40:56,298][1653645] Updated weights for policy 0, policy_version 711171 (0.0046) [2024-06-15 20:40:57,565][1653645] Updated weights for policy 0, policy_version 711230 (0.0011) [2024-06-15 20:40:59,746][1653645] Updated weights for policy 0, policy_version 711293 (0.0012) [2024-06-15 20:41:00,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 42320.7). Total num frames: 1456766976. Throughput: 0: 11275.4. Samples: 364251136. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:00,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:41:01,410][1653645] Updated weights for policy 0, policy_version 711332 (0.0013) [2024-06-15 20:41:05,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 42209.6). Total num frames: 1456865280. Throughput: 0: 11184.3. Samples: 364282368. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:05,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 20:41:06,914][1653645] Updated weights for policy 0, policy_version 711400 (0.0028) [2024-06-15 20:41:08,432][1653645] Updated weights for policy 0, policy_version 711458 (0.0013) [2024-06-15 20:41:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 42431.8). Total num frames: 1457192960. Throughput: 0: 11275.4. Samples: 364350464. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:41:11,013][1653645] Updated weights for policy 0, policy_version 711536 (0.0012) [2024-06-15 20:41:13,519][1653645] Updated weights for policy 0, policy_version 711607 (0.0200) [2024-06-15 20:41:15,961][1648982] Fps is (10 sec: 52413.5, 60 sec: 45326.9, 300 sec: 42209.2). Total num frames: 1457389568. Throughput: 0: 11149.5. Samples: 364418048. 
Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:15,962][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:41:18,524][1653645] Updated weights for policy 0, policy_version 711637 (0.0013) [2024-06-15 20:41:20,788][1653645] Updated weights for policy 0, policy_version 711717 (0.0011) [2024-06-15 20:41:20,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 45329.1, 300 sec: 42209.6). Total num frames: 1457618944. Throughput: 0: 11320.9. Samples: 364459008. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:41:22,264][1653645] Updated weights for policy 0, policy_version 711792 (0.0043) [2024-06-15 20:41:25,345][1653645] Updated weights for policy 0, policy_version 711845 (0.0030) [2024-06-15 20:41:25,958][1648982] Fps is (10 sec: 52444.7, 60 sec: 45875.3, 300 sec: 42653.9). Total num frames: 1457913856. Throughput: 0: 11320.9. Samples: 364517888. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:41:30,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 44236.8, 300 sec: 42320.7). Total num frames: 1457946624. Throughput: 0: 11355.1. Samples: 364591616. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:30,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:41:31,247][1653645] Updated weights for policy 0, policy_version 711904 (0.0016) [2024-06-15 20:41:33,257][1653645] Updated weights for policy 0, policy_version 712000 (0.0012) [2024-06-15 20:41:34,561][1653645] Updated weights for policy 0, policy_version 712063 (0.0013) [2024-06-15 20:41:35,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 44236.6, 300 sec: 42653.9). Total num frames: 1458307072. Throughput: 0: 11320.8. Samples: 364617216. 
Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:35,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:41:36,681][1651596] Signal inference workers to stop experience collection... (36950 times) [2024-06-15 20:41:36,701][1653645] InferenceWorker_p0-w0: stopping experience collection (36950 times) [2024-06-15 20:41:36,978][1651596] Signal inference workers to resume experience collection... (36950 times) [2024-06-15 20:41:36,979][1653645] InferenceWorker_p0-w0: resuming experience collection (36950 times) [2024-06-15 20:41:37,607][1653645] Updated weights for policy 0, policy_version 712117 (0.0013) [2024-06-15 20:41:40,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 42431.8). Total num frames: 1458438144. Throughput: 0: 11195.8. Samples: 364686848. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:41:42,756][1653645] Updated weights for policy 0, policy_version 712160 (0.0063) [2024-06-15 20:41:44,070][1653645] Updated weights for policy 0, policy_version 712216 (0.0013) [2024-06-15 20:41:45,958][1648982] Fps is (10 sec: 49153.4, 60 sec: 46421.5, 300 sec: 42542.9). Total num frames: 1458798592. Throughput: 0: 11150.2. Samples: 364752896. Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:41:46,099][1653645] Updated weights for policy 0, policy_version 712314 (0.0012) [2024-06-15 20:41:48,938][1653645] Updated weights for policy 0, policy_version 712368 (0.0014) [2024-06-15 20:41:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.8, 300 sec: 42654.0). Total num frames: 1458962432. Throughput: 0: 11127.5. Samples: 364783104. 
Policy #0 lag: (min: 22.0, avg: 127.9, max: 278.0) [2024-06-15 20:41:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:41:53,978][1653645] Updated weights for policy 0, policy_version 712388 (0.0014) [2024-06-15 20:41:55,898][1653645] Updated weights for policy 0, policy_version 712466 (0.0013) [2024-06-15 20:41:55,958][1648982] Fps is (10 sec: 32766.5, 60 sec: 44236.6, 300 sec: 42542.8). Total num frames: 1459126272. Throughput: 0: 11320.8. Samples: 364859904. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:41:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:41:58,188][1653645] Updated weights for policy 0, policy_version 712567 (0.0014) [2024-06-15 20:42:00,714][1653645] Updated weights for policy 0, policy_version 712624 (0.0016) [2024-06-15 20:42:00,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 45328.9, 300 sec: 43098.2). Total num frames: 1459486720. Throughput: 0: 10991.6. Samples: 364912640. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:00,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:42:05,958][1648982] Fps is (10 sec: 36046.7, 60 sec: 43690.7, 300 sec: 42209.6). Total num frames: 1459486720. Throughput: 0: 10922.7. Samples: 364950528. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:42:07,212][1653645] Updated weights for policy 0, policy_version 712688 (0.0120) [2024-06-15 20:42:09,654][1653645] Updated weights for policy 0, policy_version 712784 (0.0014) [2024-06-15 20:42:10,828][1653645] Updated weights for policy 0, policy_version 712832 (0.0012) [2024-06-15 20:42:10,960][1648982] Fps is (10 sec: 39322.6, 60 sec: 44782.9, 300 sec: 42654.0). Total num frames: 1459879936. Throughput: 0: 10945.4. Samples: 365010432. 
Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:10,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:42:13,243][1653645] Updated weights for policy 0, policy_version 712891 (0.0021) [2024-06-15 20:42:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43692.8, 300 sec: 42431.8). Total num frames: 1460011008. Throughput: 0: 10808.9. Samples: 365078016. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:15,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 20:42:20,373][1653645] Updated weights for policy 0, policy_version 712947 (0.0012) [2024-06-15 20:42:20,958][1648982] Fps is (10 sec: 26214.7, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 1460142080. Throughput: 0: 11070.7. Samples: 365115392. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:42:21,557][1651596] Signal inference workers to stop experience collection... (37000 times) [2024-06-15 20:42:21,599][1653645] InferenceWorker_p0-w0: stopping experience collection (37000 times) [2024-06-15 20:42:21,864][1651596] Signal inference workers to resume experience collection... (37000 times) [2024-06-15 20:42:21,865][1653645] InferenceWorker_p0-w0: resuming experience collection (37000 times) [2024-06-15 20:42:22,272][1653645] Updated weights for policy 0, policy_version 713024 (0.0012) [2024-06-15 20:42:23,700][1653645] Updated weights for policy 0, policy_version 713087 (0.0014) [2024-06-15 20:42:25,233][1653645] Updated weights for policy 0, policy_version 713150 (0.0077) [2024-06-15 20:42:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 1460535296. Throughput: 0: 10649.6. Samples: 365166080. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:42:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42209.9). 
Total num frames: 1460535296. Throughput: 0: 10911.3. Samples: 365243904. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:42:32,802][1653645] Updated weights for policy 0, policy_version 713216 (0.0107) [2024-06-15 20:42:35,010][1653645] Updated weights for policy 0, policy_version 713312 (0.0013) [2024-06-15 20:42:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.9, 300 sec: 42654.0). Total num frames: 1460928512. Throughput: 0: 10877.2. Samples: 365272576. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 20:42:35,959][1653645] Updated weights for policy 0, policy_version 713348 (0.0014) [2024-06-15 20:42:37,154][1653645] Updated weights for policy 0, policy_version 713403 (0.0012) [2024-06-15 20:42:40,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 43690.4, 300 sec: 42320.7). Total num frames: 1461059584. Throughput: 0: 10490.4. Samples: 365331968. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:42:45,346][1653645] Updated weights for policy 0, policy_version 713472 (0.0099) [2024-06-15 20:42:45,958][1648982] Fps is (10 sec: 29491.0, 60 sec: 40413.8, 300 sec: 42431.8). Total num frames: 1461223424. Throughput: 0: 10956.9. Samples: 365405696. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:42:46,508][1653645] Updated weights for policy 0, policy_version 713524 (0.0013) [2024-06-15 20:42:48,297][1653645] Updated weights for policy 0, policy_version 713602 (0.0013) [2024-06-15 20:42:50,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.6, 300 sec: 43098.2). Total num frames: 1461583872. Throughput: 0: 10717.8. Samples: 365432832. 
Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:50,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:42:55,648][1653645] Updated weights for policy 0, policy_version 713680 (0.0016) [2024-06-15 20:42:55,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 41506.3, 300 sec: 42320.7). Total num frames: 1461616640. Throughput: 0: 10979.5. Samples: 365504512. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:42:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:42:56,403][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000713712_1461682176.pth... [2024-06-15 20:42:56,541][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000708544_1451098112.pth [2024-06-15 20:42:58,561][1653645] Updated weights for policy 0, policy_version 713808 (0.0013) [2024-06-15 20:42:59,743][1651596] Signal inference workers to stop experience collection... (37050 times) [2024-06-15 20:42:59,794][1653645] InferenceWorker_p0-w0: stopping experience collection (37050 times) [2024-06-15 20:43:00,018][1651596] Signal inference workers to resume experience collection... (37050 times) [2024-06-15 20:43:00,042][1653645] InferenceWorker_p0-w0: resuming experience collection (37050 times) [2024-06-15 20:43:00,867][1653645] Updated weights for policy 0, policy_version 713904 (0.0014) [2024-06-15 20:43:00,960][1648982] Fps is (10 sec: 49152.1, 60 sec: 43144.7, 300 sec: 43542.6). Total num frames: 1462075392. Throughput: 0: 10604.1. Samples: 365555200. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:43:00,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:43:05,958][1648982] Fps is (10 sec: 49153.4, 60 sec: 43690.6, 300 sec: 42987.2). Total num frames: 1462108160. Throughput: 0: 10672.3. Samples: 365595648. 
Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:43:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:43:08,649][1653645] Updated weights for policy 0, policy_version 713954 (0.0012) [2024-06-15 20:43:10,387][1653645] Updated weights for policy 0, policy_version 714018 (0.0185) [2024-06-15 20:43:10,958][1648982] Fps is (10 sec: 26214.3, 60 sec: 40960.0, 300 sec: 43098.3). Total num frames: 1462337536. Throughput: 0: 11025.1. Samples: 365662208. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:43:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:43:12,760][1653645] Updated weights for policy 0, policy_version 714112 (0.0077) [2024-06-15 20:43:15,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1462632448. Throughput: 0: 10399.3. Samples: 365711872. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:43:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:43:20,958][1648982] Fps is (10 sec: 29490.9, 60 sec: 41506.0, 300 sec: 42876.1). Total num frames: 1462632448. Throughput: 0: 10672.3. Samples: 365752832. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:43:20,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 20:43:21,215][1653645] Updated weights for policy 0, policy_version 714198 (0.0012) [2024-06-15 20:43:23,077][1653645] Updated weights for policy 0, policy_version 714272 (0.0037) [2024-06-15 20:43:24,899][1653645] Updated weights for policy 0, policy_version 714354 (0.0014) [2024-06-15 20:43:25,958][1648982] Fps is (10 sec: 45873.4, 60 sec: 42598.2, 300 sec: 43764.7). Total num frames: 1463091200. Throughput: 0: 10604.1. Samples: 365809152. 
Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:43:25,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:43:26,515][1653645] Updated weights for policy 0, policy_version 714432 (0.0013) [2024-06-15 20:43:30,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 1463156736. Throughput: 0: 10535.8. Samples: 365879808. Policy #0 lag: (min: 15.0, avg: 76.9, max: 271.0) [2024-06-15 20:43:30,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 20:43:34,060][1653645] Updated weights for policy 0, policy_version 714496 (0.0015) [2024-06-15 20:43:35,607][1653645] Updated weights for policy 0, policy_version 714560 (0.0012) [2024-06-15 20:43:35,958][1648982] Fps is (10 sec: 32768.9, 60 sec: 41506.1, 300 sec: 43764.7). Total num frames: 1463418880. Throughput: 0: 10763.4. Samples: 365917184. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:43:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:43:36,986][1653645] Updated weights for policy 0, policy_version 714617 (0.0013) [2024-06-15 20:43:38,352][1653645] Updated weights for policy 0, policy_version 714682 (0.0015) [2024-06-15 20:43:40,957][1648982] Fps is (10 sec: 52429.4, 60 sec: 43691.0, 300 sec: 43986.9). Total num frames: 1463681024. Throughput: 0: 10490.4. Samples: 365976576. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:43:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:43:44,662][1651596] Signal inference workers to stop experience collection... (37100 times) [2024-06-15 20:43:44,697][1653645] InferenceWorker_p0-w0: stopping experience collection (37100 times) [2024-06-15 20:43:44,943][1651596] Signal inference workers to resume experience collection... 
(37100 times) [2024-06-15 20:43:44,943][1653645] InferenceWorker_p0-w0: resuming experience collection (37100 times) [2024-06-15 20:43:45,410][1653645] Updated weights for policy 0, policy_version 714752 (0.0013) [2024-06-15 20:43:45,960][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 43655.0). Total num frames: 1463844864. Throughput: 0: 10956.8. Samples: 366048256. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:43:45,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:43:47,811][1653645] Updated weights for policy 0, policy_version 714857 (0.0013) [2024-06-15 20:43:49,849][1653645] Updated weights for policy 0, policy_version 714912 (0.0019) [2024-06-15 20:43:50,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1464205312. Throughput: 0: 10592.7. Samples: 366072320. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:43:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:43:55,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43144.7, 300 sec: 43764.7). Total num frames: 1464205312. Throughput: 0: 10740.6. Samples: 366145536. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:43:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:43:55,971][1653645] Updated weights for policy 0, policy_version 714946 (0.0043) [2024-06-15 20:43:57,507][1653645] Updated weights for policy 0, policy_version 715024 (0.0014) [2024-06-15 20:43:59,749][1653645] Updated weights for policy 0, policy_version 715120 (0.0112) [2024-06-15 20:44:00,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 44542.2). Total num frames: 1464631296. Throughput: 0: 11013.6. Samples: 366207488. 
Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:44:01,117][1653645] Updated weights for policy 0, policy_version 715157 (0.0013) [2024-06-15 20:44:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1464729600. Throughput: 0: 10820.3. Samples: 366239744. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:05,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 20:44:07,241][1653645] Updated weights for policy 0, policy_version 715204 (0.0012) [2024-06-15 20:44:08,949][1653645] Updated weights for policy 0, policy_version 715266 (0.0013) [2024-06-15 20:44:10,532][1653645] Updated weights for policy 0, policy_version 715329 (0.0031) [2024-06-15 20:44:10,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1465024512. Throughput: 0: 11320.9. Samples: 366318592. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:10,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:44:11,723][1653645] Updated weights for policy 0, policy_version 715383 (0.0020) [2024-06-15 20:44:12,833][1653645] Updated weights for policy 0, policy_version 715424 (0.0012) [2024-06-15 20:44:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1465253888. Throughput: 0: 11082.0. Samples: 366378496. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:44:19,479][1653645] Updated weights for policy 0, policy_version 715498 (0.0057) [2024-06-15 20:44:20,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 46967.4, 300 sec: 44209.0). Total num frames: 1465450496. Throughput: 0: 11252.6. Samples: 366423552. 
Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:20,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:44:21,916][1653645] Updated weights for policy 0, policy_version 715588 (0.0014) [2024-06-15 20:44:23,055][1653645] Updated weights for policy 0, policy_version 715638 (0.0015) [2024-06-15 20:44:24,094][1651596] Signal inference workers to stop experience collection... (37150 times) [2024-06-15 20:44:24,116][1653645] InferenceWorker_p0-w0: stopping experience collection (37150 times) [2024-06-15 20:44:24,353][1651596] Signal inference workers to resume experience collection... (37150 times) [2024-06-15 20:44:24,354][1653645] InferenceWorker_p0-w0: resuming experience collection (37150 times) [2024-06-15 20:44:24,356][1653645] Updated weights for policy 0, policy_version 715664 (0.0011) [2024-06-15 20:44:25,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44783.1, 300 sec: 44653.3). Total num frames: 1465778176. Throughput: 0: 11150.2. Samples: 366478336. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:44:30,592][1653645] Updated weights for policy 0, policy_version 715728 (0.0011) [2024-06-15 20:44:30,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 44782.9, 300 sec: 44098.0). Total num frames: 1465843712. Throughput: 0: 11184.4. Samples: 366551552. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:44:33,909][1653645] Updated weights for policy 0, policy_version 715824 (0.0071) [2024-06-15 20:44:35,685][1653645] Updated weights for policy 0, policy_version 715898 (0.0191) [2024-06-15 20:44:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1466171392. Throughput: 0: 11298.1. Samples: 366580736. 
Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:44:37,198][1653645] Updated weights for policy 0, policy_version 715965 (0.0013) [2024-06-15 20:44:40,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1466302464. Throughput: 0: 10990.9. Samples: 366640128. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:44:43,802][1653645] Updated weights for policy 0, policy_version 716016 (0.0013) [2024-06-15 20:44:45,507][1653645] Updated weights for policy 0, policy_version 716065 (0.0012) [2024-06-15 20:44:45,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 1466531840. Throughput: 0: 11286.8. Samples: 366715392. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:44:47,578][1653645] Updated weights for policy 0, policy_version 716160 (0.0122) [2024-06-15 20:44:49,030][1653645] Updated weights for policy 0, policy_version 716223 (0.0048) [2024-06-15 20:44:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1466826752. Throughput: 0: 11150.2. Samples: 366741504. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:44:55,167][1653645] Updated weights for policy 0, policy_version 716278 (0.0031) [2024-06-15 20:44:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45875.2, 300 sec: 43875.8). Total num frames: 1466957824. Throughput: 0: 11127.5. Samples: 366819328. 
Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:44:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:44:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000716288_1466957824.pth... [2024-06-15 20:44:56,207][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000711168_1456472064.pth [2024-06-15 20:44:57,102][1653645] Updated weights for policy 0, policy_version 716323 (0.0011) [2024-06-15 20:44:58,308][1653645] Updated weights for policy 0, policy_version 716378 (0.0013) [2024-06-15 20:45:00,304][1653645] Updated weights for policy 0, policy_version 716464 (0.0103) [2024-06-15 20:45:00,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1467351040. Throughput: 0: 11093.3. Samples: 366877696. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:45:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:45:05,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 1467383808. Throughput: 0: 10934.1. Samples: 366915584. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:45:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:45:06,763][1653645] Updated weights for policy 0, policy_version 716540 (0.0015) [2024-06-15 20:45:08,639][1651596] Signal inference workers to stop experience collection... (37200 times) [2024-06-15 20:45:08,766][1653645] InferenceWorker_p0-w0: stopping experience collection (37200 times) [2024-06-15 20:45:08,858][1651596] Signal inference workers to resume experience collection... (37200 times) [2024-06-15 20:45:08,859][1653645] InferenceWorker_p0-w0: resuming experience collection (37200 times) [2024-06-15 20:45:08,982][1653645] Updated weights for policy 0, policy_version 716593 (0.0198) [2024-06-15 20:45:10,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45329.2, 300 sec: 44320.1). 
Total num frames: 1467744256. Throughput: 0: 11286.8. Samples: 366986240. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:45:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:45:10,985][1653645] Updated weights for policy 0, policy_version 716688 (0.0085) [2024-06-15 20:45:15,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1467875328. Throughput: 0: 11104.7. Samples: 367051264. Policy #0 lag: (min: 120.0, avg: 160.9, max: 376.0) [2024-06-15 20:45:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:45:17,378][1653645] Updated weights for policy 0, policy_version 716753 (0.0013) [2024-06-15 20:45:19,264][1653645] Updated weights for policy 0, policy_version 716806 (0.0013) [2024-06-15 20:45:20,551][1653645] Updated weights for policy 0, policy_version 716880 (0.0123) [2024-06-15 20:45:20,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.4, 300 sec: 44209.0). Total num frames: 1468203008. Throughput: 0: 11332.3. Samples: 367090688. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:45:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:45:22,160][1653645] Updated weights for policy 0, policy_version 716929 (0.0012) [2024-06-15 20:45:23,603][1653645] Updated weights for policy 0, policy_version 716991 (0.0012) [2024-06-15 20:45:25,966][1648982] Fps is (10 sec: 52383.8, 60 sec: 43684.4, 300 sec: 44429.9). Total num frames: 1468399616. Throughput: 0: 11432.5. Samples: 367154688. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:45:25,967][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:45:29,171][1653645] Updated weights for policy 0, policy_version 717040 (0.0014) [2024-06-15 20:45:30,595][1653645] Updated weights for policy 0, policy_version 717079 (0.0037) [2024-06-15 20:45:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 43875.8). Total num frames: 1468596224. Throughput: 0: 11434.7. 
Samples: 367229952. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:45:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:45:31,546][1653645] Updated weights for policy 0, policy_version 717121 (0.0055) [2024-06-15 20:45:33,112][1653645] Updated weights for policy 0, policy_version 717185 (0.0013) [2024-06-15 20:45:34,510][1653645] Updated weights for policy 0, policy_version 717242 (0.0012) [2024-06-15 20:45:35,958][1648982] Fps is (10 sec: 52474.2, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1468923904. Throughput: 0: 11593.9. Samples: 367263232. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:45:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:45:40,487][1653645] Updated weights for policy 0, policy_version 717296 (0.0024) [2024-06-15 20:45:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 45875.1, 300 sec: 44209.1). Total num frames: 1469054976. Throughput: 0: 11525.7. Samples: 367337984. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:45:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:45:42,042][1653645] Updated weights for policy 0, policy_version 717328 (0.0011) [2024-06-15 20:45:43,807][1653645] Updated weights for policy 0, policy_version 717415 (0.0013) [2024-06-15 20:45:45,335][1653645] Updated weights for policy 0, policy_version 717459 (0.0021) [2024-06-15 20:45:45,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 47513.6, 300 sec: 44320.1). Total num frames: 1469382656. Throughput: 0: 11525.7. Samples: 367396352. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:45:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:45:50,978][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1469448192. Throughput: 0: 11446.1. Samples: 367430656. 
Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:45:50,978][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:45:51,479][1653645] Updated weights for policy 0, policy_version 717529 (0.0012) [2024-06-15 20:45:51,814][1651596] Signal inference workers to stop experience collection... (37250 times) [2024-06-15 20:45:51,882][1653645] InferenceWorker_p0-w0: stopping experience collection (37250 times) [2024-06-15 20:45:52,203][1651596] Signal inference workers to resume experience collection... (37250 times) [2024-06-15 20:45:52,204][1653645] InferenceWorker_p0-w0: resuming experience collection (37250 times) [2024-06-15 20:45:54,339][1653645] Updated weights for policy 0, policy_version 717588 (0.0017) [2024-06-15 20:45:55,502][1653645] Updated weights for policy 0, policy_version 717649 (0.0015) [2024-06-15 20:45:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 46967.5, 300 sec: 44098.0). Total num frames: 1469775872. Throughput: 0: 11457.4. Samples: 367501824. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:45:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:45:57,490][1653645] Updated weights for policy 0, policy_version 717728 (0.0104) [2024-06-15 20:46:00,958][1648982] Fps is (10 sec: 52426.5, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1469972480. Throughput: 0: 11377.7. Samples: 367563264. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:00,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:46:03,002][1653645] Updated weights for policy 0, policy_version 717768 (0.0012) [2024-06-15 20:46:05,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45875.3, 300 sec: 43875.8). Total num frames: 1470136320. Throughput: 0: 11366.4. Samples: 367602176. 
Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 20:46:06,299][1653645] Updated weights for policy 0, policy_version 717856 (0.0014) [2024-06-15 20:46:07,974][1653645] Updated weights for policy 0, policy_version 717920 (0.0013) [2024-06-15 20:46:09,507][1653645] Updated weights for policy 0, policy_version 717971 (0.0015) [2024-06-15 20:46:10,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 45875.0, 300 sec: 44431.6). Total num frames: 1470496768. Throughput: 0: 11220.6. Samples: 367659520. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:10,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:46:15,343][1653645] Updated weights for policy 0, policy_version 718043 (0.0045) [2024-06-15 20:46:15,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1470595072. Throughput: 0: 11116.1. Samples: 367730176. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:46:17,423][1653645] Updated weights for policy 0, policy_version 718081 (0.0056) [2024-06-15 20:46:19,484][1653645] Updated weights for policy 0, policy_version 718161 (0.0014) [2024-06-15 20:46:20,341][1653645] Updated weights for policy 0, policy_version 718205 (0.0011) [2024-06-15 20:46:20,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 45329.1, 300 sec: 44097.9). Total num frames: 1470922752. Throughput: 0: 11150.2. Samples: 367764992. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:46:21,840][1653645] Updated weights for policy 0, policy_version 718272 (0.0013) [2024-06-15 20:46:25,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43696.9, 300 sec: 44320.1). Total num frames: 1471021056. Throughput: 0: 10945.4. Samples: 367830528. 
Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:46:27,955][1653645] Updated weights for policy 0, policy_version 718334 (0.0040) [2024-06-15 20:46:30,972][1648982] Fps is (10 sec: 39264.7, 60 sec: 45318.1, 300 sec: 44095.8). Total num frames: 1471315968. Throughput: 0: 11112.5. Samples: 367896576. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:30,973][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:46:31,051][1653645] Updated weights for policy 0, policy_version 718416 (0.0218) [2024-06-15 20:46:32,401][1653645] Updated weights for policy 0, policy_version 718467 (0.0014) [2024-06-15 20:46:32,739][1651596] Signal inference workers to stop experience collection... (37300 times) [2024-06-15 20:46:32,790][1653645] InferenceWorker_p0-w0: stopping experience collection (37300 times) [2024-06-15 20:46:32,958][1651596] Signal inference workers to resume experience collection... (37300 times) [2024-06-15 20:46:32,959][1653645] InferenceWorker_p0-w0: resuming experience collection (37300 times) [2024-06-15 20:46:35,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1471545344. Throughput: 0: 10865.8. Samples: 367919616. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:46:39,157][1653645] Updated weights for policy 0, policy_version 718531 (0.0012) [2024-06-15 20:46:40,354][1653645] Updated weights for policy 0, policy_version 718592 (0.0018) [2024-06-15 20:46:40,958][1648982] Fps is (10 sec: 36095.8, 60 sec: 43690.5, 300 sec: 43653.6). Total num frames: 1471676416. Throughput: 0: 11059.1. Samples: 367999488. 
Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:46:42,455][1653645] Updated weights for policy 0, policy_version 718656 (0.0013) [2024-06-15 20:46:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1472004096. Throughput: 0: 10854.5. Samples: 368051712. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:46:45,970][1653645] Updated weights for policy 0, policy_version 718768 (0.0012) [2024-06-15 20:46:50,957][1648982] Fps is (10 sec: 39323.7, 60 sec: 43690.8, 300 sec: 43875.9). Total num frames: 1472069632. Throughput: 0: 10683.8. Samples: 368082944. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:50,970][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:46:51,633][1653645] Updated weights for policy 0, policy_version 718800 (0.0017) [2024-06-15 20:46:52,824][1653645] Updated weights for policy 0, policy_version 718842 (0.0040) [2024-06-15 20:46:54,992][1653645] Updated weights for policy 0, policy_version 718900 (0.0014) [2024-06-15 20:46:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43764.8). Total num frames: 1472397312. Throughput: 0: 11013.7. Samples: 368155136. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:46:55,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:46:56,354][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000718960_1472430080.pth... 
[2024-06-15 20:46:56,400][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000713712_1461682176.pth [2024-06-15 20:46:56,625][1653645] Updated weights for policy 0, policy_version 718968 (0.0015) [2024-06-15 20:46:58,028][1653645] Updated weights for policy 0, policy_version 719008 (0.0038) [2024-06-15 20:47:00,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43691.0, 300 sec: 44431.2). Total num frames: 1472593920. Throughput: 0: 10763.4. Samples: 368214528. Policy #0 lag: (min: 15.0, avg: 85.5, max: 271.0) [2024-06-15 20:47:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:47:03,674][1653645] Updated weights for policy 0, policy_version 719088 (0.0017) [2024-06-15 20:47:05,763][1653645] Updated weights for policy 0, policy_version 719127 (0.0013) [2024-06-15 20:47:05,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 1472790528. Throughput: 0: 10854.4. Samples: 368253440. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:47:07,039][1653645] Updated weights for policy 0, policy_version 719184 (0.0039) [2024-06-15 20:47:09,808][1653645] Updated weights for policy 0, policy_version 719248 (0.0014) [2024-06-15 20:47:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1473118208. Throughput: 0: 10808.9. Samples: 368316928. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:47:14,920][1653645] Updated weights for policy 0, policy_version 719333 (0.0014) [2024-06-15 20:47:15,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1473249280. Throughput: 0: 11017.2. Samples: 368392192. 
Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:47:16,620][1653645] Updated weights for policy 0, policy_version 719379 (0.0013) [2024-06-15 20:47:17,535][1653645] Updated weights for policy 0, policy_version 719423 (0.0012) [2024-06-15 20:47:19,522][1653645] Updated weights for policy 0, policy_version 719488 (0.0108) [2024-06-15 20:47:20,890][1651596] Signal inference workers to stop experience collection... (37350 times) [2024-06-15 20:47:20,950][1653645] InferenceWorker_p0-w0: stopping experience collection (37350 times) [2024-06-15 20:47:20,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1473511424. Throughput: 0: 11207.1. Samples: 368423936. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:20,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:47:21,054][1651596] Signal inference workers to resume experience collection... (37350 times) [2024-06-15 20:47:21,056][1653645] InferenceWorker_p0-w0: resuming experience collection (37350 times) [2024-06-15 20:47:21,936][1653645] Updated weights for policy 0, policy_version 719550 (0.0013) [2024-06-15 20:47:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1473642496. Throughput: 0: 10865.9. Samples: 368488448. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:25,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 20:47:27,003][1653645] Updated weights for policy 0, policy_version 719600 (0.0112) [2024-06-15 20:47:28,210][1653645] Updated weights for policy 0, policy_version 719648 (0.0039) [2024-06-15 20:47:28,893][1653645] Updated weights for policy 0, policy_version 719678 (0.0009) [2024-06-15 20:47:30,958][1648982] Fps is (10 sec: 49150.7, 60 sec: 44793.5, 300 sec: 44320.1). Total num frames: 1474002944. Throughput: 0: 11286.7. Samples: 368559616. 
Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:47:31,037][1653645] Updated weights for policy 0, policy_version 719735 (0.0047) [2024-06-15 20:47:33,520][1653645] Updated weights for policy 0, policy_version 719792 (0.0015) [2024-06-15 20:47:35,957][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1474166784. Throughput: 0: 11195.7. Samples: 368586752. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:47:39,252][1653645] Updated weights for policy 0, policy_version 719857 (0.0102) [2024-06-15 20:47:40,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 45329.3, 300 sec: 44653.3). Total num frames: 1474396160. Throughput: 0: 11252.6. Samples: 368661504. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:40,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 20:47:40,969][1653645] Updated weights for policy 0, policy_version 719928 (0.0014) [2024-06-15 20:47:43,135][1653645] Updated weights for policy 0, policy_version 720000 (0.0134) [2024-06-15 20:47:45,369][1653645] Updated weights for policy 0, policy_version 720055 (0.0014) [2024-06-15 20:47:45,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 1474691072. Throughput: 0: 11184.3. Samples: 368717824. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:45,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:47:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45328.9, 300 sec: 44653.4). Total num frames: 1474789376. Throughput: 0: 11298.1. Samples: 368761856. 
Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:47:51,275][1653645] Updated weights for policy 0, policy_version 720128 (0.0040) [2024-06-15 20:47:52,821][1653645] Updated weights for policy 0, policy_version 720192 (0.0013) [2024-06-15 20:47:55,082][1653645] Updated weights for policy 0, policy_version 720245 (0.0015) [2024-06-15 20:47:55,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44782.9, 300 sec: 44098.0). Total num frames: 1475084288. Throughput: 0: 11275.4. Samples: 368824320. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:47:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:47:56,793][1653645] Updated weights for policy 0, policy_version 720304 (0.0013) [2024-06-15 20:48:00,986][1648982] Fps is (10 sec: 42477.0, 60 sec: 43669.8, 300 sec: 44426.9). Total num frames: 1475215360. Throughput: 0: 11211.4. Samples: 368897024. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:00,987][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 20:48:02,180][1653645] Updated weights for policy 0, policy_version 720352 (0.0013) [2024-06-15 20:48:04,255][1653645] Updated weights for policy 0, policy_version 720432 (0.0111) [2024-06-15 20:48:05,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44542.3). Total num frames: 1475477504. Throughput: 0: 11184.4. Samples: 368927232. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:48:06,307][1651596] Signal inference workers to stop experience collection... (37400 times) [2024-06-15 20:48:06,365][1653645] InferenceWorker_p0-w0: stopping experience collection (37400 times) [2024-06-15 20:48:06,367][1653645] Updated weights for policy 0, policy_version 720469 (0.0022) [2024-06-15 20:48:06,512][1651596] Signal inference workers to resume experience collection... 
(37400 times) [2024-06-15 20:48:06,514][1653645] InferenceWorker_p0-w0: resuming experience collection (37400 times) [2024-06-15 20:48:08,375][1653645] Updated weights for policy 0, policy_version 720560 (0.0123) [2024-06-15 20:48:10,958][1648982] Fps is (10 sec: 52579.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1475739648. Throughput: 0: 11116.1. Samples: 368988672. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:48:13,683][1653645] Updated weights for policy 0, policy_version 720624 (0.0015) [2024-06-15 20:48:15,619][1653645] Updated weights for policy 0, policy_version 720704 (0.0014) [2024-06-15 20:48:15,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1476001792. Throughput: 0: 11013.8. Samples: 369055232. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:48:20,137][1653645] Updated weights for policy 0, policy_version 720760 (0.0015) [2024-06-15 20:48:20,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44320.2). Total num frames: 1476165632. Throughput: 0: 11241.2. Samples: 369092608. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:48:21,641][1653645] Updated weights for policy 0, policy_version 720824 (0.0015) [2024-06-15 20:48:25,025][1653645] Updated weights for policy 0, policy_version 720852 (0.0010) [2024-06-15 20:48:25,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1476362240. Throughput: 0: 11195.7. Samples: 369165312. 
Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:48:26,914][1653645] Updated weights for policy 0, policy_version 720916 (0.0013) [2024-06-15 20:48:30,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43144.7, 300 sec: 44653.3). Total num frames: 1476591616. Throughput: 0: 11411.9. Samples: 369231360. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:48:31,270][1653645] Updated weights for policy 0, policy_version 721008 (0.0033) [2024-06-15 20:48:33,087][1653645] Updated weights for policy 0, policy_version 721072 (0.0019) [2024-06-15 20:48:35,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1476788224. Throughput: 0: 10990.9. Samples: 369256448. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:48:37,137][1653645] Updated weights for policy 0, policy_version 721121 (0.0021) [2024-06-15 20:48:38,642][1653645] Updated weights for policy 0, policy_version 721187 (0.0013) [2024-06-15 20:48:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 1477050368. Throughput: 0: 11229.8. Samples: 369329664. Policy #0 lag: (min: 15.0, avg: 98.8, max: 271.0) [2024-06-15 20:48:40,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:48:42,944][1653645] Updated weights for policy 0, policy_version 721251 (0.0015) [2024-06-15 20:48:44,775][1653645] Updated weights for policy 0, policy_version 721314 (0.0013) [2024-06-15 20:48:45,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1477312512. Throughput: 0: 10906.8. Samples: 369387520. 
Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:48:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:48:48,046][1653645] Updated weights for policy 0, policy_version 721345 (0.0032) [2024-06-15 20:48:48,854][1651596] Signal inference workers to stop experience collection... (37450 times) [2024-06-15 20:48:48,952][1653645] InferenceWorker_p0-w0: stopping experience collection (37450 times) [2024-06-15 20:48:49,057][1651596] Signal inference workers to resume experience collection... (37450 times) [2024-06-15 20:48:49,058][1653645] InferenceWorker_p0-w0: resuming experience collection (37450 times) [2024-06-15 20:48:49,466][1653645] Updated weights for policy 0, policy_version 721408 (0.0024) [2024-06-15 20:48:50,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1477574656. Throughput: 0: 11059.2. Samples: 369424896. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:48:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 20:48:53,937][1653645] Updated weights for policy 0, policy_version 721474 (0.0013) [2024-06-15 20:48:55,620][1653645] Updated weights for policy 0, policy_version 721538 (0.0119) [2024-06-15 20:48:55,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 1477738496. Throughput: 0: 11252.6. Samples: 369495040. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:48:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:48:56,379][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000721568_1477771264.pth... [2024-06-15 20:48:56,541][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000716288_1466957824.pth [2024-06-15 20:48:57,239][1653645] Updated weights for policy 0, policy_version 721600 (0.0014) [2024-06-15 20:49:00,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44804.3, 300 sec: 44653.3). 
Total num frames: 1477902336. Throughput: 0: 11127.5. Samples: 369555968. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:49:01,370][1653645] Updated weights for policy 0, policy_version 721664 (0.0013) [2024-06-15 20:49:05,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1478131712. Throughput: 0: 10888.5. Samples: 369582592. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:05,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:49:06,093][1653645] Updated weights for policy 0, policy_version 721747 (0.0013) [2024-06-15 20:49:08,152][1653645] Updated weights for policy 0, policy_version 721815 (0.0033) [2024-06-15 20:49:08,932][1653645] Updated weights for policy 0, policy_version 721848 (0.0012) [2024-06-15 20:49:10,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1478361088. Throughput: 0: 10808.9. Samples: 369651712. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:49:12,942][1653645] Updated weights for policy 0, policy_version 721915 (0.0080) [2024-06-15 20:49:14,910][1653645] Updated weights for policy 0, policy_version 721972 (0.0046) [2024-06-15 20:49:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.6, 300 sec: 44653.4). Total num frames: 1478623232. Throughput: 0: 10831.7. Samples: 369718784. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:49:18,326][1653645] Updated weights for policy 0, policy_version 722017 (0.0036) [2024-06-15 20:49:19,685][1653645] Updated weights for policy 0, policy_version 722067 (0.0011) [2024-06-15 20:49:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1478885376. Throughput: 0: 11184.3. 
Samples: 369759744. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:49:23,942][1653645] Updated weights for policy 0, policy_version 722116 (0.0019) [2024-06-15 20:49:25,379][1653645] Updated weights for policy 0, policy_version 722174 (0.0021) [2024-06-15 20:49:25,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44782.7, 300 sec: 44764.4). Total num frames: 1479049216. Throughput: 0: 10979.5. Samples: 369823744. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:25,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:49:26,920][1653645] Updated weights for policy 0, policy_version 722232 (0.0012) [2024-06-15 20:49:30,167][1653645] Updated weights for policy 0, policy_version 722275 (0.0028) [2024-06-15 20:49:30,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 1479278592. Throughput: 0: 11286.8. Samples: 369895424. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 20:49:31,438][1651596] Signal inference workers to stop experience collection... (37500 times) [2024-06-15 20:49:31,479][1653645] InferenceWorker_p0-w0: stopping experience collection (37500 times) [2024-06-15 20:49:31,662][1651596] Signal inference workers to resume experience collection... (37500 times) [2024-06-15 20:49:31,663][1653645] InferenceWorker_p0-w0: resuming experience collection (37500 times) [2024-06-15 20:49:31,805][1653645] Updated weights for policy 0, policy_version 722338 (0.0011) [2024-06-15 20:49:35,877][1653645] Updated weights for policy 0, policy_version 722416 (0.0139) [2024-06-15 20:49:35,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1479507968. Throughput: 0: 11104.7. Samples: 369924608. 
Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:35,959][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 20:49:37,620][1653645] Updated weights for policy 0, policy_version 722480 (0.0033) [2024-06-15 20:49:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.9, 300 sec: 44542.3). Total num frames: 1479671808. Throughput: 0: 11047.9. Samples: 369992192. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:49:41,282][1653645] Updated weights for policy 0, policy_version 722514 (0.0013) [2024-06-15 20:49:43,675][1653645] Updated weights for policy 0, policy_version 722618 (0.0023) [2024-06-15 20:49:45,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1479933952. Throughput: 0: 11184.3. Samples: 370059264. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:49:47,435][1653645] Updated weights for policy 0, policy_version 722665 (0.0011) [2024-06-15 20:49:48,405][1653645] Updated weights for policy 0, policy_version 722704 (0.0014) [2024-06-15 20:49:49,442][1653645] Updated weights for policy 0, policy_version 722744 (0.0011) [2024-06-15 20:49:50,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1480196096. Throughput: 0: 11377.8. Samples: 370094592. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:50,959][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 20:49:53,391][1653645] Updated weights for policy 0, policy_version 722800 (0.0014) [2024-06-15 20:49:55,234][1653645] Updated weights for policy 0, policy_version 722872 (0.0087) [2024-06-15 20:49:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1480458240. Throughput: 0: 11343.6. Samples: 370162176. 
Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:49:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:49:59,394][1653645] Updated weights for policy 0, policy_version 722943 (0.0108) [2024-06-15 20:50:00,871][1653645] Updated weights for policy 0, policy_version 722993 (0.0022) [2024-06-15 20:50:00,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 46421.2, 300 sec: 45097.7). Total num frames: 1480687616. Throughput: 0: 11275.4. Samples: 370226176. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:50:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:50:01,221][1653645] Updated weights for policy 0, policy_version 723008 (0.0011) [2024-06-15 20:50:05,199][1653645] Updated weights for policy 0, policy_version 723072 (0.0013) [2024-06-15 20:50:05,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 1480884224. Throughput: 0: 11286.8. Samples: 370267648. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:50:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:50:06,749][1653645] Updated weights for policy 0, policy_version 723131 (0.0012) [2024-06-15 20:50:10,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1481113600. Throughput: 0: 11412.0. Samples: 370337280. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:50:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:50:10,977][1653645] Updated weights for policy 0, policy_version 723201 (0.0014) [2024-06-15 20:50:12,377][1653645] Updated weights for policy 0, policy_version 723260 (0.0099) [2024-06-15 20:50:15,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1481244672. Throughput: 0: 11275.4. Samples: 370402816. 
Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:50:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:50:16,830][1653645] Updated weights for policy 0, policy_version 723317 (0.0012) [2024-06-15 20:50:17,521][1651596] Signal inference workers to stop experience collection... (37550 times) [2024-06-15 20:50:17,553][1653645] InferenceWorker_p0-w0: stopping experience collection (37550 times) [2024-06-15 20:50:17,749][1651596] Signal inference workers to resume experience collection... (37550 times) [2024-06-15 20:50:17,751][1653645] InferenceWorker_p0-w0: resuming experience collection (37550 times) [2024-06-15 20:50:18,265][1653645] Updated weights for policy 0, policy_version 723363 (0.0131) [2024-06-15 20:50:20,918][1653645] Updated weights for policy 0, policy_version 723412 (0.0016) [2024-06-15 20:50:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44543.6). Total num frames: 1481539584. Throughput: 0: 11320.9. Samples: 370434048. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:50:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:50:22,250][1653645] Updated weights for policy 0, policy_version 723472 (0.0095) [2024-06-15 20:50:23,526][1653645] Updated weights for policy 0, policy_version 723515 (0.0013) [2024-06-15 20:50:25,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 1481768960. Throughput: 0: 11275.3. Samples: 370499584. Policy #0 lag: (min: 95.0, avg: 195.7, max: 351.0) [2024-06-15 20:50:25,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:50:28,441][1653645] Updated weights for policy 0, policy_version 723568 (0.0061) [2024-06-15 20:50:29,206][1653645] Updated weights for policy 0, policy_version 723600 (0.0017) [2024-06-15 20:50:30,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1482031104. Throughput: 0: 11434.7. Samples: 370573824. 
Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:50:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:50:32,112][1653645] Updated weights for policy 0, policy_version 723680 (0.0014) [2024-06-15 20:50:33,201][1653645] Updated weights for policy 0, policy_version 723714 (0.0012) [2024-06-15 20:50:35,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 46421.3, 300 sec: 44875.5). Total num frames: 1482293248. Throughput: 0: 11457.4. Samples: 370610176. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:50:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:50:38,858][1653645] Updated weights for policy 0, policy_version 723777 (0.0012) [2024-06-15 20:50:40,831][1653645] Updated weights for policy 0, policy_version 723843 (0.0018) [2024-06-15 20:50:40,958][1648982] Fps is (10 sec: 39320.0, 60 sec: 45874.8, 300 sec: 44209.0). Total num frames: 1482424320. Throughput: 0: 11525.6. Samples: 370680832. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:50:40,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:50:43,356][1653645] Updated weights for policy 0, policy_version 723922 (0.0011) [2024-06-15 20:50:45,159][1653645] Updated weights for policy 0, policy_version 724000 (0.0011) [2024-06-15 20:50:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 48059.7, 300 sec: 45319.8). Total num frames: 1482817536. Throughput: 0: 11468.8. Samples: 370742272. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:50:45,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:50:50,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1482817536. Throughput: 0: 11366.4. Samples: 370779136. 
Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:50:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:50:51,627][1653645] Updated weights for policy 0, policy_version 724069 (0.0014) [2024-06-15 20:50:53,769][1653645] Updated weights for policy 0, policy_version 724160 (0.0014) [2024-06-15 20:50:55,825][1653645] Updated weights for policy 0, policy_version 724216 (0.0021) [2024-06-15 20:50:55,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 45329.1, 300 sec: 44764.5). Total num frames: 1483177984. Throughput: 0: 11332.3. Samples: 370847232. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:50:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:50:56,020][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000724224_1483210752.pth... [2024-06-15 20:50:56,212][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000718960_1472430080.pth [2024-06-15 20:50:56,217][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000724224_1483210752.pth [2024-06-15 20:50:57,402][1653645] Updated weights for policy 0, policy_version 724272 (0.0020) [2024-06-15 20:51:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 1483341824. Throughput: 0: 11355.0. Samples: 370913792. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:51:03,419][1653645] Updated weights for policy 0, policy_version 724321 (0.0013) [2024-06-15 20:51:03,744][1651596] Signal inference workers to stop experience collection... (37600 times) [2024-06-15 20:51:03,764][1653645] InferenceWorker_p0-w0: stopping experience collection (37600 times) [2024-06-15 20:51:03,978][1651596] Signal inference workers to resume experience collection... 
(37600 times) [2024-06-15 20:51:03,979][1653645] InferenceWorker_p0-w0: resuming experience collection (37600 times) [2024-06-15 20:51:05,085][1653645] Updated weights for policy 0, policy_version 724408 (0.0011) [2024-06-15 20:51:05,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 45328.8, 300 sec: 44431.2). Total num frames: 1483603968. Throughput: 0: 11468.7. Samples: 370950144. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:05,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:51:07,194][1653645] Updated weights for policy 0, policy_version 724453 (0.0016) [2024-06-15 20:51:09,053][1653645] Updated weights for policy 0, policy_version 724497 (0.0015) [2024-06-15 20:51:10,957][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.3, 300 sec: 44986.6). Total num frames: 1483866112. Throughput: 0: 11377.9. Samples: 371011584. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:51:14,928][1653645] Updated weights for policy 0, policy_version 724562 (0.0030) [2024-06-15 20:51:15,885][1653645] Updated weights for policy 0, policy_version 724606 (0.0029) [2024-06-15 20:51:15,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 1483997184. Throughput: 0: 11309.5. Samples: 371082752. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:51:17,824][1653645] Updated weights for policy 0, policy_version 724688 (0.0013) [2024-06-15 20:51:18,901][1653645] Updated weights for policy 0, policy_version 724736 (0.0013) [2024-06-15 20:51:20,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 46421.4, 300 sec: 45097.7). Total num frames: 1484324864. Throughput: 0: 11093.4. Samples: 371109376. 
Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:51:21,327][1653645] Updated weights for policy 0, policy_version 724790 (0.0011) [2024-06-15 20:51:25,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 43690.7, 300 sec: 44322.2). Total num frames: 1484390400. Throughput: 0: 11343.7. Samples: 371191296. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:51:26,263][1653645] Updated weights for policy 0, policy_version 724821 (0.0013) [2024-06-15 20:51:28,556][1653645] Updated weights for policy 0, policy_version 724928 (0.0014) [2024-06-15 20:51:30,123][1653645] Updated weights for policy 0, policy_version 724990 (0.0011) [2024-06-15 20:51:30,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1484783616. Throughput: 0: 11195.8. Samples: 371246080. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:51:33,093][1653645] Updated weights for policy 0, policy_version 725044 (0.0099) [2024-06-15 20:51:35,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1484914688. Throughput: 0: 11138.8. Samples: 371280384. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:35,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:51:37,522][1653645] Updated weights for policy 0, policy_version 725061 (0.0042) [2024-06-15 20:51:38,742][1653645] Updated weights for policy 0, policy_version 725129 (0.0094) [2024-06-15 20:51:40,110][1653645] Updated weights for policy 0, policy_version 725188 (0.0153) [2024-06-15 20:51:40,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 46967.7, 300 sec: 44875.5). Total num frames: 1485242368. Throughput: 0: 11366.4. Samples: 371358720. 
Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:51:41,405][1653645] Updated weights for policy 0, policy_version 725243 (0.0035) [2024-06-15 20:51:42,915][1651596] Signal inference workers to stop experience collection... (37650 times) [2024-06-15 20:51:42,994][1653645] InferenceWorker_p0-w0: stopping experience collection (37650 times) [2024-06-15 20:51:43,252][1651596] Signal inference workers to resume experience collection... (37650 times) [2024-06-15 20:51:43,253][1653645] InferenceWorker_p0-w0: resuming experience collection (37650 times) [2024-06-15 20:51:43,977][1653645] Updated weights for policy 0, policy_version 725302 (0.0013) [2024-06-15 20:51:45,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1485438976. Throughput: 0: 11275.4. Samples: 371421184. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:45,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:51:49,625][1653645] Updated weights for policy 0, policy_version 725344 (0.0012) [2024-06-15 20:51:50,896][1653645] Updated weights for policy 0, policy_version 725392 (0.0014) [2024-06-15 20:51:50,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 46421.4, 300 sec: 44764.4). Total num frames: 1485602816. Throughput: 0: 11400.6. Samples: 371463168. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:50,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:51:52,310][1653645] Updated weights for policy 0, policy_version 725460 (0.0014) [2024-06-15 20:51:53,776][1653645] Updated weights for policy 0, policy_version 725510 (0.0018) [2024-06-15 20:51:55,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1485963264. Throughput: 0: 11355.0. Samples: 371522560. 
Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:51:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:52:00,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1485963264. Throughput: 0: 11605.3. Samples: 371604992. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:52:00,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:52:01,225][1653645] Updated weights for policy 0, policy_version 725584 (0.0013) [2024-06-15 20:52:02,623][1653645] Updated weights for policy 0, policy_version 725652 (0.0012) [2024-06-15 20:52:04,610][1653645] Updated weights for policy 0, policy_version 725728 (0.0013) [2024-06-15 20:52:05,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 46421.5, 300 sec: 44986.6). Total num frames: 1486389248. Throughput: 0: 11582.6. Samples: 371630592. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:52:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:52:06,617][1653645] Updated weights for policy 0, policy_version 725808 (0.0013) [2024-06-15 20:52:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1486487552. Throughput: 0: 11309.6. Samples: 371700224. Policy #0 lag: (min: 14.0, avg: 91.5, max: 270.0) [2024-06-15 20:52:10,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:52:13,467][1653645] Updated weights for policy 0, policy_version 725856 (0.0015) [2024-06-15 20:52:15,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 1486749696. Throughput: 0: 11457.4. Samples: 371761664. 
Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:15,959][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 20:52:16,070][1653645] Updated weights for policy 0, policy_version 725968 (0.0132) [2024-06-15 20:52:18,192][1653645] Updated weights for policy 0, policy_version 726048 (0.0014) [2024-06-15 20:52:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1487011840. Throughput: 0: 11195.8. Samples: 371784192. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:52:25,376][1653645] Updated weights for policy 0, policy_version 726096 (0.0012) [2024-06-15 20:52:25,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 44783.1, 300 sec: 44320.1). Total num frames: 1487077376. Throughput: 0: 11309.5. Samples: 371867648. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:52:26,811][1653645] Updated weights for policy 0, policy_version 726146 (0.0016) [2024-06-15 20:52:27,152][1651596] Signal inference workers to stop experience collection... (37700 times) [2024-06-15 20:52:27,249][1653645] InferenceWorker_p0-w0: stopping experience collection (37700 times) [2024-06-15 20:52:27,431][1651596] Signal inference workers to resume experience collection... (37700 times) [2024-06-15 20:52:27,431][1653645] InferenceWorker_p0-w0: resuming experience collection (37700 times) [2024-06-15 20:52:28,921][1653645] Updated weights for policy 0, policy_version 726225 (0.0015) [2024-06-15 20:52:30,817][1653645] Updated weights for policy 0, policy_version 726304 (0.0013) [2024-06-15 20:52:30,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44782.9, 300 sec: 45097.6). Total num frames: 1487470592. Throughput: 0: 10945.4. Samples: 371913728. 
Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:52:35,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 44542.2). Total num frames: 1487536128. Throughput: 0: 10797.5. Samples: 371949056. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:35,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:52:38,823][1653645] Updated weights for policy 0, policy_version 726384 (0.0013) [2024-06-15 20:52:40,619][1653645] Updated weights for policy 0, policy_version 726448 (0.0013) [2024-06-15 20:52:40,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 42052.4, 300 sec: 44320.1). Total num frames: 1487765504. Throughput: 0: 11025.1. Samples: 372018688. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:52:42,311][1653645] Updated weights for policy 0, policy_version 726519 (0.0103) [2024-06-15 20:52:43,723][1653645] Updated weights for policy 0, policy_version 726583 (0.0013) [2024-06-15 20:52:45,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1488060416. Throughput: 0: 10467.5. Samples: 372076032. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:52:50,856][1653645] Updated weights for policy 0, policy_version 726624 (0.0014) [2024-06-15 20:52:50,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42052.2, 300 sec: 44209.0). Total num frames: 1488125952. Throughput: 0: 10797.5. Samples: 372116480. 
Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:52:52,394][1653645] Updated weights for policy 0, policy_version 726688 (0.0083) [2024-06-15 20:52:54,452][1653645] Updated weights for policy 0, policy_version 726772 (0.0014) [2024-06-15 20:52:55,605][1653645] Updated weights for policy 0, policy_version 726833 (0.0017) [2024-06-15 20:52:55,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.7, 300 sec: 45324.2). Total num frames: 1488584704. Throughput: 0: 10456.2. Samples: 372170752. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:52:55,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 20:52:55,961][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000726848_1488584704.pth... [2024-06-15 20:52:56,025][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000721568_1477771264.pth [2024-06-15 20:53:00,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1488584704. Throughput: 0: 10991.0. Samples: 372256256. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:53:01,904][1653645] Updated weights for policy 0, policy_version 726866 (0.0011) [2024-06-15 20:53:03,585][1653645] Updated weights for policy 0, policy_version 726936 (0.0013) [2024-06-15 20:53:04,873][1653645] Updated weights for policy 0, policy_version 726998 (0.0026) [2024-06-15 20:53:05,427][1651596] Signal inference workers to stop experience collection... (37750 times) [2024-06-15 20:53:05,453][1653645] InferenceWorker_p0-w0: stopping experience collection (37750 times) [2024-06-15 20:53:05,655][1651596] Signal inference workers to resume experience collection... 
(37750 times) [2024-06-15 20:53:05,672][1653645] InferenceWorker_p0-w0: resuming experience collection (37750 times) [2024-06-15 20:53:05,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1489010688. Throughput: 0: 11082.0. Samples: 372282880. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:53:06,699][1653645] Updated weights for policy 0, policy_version 727093 (0.0014) [2024-06-15 20:53:10,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1489108992. Throughput: 0: 10786.1. Samples: 372353024. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:10,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:53:13,942][1653645] Updated weights for policy 0, policy_version 727152 (0.0017) [2024-06-15 20:53:15,416][1653645] Updated weights for policy 0, policy_version 727219 (0.0014) [2024-06-15 20:53:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 1489403904. Throughput: 0: 11264.0. Samples: 372420608. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:53:17,040][1653645] Updated weights for policy 0, policy_version 727300 (0.0015) [2024-06-15 20:53:18,331][1653645] Updated weights for policy 0, policy_version 727360 (0.0014) [2024-06-15 20:53:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.5, 300 sec: 44986.5). Total num frames: 1489633280. Throughput: 0: 11093.3. Samples: 372448256. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:20,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:53:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1489797120. Throughput: 0: 11457.4. Samples: 372534272. 
Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:25,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 20:53:26,118][1653645] Updated weights for policy 0, policy_version 727443 (0.0012) [2024-06-15 20:53:28,131][1653645] Updated weights for policy 0, policy_version 727537 (0.0013) [2024-06-15 20:53:29,828][1653645] Updated weights for policy 0, policy_version 727607 (0.0013) [2024-06-15 20:53:30,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1490157568. Throughput: 0: 11332.3. Samples: 372585984. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:53:35,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1490157568. Throughput: 0: 11434.6. Samples: 372631040. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:35,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 20:53:36,872][1653645] Updated weights for policy 0, policy_version 727671 (0.0017) [2024-06-15 20:53:38,072][1653645] Updated weights for policy 0, policy_version 727730 (0.0014) [2024-06-15 20:53:40,147][1653645] Updated weights for policy 0, policy_version 727811 (0.0014) [2024-06-15 20:53:40,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 47513.5, 300 sec: 45097.6). Total num frames: 1490616320. Throughput: 0: 11628.0. Samples: 372694016. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:53:45,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1490681856. Throughput: 0: 11207.1. Samples: 372760576. 
Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:45,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:53:48,166][1653645] Updated weights for policy 0, policy_version 727888 (0.0035) [2024-06-15 20:53:48,267][1651596] Signal inference workers to stop experience collection... (37800 times) [2024-06-15 20:53:48,339][1653645] InferenceWorker_p0-w0: stopping experience collection (37800 times) [2024-06-15 20:53:48,464][1651596] Signal inference workers to resume experience collection... (37800 times) [2024-06-15 20:53:48,466][1653645] InferenceWorker_p0-w0: resuming experience collection (37800 times) [2024-06-15 20:53:49,789][1653645] Updated weights for policy 0, policy_version 727954 (0.0014) [2024-06-15 20:53:50,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 46967.4, 300 sec: 44764.4). Total num frames: 1490944000. Throughput: 0: 11434.7. Samples: 372797440. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:53:51,275][1653645] Updated weights for policy 0, policy_version 728032 (0.0015) [2024-06-15 20:53:53,300][1653645] Updated weights for policy 0, policy_version 728126 (0.0017) [2024-06-15 20:53:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.4, 300 sec: 45097.6). Total num frames: 1491206144. Throughput: 0: 11138.8. Samples: 372854272. Policy #0 lag: (min: 0.0, avg: 42.7, max: 240.0) [2024-06-15 20:53:55,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 20:54:00,296][1653645] Updated weights for policy 0, policy_version 728165 (0.0012) [2024-06-15 20:54:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 1491337216. Throughput: 0: 11411.9. Samples: 372934144. 
Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:54:01,811][1653645] Updated weights for policy 0, policy_version 728240 (0.0082) [2024-06-15 20:54:03,968][1653645] Updated weights for policy 0, policy_version 728326 (0.0022) [2024-06-15 20:54:05,973][1648982] Fps is (10 sec: 52348.7, 60 sec: 45317.3, 300 sec: 45317.4). Total num frames: 1491730432. Throughput: 0: 11328.4. Samples: 372958208. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:05,974][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:54:10,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1491730432. Throughput: 0: 10979.6. Samples: 373028352. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:54:11,870][1653645] Updated weights for policy 0, policy_version 728404 (0.0013) [2024-06-15 20:54:13,469][1653645] Updated weights for policy 0, policy_version 728480 (0.0012) [2024-06-15 20:54:14,743][1653645] Updated weights for policy 0, policy_version 728533 (0.0011) [2024-06-15 20:54:15,958][1648982] Fps is (10 sec: 39382.8, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1492123648. Throughput: 0: 11150.2. Samples: 373087744. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:54:16,647][1653645] Updated weights for policy 0, policy_version 728610 (0.0180) [2024-06-15 20:54:20,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1492254720. Throughput: 0: 10786.1. Samples: 373116416. 
Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:54:23,629][1653645] Updated weights for policy 0, policy_version 728656 (0.0013) [2024-06-15 20:54:25,520][1653645] Updated weights for policy 0, policy_version 728736 (0.0015) [2024-06-15 20:54:25,651][1651596] Signal inference workers to stop experience collection... (37850 times) [2024-06-15 20:54:25,686][1653645] InferenceWorker_p0-w0: stopping experience collection (37850 times) [2024-06-15 20:54:25,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1492451328. Throughput: 0: 11161.6. Samples: 373196288. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:54:25,959][1651596] Signal inference workers to resume experience collection... (37850 times) [2024-06-15 20:54:25,959][1653645] InferenceWorker_p0-w0: resuming experience collection (37850 times) [2024-06-15 20:54:27,283][1653645] Updated weights for policy 0, policy_version 728800 (0.0012) [2024-06-15 20:54:29,738][1653645] Updated weights for policy 0, policy_version 728889 (0.0138) [2024-06-15 20:54:30,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.5, 300 sec: 44986.6). Total num frames: 1492779008. Throughput: 0: 10626.8. Samples: 373238784. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:30,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 20:54:35,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44783.2, 300 sec: 44653.3). Total num frames: 1492844544. Throughput: 0: 10752.0. Samples: 373281280. 
Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:54:36,183][1653645] Updated weights for policy 0, policy_version 728948 (0.0014) [2024-06-15 20:54:37,578][1653645] Updated weights for policy 0, policy_version 729008 (0.0103) [2024-06-15 20:54:39,740][1653645] Updated weights for policy 0, policy_version 729088 (0.0011) [2024-06-15 20:54:40,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 44236.8, 300 sec: 45208.7). Total num frames: 1493270528. Throughput: 0: 10900.0. Samples: 373344768. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:54:41,168][1653645] Updated weights for policy 0, policy_version 729147 (0.0013) [2024-06-15 20:54:45,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1493303296. Throughput: 0: 10661.0. Samples: 373413888. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:45,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:54:48,194][1653645] Updated weights for policy 0, policy_version 729200 (0.0101) [2024-06-15 20:54:50,074][1653645] Updated weights for policy 0, policy_version 729275 (0.0012) [2024-06-15 20:54:50,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1493598208. Throughput: 0: 11017.5. Samples: 373453824. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:54:51,597][1653645] Updated weights for policy 0, policy_version 729315 (0.0011) [2024-06-15 20:54:55,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 1493827584. Throughput: 0: 10535.8. Samples: 373502464. 
Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:54:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:54:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000729408_1493827584.pth... [2024-06-15 20:54:56,013][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000724224_1483210752.pth [2024-06-15 20:54:59,091][1653645] Updated weights for policy 0, policy_version 729409 (0.0142) [2024-06-15 20:55:00,659][1653645] Updated weights for policy 0, policy_version 729475 (0.0010) [2024-06-15 20:55:00,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1493991424. Throughput: 0: 10968.2. Samples: 373581312. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:55:03,061][1653645] Updated weights for policy 0, policy_version 729552 (0.0012) [2024-06-15 20:55:04,484][1653645] Updated weights for policy 0, policy_version 729616 (0.0013) [2024-06-15 20:55:04,812][1651596] Signal inference workers to stop experience collection... (37900 times) [2024-06-15 20:55:04,893][1653645] InferenceWorker_p0-w0: stopping experience collection (37900 times) [2024-06-15 20:55:05,025][1651596] Signal inference workers to resume experience collection... (37900 times) [2024-06-15 20:55:05,026][1653645] InferenceWorker_p0-w0: resuming experience collection (37900 times) [2024-06-15 20:55:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43702.0, 300 sec: 44875.5). Total num frames: 1494351872. Throughput: 0: 10979.6. Samples: 373610496. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:55:10,420][1653645] Updated weights for policy 0, policy_version 729682 (0.0020) [2024-06-15 20:55:10,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44782.8, 300 sec: 44653.3). Total num frames: 1494417408. 
Throughput: 0: 10854.4. Samples: 373684736. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:55:12,162][1653645] Updated weights for policy 0, policy_version 729744 (0.0106) [2024-06-15 20:55:14,480][1653645] Updated weights for policy 0, policy_version 729824 (0.0087) [2024-06-15 20:55:15,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1494777856. Throughput: 0: 11275.4. Samples: 373746176. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:15,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 20:55:16,247][1653645] Updated weights for policy 0, policy_version 729890 (0.0012) [2024-06-15 20:55:20,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1494876160. Throughput: 0: 11002.3. Samples: 373776384. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:20,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 20:55:22,949][1653645] Updated weights for policy 0, policy_version 729952 (0.0011) [2024-06-15 20:55:24,643][1653645] Updated weights for policy 0, policy_version 730019 (0.0012) [2024-06-15 20:55:25,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1495138304. Throughput: 0: 11229.9. Samples: 373850112. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:55:27,502][1653645] Updated weights for policy 0, policy_version 730096 (0.0035) [2024-06-15 20:55:28,759][1653645] Updated weights for policy 0, policy_version 730148 (0.0013) [2024-06-15 20:55:30,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1495400448. Throughput: 0: 11059.2. Samples: 373911552. 
Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:55:35,247][1653645] Updated weights for policy 0, policy_version 730224 (0.0014) [2024-06-15 20:55:35,958][1648982] Fps is (10 sec: 39319.1, 60 sec: 44782.4, 300 sec: 44431.2). Total num frames: 1495531520. Throughput: 0: 11059.1. Samples: 373951488. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:35,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:55:37,075][1653645] Updated weights for policy 0, policy_version 730298 (0.0013) [2024-06-15 20:55:39,952][1653645] Updated weights for policy 0, policy_version 730373 (0.0013) [2024-06-15 20:55:40,958][1648982] Fps is (10 sec: 49153.0, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1495891968. Throughput: 0: 11150.2. Samples: 374004224. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:40,958][1648982] Avg episode reward: [(0, '37.560')] [2024-06-15 20:55:41,202][1651596] Saving new best policy, reward=37.560! [2024-06-15 20:55:45,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 1495924736. Throughput: 0: 10888.4. Samples: 374071296. Policy #0 lag: (min: 15.0, avg: 62.4, max: 271.0) [2024-06-15 20:55:45,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 20:55:47,322][1653645] Updated weights for policy 0, policy_version 730448 (0.0012) [2024-06-15 20:55:49,772][1653645] Updated weights for policy 0, policy_version 730544 (0.0011) [2024-06-15 20:55:50,354][1651596] Signal inference workers to stop experience collection... (37950 times) [2024-06-15 20:55:50,390][1653645] InferenceWorker_p0-w0: stopping experience collection (37950 times) [2024-06-15 20:55:50,659][1651596] Signal inference workers to resume experience collection... 
(37950 times) [2024-06-15 20:55:50,660][1653645] InferenceWorker_p0-w0: resuming experience collection (37950 times) [2024-06-15 20:55:50,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1496219648. Throughput: 0: 11082.0. Samples: 374109184. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:55:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:55:51,600][1653645] Updated weights for policy 0, policy_version 730619 (0.0014) [2024-06-15 20:55:53,126][1653645] Updated weights for policy 0, policy_version 730672 (0.0013) [2024-06-15 20:55:55,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1496449024. Throughput: 0: 10615.5. Samples: 374162432. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:55:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:56:00,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 42598.3, 300 sec: 43875.8). Total num frames: 1496547328. Throughput: 0: 10899.9. Samples: 374236672. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:56:01,239][1653645] Updated weights for policy 0, policy_version 730752 (0.0014) [2024-06-15 20:56:03,390][1653645] Updated weights for policy 0, policy_version 730832 (0.0014) [2024-06-15 20:56:05,230][1653645] Updated weights for policy 0, policy_version 730912 (0.0013) [2024-06-15 20:56:05,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 1496973312. Throughput: 0: 10763.4. Samples: 374260736. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:05,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:56:10,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 43986.9). Total num frames: 1496973312. Throughput: 0: 10581.3. Samples: 374326272. 
Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:56:12,161][1653645] Updated weights for policy 0, policy_version 730962 (0.0014) [2024-06-15 20:56:14,211][1653645] Updated weights for policy 0, policy_version 731044 (0.0011) [2024-06-15 20:56:15,830][1653645] Updated weights for policy 0, policy_version 731088 (0.0016) [2024-06-15 20:56:15,958][1648982] Fps is (10 sec: 29491.7, 60 sec: 41506.1, 300 sec: 43875.8). Total num frames: 1497268224. Throughput: 0: 10649.7. Samples: 374390784. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:56:17,898][1653645] Updated weights for policy 0, policy_version 731168 (0.0020) [2024-06-15 20:56:20,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1497497600. Throughput: 0: 10240.1. Samples: 374412288. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:56:24,359][1653645] Updated weights for policy 0, policy_version 731219 (0.0014) [2024-06-15 20:56:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 43653.6). Total num frames: 1497661440. Throughput: 0: 10831.6. Samples: 374491648. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:56:26,046][1653645] Updated weights for policy 0, policy_version 731296 (0.0124) [2024-06-15 20:56:28,899][1653645] Updated weights for policy 0, policy_version 731376 (0.0014) [2024-06-15 20:56:30,724][1653645] Updated weights for policy 0, policy_version 731450 (0.0014) [2024-06-15 20:56:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1498021888. Throughput: 0: 10513.1. Samples: 374544384. 
Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:56:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 42052.7, 300 sec: 43431.5). Total num frames: 1498054656. Throughput: 0: 10444.8. Samples: 374579200. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:56:36,163][1651596] Signal inference workers to stop experience collection... (38000 times) [2024-06-15 20:56:36,212][1653645] InferenceWorker_p0-w0: stopping experience collection (38000 times) [2024-06-15 20:56:36,405][1651596] Signal inference workers to resume experience collection... (38000 times) [2024-06-15 20:56:36,406][1653645] InferenceWorker_p0-w0: resuming experience collection (38000 times) [2024-06-15 20:56:36,579][1653645] Updated weights for policy 0, policy_version 731511 (0.0014) [2024-06-15 20:56:37,598][1653645] Updated weights for policy 0, policy_version 731560 (0.0012) [2024-06-15 20:56:38,047][1653645] Updated weights for policy 0, policy_version 731584 (0.0020) [2024-06-15 20:56:40,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 43986.9). Total num frames: 1498415104. Throughput: 0: 11025.1. Samples: 374658560. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 20:56:41,369][1653645] Updated weights for policy 0, policy_version 731680 (0.0239) [2024-06-15 20:56:45,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 1498546176. Throughput: 0: 10729.3. Samples: 374719488. 
Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:56:47,143][1653645] Updated weights for policy 0, policy_version 731731 (0.0013) [2024-06-15 20:56:48,575][1653645] Updated weights for policy 0, policy_version 731795 (0.0114) [2024-06-15 20:56:49,352][1653645] Updated weights for policy 0, policy_version 731833 (0.0012) [2024-06-15 20:56:50,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 43542.6). Total num frames: 1498808320. Throughput: 0: 10991.0. Samples: 374755328. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:50,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 20:56:51,564][1653645] Updated weights for policy 0, policy_version 731873 (0.0014) [2024-06-15 20:56:53,028][1653645] Updated weights for policy 0, policy_version 731938 (0.0015) [2024-06-15 20:56:55,961][1648982] Fps is (10 sec: 52412.4, 60 sec: 43688.4, 300 sec: 44430.7). Total num frames: 1499070464. Throughput: 0: 11138.1. Samples: 374827520. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:56:55,962][1648982] Avg episode reward: [(0, '37.150')] [2024-06-15 20:56:55,970][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000731968_1499070464.pth... [2024-06-15 20:56:56,059][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000726848_1488584704.pth [2024-06-15 20:56:58,346][1653645] Updated weights for policy 0, policy_version 731987 (0.0013) [2024-06-15 20:57:00,259][1653645] Updated weights for policy 0, policy_version 732064 (0.0011) [2024-06-15 20:57:00,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45875.3, 300 sec: 43764.7). Total num frames: 1499299840. Throughput: 0: 11229.9. Samples: 374896128. 
Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:57:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:57:03,012][1653645] Updated weights for policy 0, policy_version 732114 (0.0013) [2024-06-15 20:57:04,962][1653645] Updated weights for policy 0, policy_version 732192 (0.0106) [2024-06-15 20:57:05,958][1648982] Fps is (10 sec: 52445.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1499594752. Throughput: 0: 11480.2. Samples: 374928896. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:57:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:57:10,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 44236.6, 300 sec: 43653.6). Total num frames: 1499627520. Throughput: 0: 11127.4. Samples: 374992384. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:57:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:57:11,435][1653645] Updated weights for policy 0, policy_version 732272 (0.0080) [2024-06-15 20:57:13,011][1653645] Updated weights for policy 0, policy_version 732341 (0.0013) [2024-06-15 20:57:15,873][1653645] Updated weights for policy 0, policy_version 732400 (0.0014) [2024-06-15 20:57:15,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 1499955200. Throughput: 0: 11400.6. Samples: 375057408. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:57:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:57:16,000][1651596] Signal inference workers to stop experience collection... (38050 times) [2024-06-15 20:57:16,041][1653645] InferenceWorker_p0-w0: stopping experience collection (38050 times) [2024-06-15 20:57:16,209][1651596] Signal inference workers to resume experience collection... 
(38050 times) [2024-06-15 20:57:16,210][1653645] InferenceWorker_p0-w0: resuming experience collection (38050 times) [2024-06-15 20:57:17,510][1653645] Updated weights for policy 0, policy_version 732470 (0.0099) [2024-06-15 20:57:20,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1500119040. Throughput: 0: 11218.5. Samples: 375084032. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:57:20,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 20:57:23,472][1653645] Updated weights for policy 0, policy_version 732534 (0.0013) [2024-06-15 20:57:24,600][1653645] Updated weights for policy 0, policy_version 732593 (0.0011) [2024-06-15 20:57:25,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45329.1, 300 sec: 43764.7). Total num frames: 1500381184. Throughput: 0: 10979.6. Samples: 375152640. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:57:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:57:26,982][1653645] Updated weights for policy 0, policy_version 732624 (0.0010) [2024-06-15 20:57:28,312][1653645] Updated weights for policy 0, policy_version 732676 (0.0014) [2024-06-15 20:57:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1500643328. Throughput: 0: 11081.9. Samples: 375218176. Policy #0 lag: (min: 0.0, avg: 62.4, max: 256.0) [2024-06-15 20:57:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:57:34,183][1653645] Updated weights for policy 0, policy_version 732768 (0.0012) [2024-06-15 20:57:35,497][1653645] Updated weights for policy 0, policy_version 732802 (0.0052) [2024-06-15 20:57:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 44209.0). Total num frames: 1500807168. Throughput: 0: 11161.6. Samples: 375257600. 
Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:57:35,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 20:57:36,850][1653645] Updated weights for policy 0, policy_version 732855 (0.0012) [2024-06-15 20:57:38,890][1653645] Updated weights for policy 0, policy_version 732887 (0.0012) [2024-06-15 20:57:40,923][1653645] Updated weights for policy 0, policy_version 732976 (0.0041) [2024-06-15 20:57:40,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1501134848. Throughput: 0: 11037.2. Samples: 375324160. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:57:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:57:45,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1501233152. Throughput: 0: 10968.2. Samples: 375389696. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:57:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 20:57:46,447][1653645] Updated weights for policy 0, policy_version 733056 (0.0013) [2024-06-15 20:57:48,456][1653645] Updated weights for policy 0, policy_version 733120 (0.0014) [2024-06-15 20:57:50,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 1501495296. Throughput: 0: 10820.3. Samples: 375415808. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:57:50,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 20:57:51,574][1653645] Updated weights for policy 0, policy_version 733176 (0.0022) [2024-06-15 20:57:53,931][1653645] Updated weights for policy 0, policy_version 733248 (0.0110) [2024-06-15 20:57:55,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43692.9, 300 sec: 44431.1). Total num frames: 1501691904. Throughput: 0: 10968.2. Samples: 375485952. 
Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:57:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 20:57:57,657][1653645] Updated weights for policy 0, policy_version 733310 (0.0011) [2024-06-15 20:57:59,716][1653645] Updated weights for policy 0, policy_version 733370 (0.0012) [2024-06-15 20:58:00,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 1501954048. Throughput: 0: 11025.1. Samples: 375553536. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:58:02,907][1651596] Signal inference workers to stop experience collection... (38100 times) [2024-06-15 20:58:02,939][1653645] Updated weights for policy 0, policy_version 733409 (0.0011) [2024-06-15 20:58:03,036][1653645] InferenceWorker_p0-w0: stopping experience collection (38100 times) [2024-06-15 20:58:03,283][1651596] Signal inference workers to resume experience collection... (38100 times) [2024-06-15 20:58:03,283][1653645] InferenceWorker_p0-w0: resuming experience collection (38100 times) [2024-06-15 20:58:05,158][1653645] Updated weights for policy 0, policy_version 733459 (0.0019) [2024-06-15 20:58:05,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1502183424. Throughput: 0: 11173.0. Samples: 375586816. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:58:09,144][1653645] Updated weights for policy 0, policy_version 733538 (0.0013) [2024-06-15 20:58:10,721][1653645] Updated weights for policy 0, policy_version 733600 (0.0115) [2024-06-15 20:58:10,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 46421.5, 300 sec: 44098.0). Total num frames: 1502412800. Throughput: 0: 11127.5. Samples: 375653376. 
Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:58:14,611][1653645] Updated weights for policy 0, policy_version 733648 (0.0012) [2024-06-15 20:58:15,652][1653645] Updated weights for policy 0, policy_version 733694 (0.0011) [2024-06-15 20:58:15,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 44236.6, 300 sec: 43986.9). Total num frames: 1502609408. Throughput: 0: 11195.7. Samples: 375721984. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:58:17,261][1653645] Updated weights for policy 0, policy_version 733760 (0.0014) [2024-06-15 20:58:20,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45875.3, 300 sec: 44320.1). Total num frames: 1502871552. Throughput: 0: 11161.6. Samples: 375759872. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:58:21,361][1653645] Updated weights for policy 0, policy_version 733825 (0.0097) [2024-06-15 20:58:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1503002624. Throughput: 0: 11138.8. Samples: 375825408. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:58:26,887][1653645] Updated weights for policy 0, policy_version 733906 (0.0013) [2024-06-15 20:58:28,301][1653645] Updated weights for policy 0, policy_version 733956 (0.0012) [2024-06-15 20:58:29,482][1653645] Updated weights for policy 0, policy_version 734013 (0.0012) [2024-06-15 20:58:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1503264768. Throughput: 0: 11184.4. Samples: 375892992. 
Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:58:32,684][1653645] Updated weights for policy 0, policy_version 734083 (0.0012) [2024-06-15 20:58:33,933][1653645] Updated weights for policy 0, policy_version 734143 (0.0012) [2024-06-15 20:58:35,958][1648982] Fps is (10 sec: 52430.3, 60 sec: 45329.0, 300 sec: 43764.7). Total num frames: 1503526912. Throughput: 0: 11286.8. Samples: 375923712. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:35,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 20:58:40,115][1653645] Updated weights for policy 0, policy_version 734209 (0.0012) [2024-06-15 20:58:40,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 1503723520. Throughput: 0: 11275.4. Samples: 375993344. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 20:58:41,367][1653645] Updated weights for policy 0, policy_version 734264 (0.0021) [2024-06-15 20:58:43,009][1653645] Updated weights for policy 0, policy_version 734290 (0.0010) [2024-06-15 20:58:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 44209.0). Total num frames: 1503985664. Throughput: 0: 11218.5. Samples: 376058368. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:58:46,159][1653645] Updated weights for policy 0, policy_version 734384 (0.0014) [2024-06-15 20:58:50,958][1648982] Fps is (10 sec: 36043.7, 60 sec: 43144.3, 300 sec: 43653.6). Total num frames: 1504083968. Throughput: 0: 11161.5. Samples: 376089088. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 20:58:51,123][1651596] Signal inference workers to stop experience collection... 
(38150 times) [2024-06-15 20:58:51,174][1653645] InferenceWorker_p0-w0: stopping experience collection (38150 times) [2024-06-15 20:58:51,379][1651596] Signal inference workers to resume experience collection... (38150 times) [2024-06-15 20:58:51,380][1653645] InferenceWorker_p0-w0: resuming experience collection (38150 times) [2024-06-15 20:58:52,045][1653645] Updated weights for policy 0, policy_version 734467 (0.0129) [2024-06-15 20:58:55,097][1653645] Updated weights for policy 0, policy_version 734544 (0.0015) [2024-06-15 20:58:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1504411648. Throughput: 0: 11047.8. Samples: 376150528. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:58:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:58:56,208][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000734592_1504444416.pth... [2024-06-15 20:58:56,252][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000729408_1493827584.pth [2024-06-15 20:58:58,215][1653645] Updated weights for policy 0, policy_version 734610 (0.0055) [2024-06-15 20:59:00,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 43690.7, 300 sec: 43544.9). Total num frames: 1504575488. Throughput: 0: 10911.4. Samples: 376212992. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:59:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:59:03,504][1653645] Updated weights for policy 0, policy_version 734688 (0.0044) [2024-06-15 20:59:05,517][1653645] Updated weights for policy 0, policy_version 734776 (0.0077) [2024-06-15 20:59:05,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1504837632. Throughput: 0: 11047.8. Samples: 376257024. 
Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:59:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:59:10,264][1653645] Updated weights for policy 0, policy_version 734868 (0.0013) [2024-06-15 20:59:10,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 1505067008. Throughput: 0: 10831.7. Samples: 376312832. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:59:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 20:59:14,895][1653645] Updated weights for policy 0, policy_version 734929 (0.0012) [2024-06-15 20:59:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1505230848. Throughput: 0: 11002.3. Samples: 376388096. Policy #0 lag: (min: 15.0, avg: 101.2, max: 271.0) [2024-06-15 20:59:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:59:16,176][1653645] Updated weights for policy 0, policy_version 734992 (0.0013) [2024-06-15 20:59:17,116][1653645] Updated weights for policy 0, policy_version 735040 (0.0015) [2024-06-15 20:59:19,658][1653645] Updated weights for policy 0, policy_version 735104 (0.0018) [2024-06-15 20:59:20,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1505492992. Throughput: 0: 11025.1. Samples: 376419840. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 20:59:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 20:59:22,232][1653645] Updated weights for policy 0, policy_version 735168 (0.0024) [2024-06-15 20:59:25,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1505624064. Throughput: 0: 10888.5. Samples: 376483328. 
Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 20:59:25,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 20:59:28,137][1653645] Updated weights for policy 0, policy_version 735248 (0.0017) [2024-06-15 20:59:30,651][1653645] Updated weights for policy 0, policy_version 735301 (0.0013) [2024-06-15 20:59:30,957][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.9, 300 sec: 44320.1). Total num frames: 1505918976. Throughput: 0: 10843.1. Samples: 376546304. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 20:59:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 20:59:34,009][1653645] Updated weights for policy 0, policy_version 735392 (0.0013) [2024-06-15 20:59:35,958][1648982] Fps is (10 sec: 52430.3, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 1506148352. Throughput: 0: 10843.1. Samples: 376577024. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 20:59:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 20:59:38,512][1651596] Signal inference workers to stop experience collection... (38200 times) [2024-06-15 20:59:38,564][1653645] InferenceWorker_p0-w0: stopping experience collection (38200 times) [2024-06-15 20:59:38,565][1653645] Updated weights for policy 0, policy_version 735428 (0.0012) [2024-06-15 20:59:38,750][1651596] Signal inference workers to resume experience collection... (38200 times) [2024-06-15 20:59:38,766][1653645] InferenceWorker_p0-w0: resuming experience collection (38200 times) [2024-06-15 20:59:40,258][1653645] Updated weights for policy 0, policy_version 735489 (0.0014) [2024-06-15 20:59:40,958][1648982] Fps is (10 sec: 42596.4, 60 sec: 43690.4, 300 sec: 44209.0). Total num frames: 1506344960. Throughput: 0: 11229.8. Samples: 376655872. 
Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 20:59:40,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 20:59:42,889][1653645] Updated weights for policy 0, policy_version 735553 (0.0137) [2024-06-15 20:59:45,174][1653645] Updated weights for policy 0, policy_version 735632 (0.0017) [2024-06-15 20:59:45,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1506639872. Throughput: 0: 10990.9. Samples: 376707584. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 20:59:45,960][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 20:59:50,958][1648982] Fps is (10 sec: 36046.4, 60 sec: 43691.0, 300 sec: 43653.7). Total num frames: 1506705408. Throughput: 0: 10956.8. Samples: 376750080. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 20:59:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 20:59:51,181][1653645] Updated weights for policy 0, policy_version 735712 (0.0014) [2024-06-15 20:59:52,421][1653645] Updated weights for policy 0, policy_version 735766 (0.0013) [2024-06-15 20:59:53,315][1653645] Updated weights for policy 0, policy_version 735805 (0.0059) [2024-06-15 20:59:55,184][1653645] Updated weights for policy 0, policy_version 735856 (0.0014) [2024-06-15 20:59:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1507065856. Throughput: 0: 11207.1. Samples: 376817152. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 20:59:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 20:59:56,892][1653645] Updated weights for policy 0, policy_version 735904 (0.0013) [2024-06-15 21:00:00,958][1648982] Fps is (10 sec: 49150.0, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1507196928. Throughput: 0: 11184.3. Samples: 376891392. 
Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:00,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 21:00:02,243][1653645] Updated weights for policy 0, policy_version 735968 (0.0014) [2024-06-15 21:00:04,010][1653645] Updated weights for policy 0, policy_version 736036 (0.0013) [2024-06-15 21:00:05,579][1653645] Updated weights for policy 0, policy_version 736080 (0.0014) [2024-06-15 21:00:05,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1507524608. Throughput: 0: 11104.7. Samples: 376919552. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:00:08,306][1653645] Updated weights for policy 0, policy_version 736129 (0.0013) [2024-06-15 21:00:10,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 1507721216. Throughput: 0: 11093.4. Samples: 376982528. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:00:12,826][1653645] Updated weights for policy 0, policy_version 736209 (0.0125) [2024-06-15 21:00:14,555][1653645] Updated weights for policy 0, policy_version 736288 (0.0012) [2024-06-15 21:00:15,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1507983360. Throughput: 0: 11400.5. Samples: 377059328. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:00:16,228][1653645] Updated weights for policy 0, policy_version 736323 (0.0014) [2024-06-15 21:00:19,858][1653645] Updated weights for policy 0, policy_version 736400 (0.0013) [2024-06-15 21:00:19,962][1651596] Signal inference workers to stop experience collection... 
(38250 times) [2024-06-15 21:00:20,019][1653645] InferenceWorker_p0-w0: stopping experience collection (38250 times) [2024-06-15 21:00:20,156][1651596] Signal inference workers to resume experience collection... (38250 times) [2024-06-15 21:00:20,156][1653645] InferenceWorker_p0-w0: resuming experience collection (38250 times) [2024-06-15 21:00:20,695][1653645] Updated weights for policy 0, policy_version 736447 (0.0029) [2024-06-15 21:00:20,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 45875.0, 300 sec: 44431.1). Total num frames: 1508245504. Throughput: 0: 11457.4. Samples: 377092608. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:20,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:00:25,490][1653645] Updated weights for policy 0, policy_version 736528 (0.0036) [2024-06-15 21:00:25,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 46967.7, 300 sec: 44209.1). Total num frames: 1508442112. Throughput: 0: 11446.1. Samples: 377170944. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:00:26,603][1653645] Updated weights for policy 0, policy_version 736574 (0.0014) [2024-06-15 21:00:28,679][1653645] Updated weights for policy 0, policy_version 736633 (0.0013) [2024-06-15 21:00:30,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45328.7, 300 sec: 44431.2). Total num frames: 1508638720. Throughput: 0: 11639.4. Samples: 377231360. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:00:32,134][1653645] Updated weights for policy 0, policy_version 736674 (0.0012) [2024-06-15 21:00:35,685][1653645] Updated weights for policy 0, policy_version 736705 (0.0011) [2024-06-15 21:00:35,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 1508802560. Throughput: 0: 11571.2. Samples: 377270784. 
Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:00:37,967][1653645] Updated weights for policy 0, policy_version 736802 (0.0013) [2024-06-15 21:00:40,572][1653645] Updated weights for policy 0, policy_version 736893 (0.0011) [2024-06-15 21:00:40,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 46967.7, 300 sec: 44875.6). Total num frames: 1509163008. Throughput: 0: 11411.9. Samples: 377330688. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:40,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:00:43,757][1653645] Updated weights for policy 0, policy_version 736930 (0.0011) [2024-06-15 21:00:45,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 1509294080. Throughput: 0: 11286.8. Samples: 377399296. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:00:48,238][1653645] Updated weights for policy 0, policy_version 736976 (0.0012) [2024-06-15 21:00:49,733][1653645] Updated weights for policy 0, policy_version 737033 (0.0013) [2024-06-15 21:00:50,902][1653645] Updated weights for policy 0, policy_version 737085 (0.0065) [2024-06-15 21:00:50,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 46967.3, 300 sec: 44320.1). Total num frames: 1509523456. Throughput: 0: 11502.9. Samples: 377437184. Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:00:54,321][1653645] Updated weights for policy 0, policy_version 737153 (0.0016) [2024-06-15 21:00:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.1, 300 sec: 44986.6). Total num frames: 1509818368. Throughput: 0: 11423.3. Samples: 377496576. 
Policy #0 lag: (min: 47.0, avg: 155.3, max: 335.0) [2024-06-15 21:00:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:00:55,980][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000737216_1509818368.pth... [2024-06-15 21:00:56,045][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000731968_1499070464.pth [2024-06-15 21:00:59,530][1653645] Updated weights for policy 0, policy_version 737217 (0.0012) [2024-06-15 21:01:00,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45329.3, 300 sec: 43875.8). Total num frames: 1509916672. Throughput: 0: 11332.3. Samples: 377569280. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:01:02,024][1653645] Updated weights for policy 0, policy_version 737316 (0.0138) [2024-06-15 21:01:04,006][1651596] Signal inference workers to stop experience collection... (38300 times) [2024-06-15 21:01:04,085][1653645] InferenceWorker_p0-w0: stopping experience collection (38300 times) [2024-06-15 21:01:04,295][1651596] Signal inference workers to resume experience collection... (38300 times) [2024-06-15 21:01:04,296][1653645] InferenceWorker_p0-w0: resuming experience collection (38300 times) [2024-06-15 21:01:04,412][1653645] Updated weights for policy 0, policy_version 737395 (0.0015) [2024-06-15 21:01:05,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44783.0, 300 sec: 44875.5). Total num frames: 1510211584. Throughput: 0: 11104.8. Samples: 377592320. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:01:06,107][1653645] Updated weights for policy 0, policy_version 737412 (0.0013) [2024-06-15 21:01:07,560][1653645] Updated weights for policy 0, policy_version 737466 (0.0012) [2024-06-15 21:01:10,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1510342656. 
Throughput: 0: 10888.5. Samples: 377660928. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:01:13,878][1653645] Updated weights for policy 0, policy_version 737552 (0.0118) [2024-06-15 21:01:15,763][1653645] Updated weights for policy 0, policy_version 737616 (0.0012) [2024-06-15 21:01:15,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1510637568. Throughput: 0: 10865.9. Samples: 377720320. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:01:19,062][1653645] Updated weights for policy 0, policy_version 737680 (0.0014) [2024-06-15 21:01:20,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1510866944. Throughput: 0: 10774.7. Samples: 377755648. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:01:24,559][1653645] Updated weights for policy 0, policy_version 737744 (0.0014) [2024-06-15 21:01:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 1510998016. Throughput: 0: 11104.7. Samples: 377830400. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:01:26,676][1653645] Updated weights for policy 0, policy_version 737824 (0.0020) [2024-06-15 21:01:28,135][1653645] Updated weights for policy 0, policy_version 737904 (0.0012) [2024-06-15 21:01:30,830][1653645] Updated weights for policy 0, policy_version 737936 (0.0014) [2024-06-15 21:01:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.9, 300 sec: 44875.5). Total num frames: 1511292928. Throughput: 0: 10956.8. Samples: 377892352. 
Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:30,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:01:35,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1511391232. Throughput: 0: 10797.5. Samples: 377923072. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:01:37,467][1653645] Updated weights for policy 0, policy_version 738017 (0.0257) [2024-06-15 21:01:38,799][1653645] Updated weights for policy 0, policy_version 738081 (0.0015) [2024-06-15 21:01:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1511784448. Throughput: 0: 10888.5. Samples: 377986560. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:40,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:01:43,599][1653645] Updated weights for policy 0, policy_version 738178 (0.0012) [2024-06-15 21:01:44,672][1653645] Updated weights for policy 0, policy_version 738227 (0.0014) [2024-06-15 21:01:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1511915520. Throughput: 0: 10808.9. Samples: 378055680. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:45,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:01:48,399][1653645] Updated weights for policy 0, policy_version 738256 (0.0012) [2024-06-15 21:01:49,542][1651596] Signal inference workers to stop experience collection... (38350 times) [2024-06-15 21:01:49,587][1653645] InferenceWorker_p0-w0: stopping experience collection (38350 times) [2024-06-15 21:01:49,716][1651596] Signal inference workers to resume experience collection... 
(38350 times) [2024-06-15 21:01:49,716][1653645] InferenceWorker_p0-w0: resuming experience collection (38350 times) [2024-06-15 21:01:50,512][1653645] Updated weights for policy 0, policy_version 738353 (0.0013) [2024-06-15 21:01:50,959][1648982] Fps is (10 sec: 39322.3, 60 sec: 44236.8, 300 sec: 44431.7). Total num frames: 1512177664. Throughput: 0: 11138.8. Samples: 378093568. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:50,961][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:01:51,623][1653645] Updated weights for policy 0, policy_version 738404 (0.0014) [2024-06-15 21:01:55,212][1653645] Updated weights for policy 0, policy_version 738448 (0.0013) [2024-06-15 21:01:55,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 44320.1). Total num frames: 1512374272. Throughput: 0: 11047.8. Samples: 378158080. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:01:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:02:00,757][1653645] Updated weights for policy 0, policy_version 738528 (0.0013) [2024-06-15 21:02:00,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 43144.5, 300 sec: 43764.7). Total num frames: 1512505344. Throughput: 0: 11252.6. Samples: 378226688. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:02:03,047][1653645] Updated weights for policy 0, policy_version 738626 (0.0013) [2024-06-15 21:02:04,360][1653645] Updated weights for policy 0, policy_version 738683 (0.0012) [2024-06-15 21:02:05,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 43690.4, 300 sec: 44764.4). Total num frames: 1512833024. Throughput: 0: 10899.9. Samples: 378246144. 
Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:02:08,464][1653645] Updated weights for policy 0, policy_version 738744 (0.0012) [2024-06-15 21:02:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 1512964096. Throughput: 0: 10774.8. Samples: 378315264. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:02:12,844][1653645] Updated weights for policy 0, policy_version 738772 (0.0010) [2024-06-15 21:02:15,058][1653645] Updated weights for policy 0, policy_version 738866 (0.0012) [2024-06-15 21:02:15,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1513259008. Throughput: 0: 10899.9. Samples: 378382848. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:15,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:02:16,889][1653645] Updated weights for policy 0, policy_version 738939 (0.0012) [2024-06-15 21:02:20,751][1653645] Updated weights for policy 0, policy_version 738999 (0.0030) [2024-06-15 21:02:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1513488384. Throughput: 0: 10854.4. Samples: 378411520. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:02:25,155][1653645] Updated weights for policy 0, policy_version 739056 (0.0014) [2024-06-15 21:02:25,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1513652224. Throughput: 0: 11059.3. Samples: 378484224. 
Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:02:26,342][1653645] Updated weights for policy 0, policy_version 739104 (0.0011) [2024-06-15 21:02:28,555][1653645] Updated weights for policy 0, policy_version 739195 (0.0137) [2024-06-15 21:02:30,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43144.7, 300 sec: 44320.1). Total num frames: 1513881600. Throughput: 0: 10717.9. Samples: 378537984. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:02:31,610][1651596] Signal inference workers to stop experience collection... (38400 times) [2024-06-15 21:02:31,650][1653645] InferenceWorker_p0-w0: stopping experience collection (38400 times) [2024-06-15 21:02:31,812][1651596] Signal inference workers to resume experience collection... (38400 times) [2024-06-15 21:02:31,813][1653645] InferenceWorker_p0-w0: resuming experience collection (38400 times) [2024-06-15 21:02:32,380][1653645] Updated weights for policy 0, policy_version 739261 (0.0015) [2024-06-15 21:02:35,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 43690.6, 300 sec: 43653.6). Total num frames: 1514012672. Throughput: 0: 10649.6. Samples: 378572800. Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:02:37,442][1653645] Updated weights for policy 0, policy_version 739325 (0.0013) [2024-06-15 21:02:39,059][1653645] Updated weights for policy 0, policy_version 739377 (0.0094) [2024-06-15 21:02:40,852][1653645] Updated weights for policy 0, policy_version 739446 (0.0013) [2024-06-15 21:02:40,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43144.7, 300 sec: 44542.3). Total num frames: 1514373120. Throughput: 0: 10661.0. Samples: 378637824. 
Policy #0 lag: (min: 12.0, avg: 79.9, max: 268.0) [2024-06-15 21:02:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:02:44,256][1653645] Updated weights for policy 0, policy_version 739488 (0.0015) [2024-06-15 21:02:45,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.4, 300 sec: 44209.0). Total num frames: 1514536960. Throughput: 0: 10604.0. Samples: 378703872. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:02:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:02:48,202][1653645] Updated weights for policy 0, policy_version 739552 (0.0012) [2024-06-15 21:02:49,692][1653645] Updated weights for policy 0, policy_version 739603 (0.0110) [2024-06-15 21:02:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1514799104. Throughput: 0: 11116.2. Samples: 378746368. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:02:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:02:51,553][1653645] Updated weights for policy 0, policy_version 739684 (0.0013) [2024-06-15 21:02:55,194][1653645] Updated weights for policy 0, policy_version 739713 (0.0031) [2024-06-15 21:02:55,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.4, 300 sec: 44209.0). Total num frames: 1514995712. Throughput: 0: 10922.6. Samples: 378806784. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:02:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:02:56,337][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000739776_1515061248.pth... [2024-06-15 21:02:56,401][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000734592_1504444416.pth [2024-06-15 21:02:59,339][1653645] Updated weights for policy 0, policy_version 739792 (0.0013) [2024-06-15 21:03:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 44098.0). Total num frames: 1515192320. Throughput: 0: 11150.3. 
Samples: 378884608. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:03:01,830][1653645] Updated weights for policy 0, policy_version 739888 (0.0014) [2024-06-15 21:03:03,671][1653645] Updated weights for policy 0, policy_version 739959 (0.0014) [2024-06-15 21:03:05,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 43690.9, 300 sec: 44209.0). Total num frames: 1515454464. Throughput: 0: 10934.0. Samples: 378903552. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:03:07,946][1653645] Updated weights for policy 0, policy_version 740001 (0.0034) [2024-06-15 21:03:10,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1515585536. Throughput: 0: 10956.8. Samples: 378977280. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:10,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:03:11,489][1653645] Updated weights for policy 0, policy_version 740033 (0.0013) [2024-06-15 21:03:12,754][1653645] Updated weights for policy 0, policy_version 740088 (0.0015) [2024-06-15 21:03:13,909][1651596] Signal inference workers to stop experience collection... (38450 times) [2024-06-15 21:03:13,958][1653645] InferenceWorker_p0-w0: stopping experience collection (38450 times) [2024-06-15 21:03:14,142][1651596] Signal inference workers to resume experience collection... (38450 times) [2024-06-15 21:03:14,143][1653645] InferenceWorker_p0-w0: resuming experience collection (38450 times) [2024-06-15 21:03:15,442][1653645] Updated weights for policy 0, policy_version 740198 (0.0102) [2024-06-15 21:03:15,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1515978752. Throughput: 0: 11025.0. Samples: 379034112. 
Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:15,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:03:20,309][1653645] Updated weights for policy 0, policy_version 740262 (0.0049) [2024-06-15 21:03:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1516109824. Throughput: 0: 10991.0. Samples: 379067392. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:03:24,263][1653645] Updated weights for policy 0, policy_version 740320 (0.0012) [2024-06-15 21:03:25,958][1648982] Fps is (10 sec: 32768.5, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1516306432. Throughput: 0: 11138.8. Samples: 379139072. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:03:27,658][1653645] Updated weights for policy 0, policy_version 740434 (0.0015) [2024-06-15 21:03:28,801][1653645] Updated weights for policy 0, policy_version 740480 (0.0012) [2024-06-15 21:03:30,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 1516503040. Throughput: 0: 10865.8. Samples: 379192832. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:30,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:03:35,863][1653645] Updated weights for policy 0, policy_version 740560 (0.0102) [2024-06-15 21:03:35,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 1516666880. Throughput: 0: 10729.2. Samples: 379229184. 
Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:03:37,235][1653645] Updated weights for policy 0, policy_version 740619 (0.0025) [2024-06-15 21:03:38,441][1653645] Updated weights for policy 0, policy_version 740668 (0.0014) [2024-06-15 21:03:39,854][1653645] Updated weights for policy 0, policy_version 740708 (0.0016) [2024-06-15 21:03:40,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 1517027328. Throughput: 0: 10922.7. Samples: 379298304. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:03:43,952][1653645] Updated weights for policy 0, policy_version 740755 (0.0014) [2024-06-15 21:03:45,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 1517158400. Throughput: 0: 10786.1. Samples: 379369984. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:03:47,722][1653645] Updated weights for policy 0, policy_version 740864 (0.0016) [2024-06-15 21:03:50,120][1653645] Updated weights for policy 0, policy_version 740927 (0.0012) [2024-06-15 21:03:50,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1517420544. Throughput: 0: 11059.3. Samples: 379401216. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:03:52,470][1653645] Updated weights for policy 0, policy_version 740992 (0.0014) [2024-06-15 21:03:55,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43144.8, 300 sec: 44097.9). Total num frames: 1517584384. Throughput: 0: 10740.6. Samples: 379460608. 
Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:03:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:03:56,644][1653645] Updated weights for policy 0, policy_version 741048 (0.0072) [2024-06-15 21:03:59,499][1653645] Updated weights for policy 0, policy_version 741104 (0.0012) [2024-06-15 21:04:00,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1517813760. Throughput: 0: 11070.6. Samples: 379532288. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:04:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:04:01,327][1651596] Signal inference workers to stop experience collection... (38500 times) [2024-06-15 21:04:01,391][1653645] InferenceWorker_p0-w0: stopping experience collection (38500 times) [2024-06-15 21:04:01,590][1651596] Signal inference workers to resume experience collection... (38500 times) [2024-06-15 21:04:01,592][1653645] InferenceWorker_p0-w0: resuming experience collection (38500 times) [2024-06-15 21:04:02,082][1653645] Updated weights for policy 0, policy_version 741172 (0.0024) [2024-06-15 21:04:03,763][1653645] Updated weights for policy 0, policy_version 741232 (0.0013) [2024-06-15 21:04:05,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1518075904. Throughput: 0: 10934.0. Samples: 379559424. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:04:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:04:08,808][1653645] Updated weights for policy 0, policy_version 741304 (0.0012) [2024-06-15 21:04:10,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 1518239744. Throughput: 0: 10877.1. Samples: 379628544. 
Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:04:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:04:11,873][1653645] Updated weights for policy 0, policy_version 741370 (0.0053) [2024-06-15 21:04:14,060][1653645] Updated weights for policy 0, policy_version 741410 (0.0012) [2024-06-15 21:04:15,173][1653645] Updated weights for policy 0, policy_version 741444 (0.0011) [2024-06-15 21:04:15,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 44209.0). Total num frames: 1518534656. Throughput: 0: 11150.3. Samples: 379694592. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:04:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:04:16,552][1653645] Updated weights for policy 0, policy_version 741504 (0.0018) [2024-06-15 21:04:20,275][1653645] Updated weights for policy 0, policy_version 741568 (0.0014) [2024-06-15 21:04:20,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1518731264. Throughput: 0: 11150.2. Samples: 379730944. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:04:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:04:23,537][1653645] Updated weights for policy 0, policy_version 741628 (0.0016) [2024-06-15 21:04:25,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1518895104. Throughput: 0: 11013.7. Samples: 379793920. Policy #0 lag: (min: 15.0, avg: 135.1, max: 271.0) [2024-06-15 21:04:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:04:26,221][1653645] Updated weights for policy 0, policy_version 741671 (0.0014) [2024-06-15 21:04:26,961][1653645] Updated weights for policy 0, policy_version 741700 (0.0013) [2024-06-15 21:04:30,301][1653645] Updated weights for policy 0, policy_version 741776 (0.0146) [2024-06-15 21:04:30,958][1648982] Fps is (10 sec: 49153.0, 60 sec: 45329.3, 300 sec: 44320.1). Total num frames: 1519222784. 
Throughput: 0: 11013.7. Samples: 379865600. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:04:30,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:04:31,231][1653645] Updated weights for policy 0, policy_version 741823 (0.0013) [2024-06-15 21:04:34,336][1653645] Updated weights for policy 0, policy_version 741880 (0.0015) [2024-06-15 21:04:35,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.2, 300 sec: 44209.1). Total num frames: 1519386624. Throughput: 0: 11104.7. Samples: 379900928. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:04:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:04:38,385][1653645] Updated weights for policy 0, policy_version 741936 (0.0094) [2024-06-15 21:04:39,845][1653645] Updated weights for policy 0, policy_version 742012 (0.0019) [2024-06-15 21:04:40,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 1519648768. Throughput: 0: 11207.1. Samples: 379964928. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:04:40,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:04:42,900][1653645] Updated weights for policy 0, policy_version 742049 (0.0012) [2024-06-15 21:04:45,897][1653645] Updated weights for policy 0, policy_version 742128 (0.0017) [2024-06-15 21:04:45,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 45329.1, 300 sec: 44653.3). Total num frames: 1519878144. Throughput: 0: 11104.7. Samples: 380032000. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:04:45,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:04:49,343][1651596] Signal inference workers to stop experience collection... (38550 times) [2024-06-15 21:04:49,377][1653645] InferenceWorker_p0-w0: stopping experience collection (38550 times) [2024-06-15 21:04:49,529][1651596] Signal inference workers to resume experience collection... 
(38550 times) [2024-06-15 21:04:49,529][1653645] InferenceWorker_p0-w0: resuming experience collection (38550 times) [2024-06-15 21:04:50,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 1520074752. Throughput: 0: 11332.3. Samples: 380069376. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:04:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:04:51,405][1653645] Updated weights for policy 0, policy_version 742256 (0.0013) [2024-06-15 21:04:54,501][1653645] Updated weights for policy 0, policy_version 742304 (0.0011) [2024-06-15 21:04:55,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 1520304128. Throughput: 0: 11116.1. Samples: 380128768. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:04:55,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:04:55,968][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000742336_1520304128.pth... [2024-06-15 21:04:56,065][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000737216_1509818368.pth [2024-06-15 21:04:57,164][1653645] Updated weights for policy 0, policy_version 742352 (0.0014) [2024-06-15 21:05:00,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1520435200. Throughput: 0: 11241.2. Samples: 380200448. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:05:01,644][1653645] Updated weights for policy 0, policy_version 742432 (0.0013) [2024-06-15 21:05:02,916][1653645] Updated weights for policy 0, policy_version 742481 (0.0023) [2024-06-15 21:05:05,441][1653645] Updated weights for policy 0, policy_version 742531 (0.0012) [2024-06-15 21:05:05,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1520730112. Throughput: 0: 11116.1. 
Samples: 380231168. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:05:06,482][1653645] Updated weights for policy 0, policy_version 742583 (0.0013) [2024-06-15 21:05:09,152][1653645] Updated weights for policy 0, policy_version 742644 (0.0124) [2024-06-15 21:05:10,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 45329.0, 300 sec: 43986.8). Total num frames: 1520959488. Throughput: 0: 11093.3. Samples: 380293120. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:10,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:05:13,320][1653645] Updated weights for policy 0, policy_version 742690 (0.0012) [2024-06-15 21:05:15,432][1653645] Updated weights for policy 0, policy_version 742768 (0.0108) [2024-06-15 21:05:15,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 1521221632. Throughput: 0: 11070.6. Samples: 380363776. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:05:18,285][1653645] Updated weights for policy 0, policy_version 742832 (0.0098) [2024-06-15 21:05:20,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45329.0, 300 sec: 44097.9). Total num frames: 1521451008. Throughput: 0: 10956.7. Samples: 380393984. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:05:21,079][1653645] Updated weights for policy 0, policy_version 742912 (0.0012) [2024-06-15 21:05:25,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 1521582080. Throughput: 0: 11195.7. Samples: 380468736. 
Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:05:25,972][1653645] Updated weights for policy 0, policy_version 742963 (0.0012) [2024-06-15 21:05:27,511][1653645] Updated weights for policy 0, policy_version 743028 (0.0013) [2024-06-15 21:05:29,672][1653645] Updated weights for policy 0, policy_version 743062 (0.0012) [2024-06-15 21:05:30,962][1648982] Fps is (10 sec: 42599.3, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1521876992. Throughput: 0: 11059.2. Samples: 380529664. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:30,963][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:05:31,400][1653645] Updated weights for policy 0, policy_version 743107 (0.0014) [2024-06-15 21:05:32,148][1651596] Signal inference workers to stop experience collection... (38600 times) [2024-06-15 21:05:32,218][1653645] InferenceWorker_p0-w0: stopping experience collection (38600 times) [2024-06-15 21:05:32,360][1651596] Signal inference workers to resume experience collection... (38600 times) [2024-06-15 21:05:32,361][1653645] InferenceWorker_p0-w0: resuming experience collection (38600 times) [2024-06-15 21:05:32,607][1653645] Updated weights for policy 0, policy_version 743166 (0.0014) [2024-06-15 21:05:35,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1522008064. Throughput: 0: 11036.4. Samples: 380566016. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:05:37,363][1653645] Updated weights for policy 0, policy_version 743224 (0.0011) [2024-06-15 21:05:39,234][1653645] Updated weights for policy 0, policy_version 743286 (0.0137) [2024-06-15 21:05:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1522335744. Throughput: 0: 11252.7. Samples: 380635136. 
Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:05:41,363][1653645] Updated weights for policy 0, policy_version 743344 (0.0047) [2024-06-15 21:05:44,009][1653645] Updated weights for policy 0, policy_version 743416 (0.0016) [2024-06-15 21:05:45,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1522532352. Throughput: 0: 11070.6. Samples: 380698624. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:05:49,099][1653645] Updated weights for policy 0, policy_version 743472 (0.0083) [2024-06-15 21:05:50,664][1653645] Updated weights for policy 0, policy_version 743536 (0.0013) [2024-06-15 21:05:50,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 1522794496. Throughput: 0: 11275.4. Samples: 380738560. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:05:53,316][1653645] Updated weights for policy 0, policy_version 743613 (0.0015) [2024-06-15 21:05:55,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45329.2, 300 sec: 44431.2). Total num frames: 1523023872. Throughput: 0: 11229.9. Samples: 380798464. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:05:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:05:55,981][1653645] Updated weights for policy 0, policy_version 743673 (0.0014) [2024-06-15 21:06:00,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 44236.8, 300 sec: 43653.6). Total num frames: 1523089408. Throughput: 0: 11389.2. Samples: 380876288. 
Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:06:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:06:01,389][1653645] Updated weights for policy 0, policy_version 743728 (0.0013) [2024-06-15 21:06:04,076][1653645] Updated weights for policy 0, policy_version 743811 (0.0015) [2024-06-15 21:06:05,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 45328.9, 300 sec: 44431.1). Total num frames: 1523449856. Throughput: 0: 11252.6. Samples: 380900352. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:06:05,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:06:06,856][1653645] Updated weights for policy 0, policy_version 743876 (0.0014) [2024-06-15 21:06:10,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 1523580928. Throughput: 0: 11036.5. Samples: 380965376. Policy #0 lag: (min: 31.0, avg: 146.4, max: 271.0) [2024-06-15 21:06:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:06:11,972][1653645] Updated weights for policy 0, policy_version 743940 (0.0015) [2024-06-15 21:06:13,658][1653645] Updated weights for policy 0, policy_version 744018 (0.0143) [2024-06-15 21:06:14,425][1653645] Updated weights for policy 0, policy_version 744061 (0.0013) [2024-06-15 21:06:15,960][1648982] Fps is (10 sec: 45876.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1523908608. Throughput: 0: 11286.8. Samples: 381037568. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:15,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:06:16,581][1653645] Updated weights for policy 0, policy_version 744128 (0.0014) [2024-06-15 21:06:18,632][1651596] Signal inference workers to stop experience collection... (38650 times) [2024-06-15 21:06:18,668][1653645] InferenceWorker_p0-w0: stopping experience collection (38650 times) [2024-06-15 21:06:18,890][1651596] Signal inference workers to resume experience collection... 
(38650 times) [2024-06-15 21:06:18,891][1653645] InferenceWorker_p0-w0: resuming experience collection (38650 times) [2024-06-15 21:06:19,511][1653645] Updated weights for policy 0, policy_version 744192 (0.0013) [2024-06-15 21:06:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44237.0, 300 sec: 44431.2). Total num frames: 1524105216. Throughput: 0: 11150.2. Samples: 381067776. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:06:25,412][1653645] Updated weights for policy 0, policy_version 744260 (0.0013) [2024-06-15 21:06:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 1524301824. Throughput: 0: 11218.5. Samples: 381139968. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:06:26,816][1653645] Updated weights for policy 0, policy_version 744320 (0.0012) [2024-06-15 21:06:29,025][1653645] Updated weights for policy 0, policy_version 744384 (0.0012) [2024-06-15 21:06:30,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.9, 300 sec: 44542.3). Total num frames: 1524531200. Throughput: 0: 11104.7. Samples: 381198336. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:06:31,764][1653645] Updated weights for policy 0, policy_version 744441 (0.0014) [2024-06-15 21:06:35,958][1648982] Fps is (10 sec: 32768.3, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1524629504. Throughput: 0: 10968.2. Samples: 381232128. 
Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:06:37,090][1653645] Updated weights for policy 0, policy_version 744496 (0.0016) [2024-06-15 21:06:38,996][1653645] Updated weights for policy 0, policy_version 744569 (0.0012) [2024-06-15 21:06:40,931][1653645] Updated weights for policy 0, policy_version 744624 (0.0015) [2024-06-15 21:06:40,963][1648982] Fps is (10 sec: 45851.0, 60 sec: 44233.0, 300 sec: 44319.3). Total num frames: 1524989952. Throughput: 0: 11035.2. Samples: 381295104. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:40,964][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:06:43,176][1653645] Updated weights for policy 0, policy_version 744672 (0.0013) [2024-06-15 21:06:43,876][1653645] Updated weights for policy 0, policy_version 744704 (0.0028) [2024-06-15 21:06:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1525153792. Throughput: 0: 10786.1. Samples: 381361664. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:06:50,721][1653645] Updated weights for policy 0, policy_version 744802 (0.0089) [2024-06-15 21:06:50,958][1648982] Fps is (10 sec: 36062.1, 60 sec: 42598.1, 300 sec: 43986.8). Total num frames: 1525350400. Throughput: 0: 11138.8. Samples: 381401600. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:50,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:06:52,056][1653645] Updated weights for policy 0, policy_version 744853 (0.0112) [2024-06-15 21:06:55,383][1653645] Updated weights for policy 0, policy_version 744913 (0.0014) [2024-06-15 21:06:55,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1525645312. Throughput: 0: 10854.4. Samples: 381453824. 
Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:06:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:06:56,196][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000744960_1525678080.pth... [2024-06-15 21:06:56,261][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000739776_1515061248.pth [2024-06-15 21:07:00,630][1653645] Updated weights for policy 0, policy_version 744964 (0.0013) [2024-06-15 21:07:00,964][1648982] Fps is (10 sec: 36025.9, 60 sec: 43686.5, 300 sec: 43652.8). Total num frames: 1525710848. Throughput: 0: 11012.3. Samples: 381533184. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:00,965][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:07:02,741][1653645] Updated weights for policy 0, policy_version 745043 (0.0014) [2024-06-15 21:07:03,515][1651596] Signal inference workers to stop experience collection... (38700 times) [2024-06-15 21:07:03,555][1653645] InferenceWorker_p0-w0: stopping experience collection (38700 times) [2024-06-15 21:07:03,798][1651596] Signal inference workers to resume experience collection... (38700 times) [2024-06-15 21:07:03,799][1653645] InferenceWorker_p0-w0: resuming experience collection (38700 times) [2024-06-15 21:07:04,341][1653645] Updated weights for policy 0, policy_version 745104 (0.0029) [2024-06-15 21:07:05,265][1653645] Updated weights for policy 0, policy_version 745145 (0.0013) [2024-06-15 21:07:05,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1526071296. Throughput: 0: 10865.8. Samples: 381556736. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:05,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 21:07:07,966][1653645] Updated weights for policy 0, policy_version 745214 (0.0013) [2024-06-15 21:07:10,958][1648982] Fps is (10 sec: 49179.1, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 1526202368. 
Throughput: 0: 10717.8. Samples: 381622272. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:10,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 21:07:14,056][1653645] Updated weights for policy 0, policy_version 745265 (0.0013) [2024-06-15 21:07:15,695][1653645] Updated weights for policy 0, policy_version 745329 (0.0012) [2024-06-15 21:07:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 1526464512. Throughput: 0: 10922.6. Samples: 381689856. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:15,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 21:07:16,947][1653645] Updated weights for policy 0, policy_version 745392 (0.0105) [2024-06-15 21:07:19,102][1653645] Updated weights for policy 0, policy_version 745447 (0.0013) [2024-06-15 21:07:20,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1526726656. Throughput: 0: 10865.8. Samples: 381721088. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:20,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 21:07:25,487][1653645] Updated weights for policy 0, policy_version 745491 (0.0012) [2024-06-15 21:07:25,958][1648982] Fps is (10 sec: 32767.4, 60 sec: 41506.0, 300 sec: 43764.7). Total num frames: 1526792192. Throughput: 0: 11014.9. Samples: 381790720. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:25,959][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 21:07:26,910][1653645] Updated weights for policy 0, policy_version 745554 (0.0011) [2024-06-15 21:07:29,139][1653645] Updated weights for policy 0, policy_version 745652 (0.0102) [2024-06-15 21:07:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.7, 300 sec: 44653.4). Total num frames: 1527185408. Throughput: 0: 10865.8. Samples: 381850624. 
Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:07:35,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.5, 300 sec: 43653.6). Total num frames: 1527250944. Throughput: 0: 10752.1. Samples: 381885440. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:35,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 21:07:36,745][1653645] Updated weights for policy 0, policy_version 745730 (0.0014) [2024-06-15 21:07:38,042][1653645] Updated weights for policy 0, policy_version 745792 (0.0011) [2024-06-15 21:07:38,990][1653645] Updated weights for policy 0, policy_version 745826 (0.0019) [2024-06-15 21:07:40,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43694.4, 300 sec: 44320.2). Total num frames: 1527611392. Throughput: 0: 11252.6. Samples: 381960192. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:40,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 21:07:40,980][1653645] Updated weights for policy 0, policy_version 745920 (0.0014) [2024-06-15 21:07:42,552][1653645] Updated weights for policy 0, policy_version 745980 (0.0012) [2024-06-15 21:07:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1527775232. Throughput: 0: 10855.7. Samples: 382021632. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:07:48,770][1651596] Signal inference workers to stop experience collection... (38750 times) [2024-06-15 21:07:48,815][1653645] InferenceWorker_p0-w0: stopping experience collection (38750 times) [2024-06-15 21:07:49,025][1651596] Signal inference workers to resume experience collection... 
(38750 times) [2024-06-15 21:07:49,027][1653645] InferenceWorker_p0-w0: resuming experience collection (38750 times) [2024-06-15 21:07:49,234][1653645] Updated weights for policy 0, policy_version 746020 (0.0011) [2024-06-15 21:07:50,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44237.2, 300 sec: 44098.0). Total num frames: 1528004608. Throughput: 0: 11275.4. Samples: 382064128. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:07:51,069][1653645] Updated weights for policy 0, policy_version 746099 (0.0012) [2024-06-15 21:07:52,574][1653645] Updated weights for policy 0, policy_version 746167 (0.0072) [2024-06-15 21:07:54,881][1653645] Updated weights for policy 0, policy_version 746224 (0.0013) [2024-06-15 21:07:55,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 1528299520. Throughput: 0: 11059.2. Samples: 382119936. Policy #0 lag: (min: 6.0, avg: 83.8, max: 262.0) [2024-06-15 21:07:55,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 21:08:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44787.2, 300 sec: 43875.8). Total num frames: 1528397824. Throughput: 0: 11116.1. Samples: 382190080. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:00,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 21:08:01,429][1653645] Updated weights for policy 0, policy_version 746307 (0.0025) [2024-06-15 21:08:03,519][1653645] Updated weights for policy 0, policy_version 746402 (0.0036) [2024-06-15 21:08:04,187][1653645] Updated weights for policy 0, policy_version 746432 (0.0011) [2024-06-15 21:08:05,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1528692736. Throughput: 0: 10979.6. Samples: 382215168. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:05,961][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:08:08,004][1653645] Updated weights for policy 0, policy_version 746496 (0.0013) [2024-06-15 21:08:10,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1528823808. Throughput: 0: 10945.5. Samples: 382283264. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 21:08:13,673][1653645] Updated weights for policy 0, policy_version 746578 (0.0209) [2024-06-15 21:08:15,278][1653645] Updated weights for policy 0, policy_version 746656 (0.0013) [2024-06-15 21:08:15,873][1653645] Updated weights for policy 0, policy_version 746688 (0.0015) [2024-06-15 21:08:15,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1529217024. Throughput: 0: 11093.3. Samples: 382349824. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:08:20,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1529348096. Throughput: 0: 11082.0. Samples: 382384128. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:20,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:08:23,776][1653645] Updated weights for policy 0, policy_version 746754 (0.0015) [2024-06-15 21:08:25,278][1653645] Updated weights for policy 0, policy_version 746819 (0.0012) [2024-06-15 21:08:25,963][1648982] Fps is (10 sec: 32768.2, 60 sec: 45875.5, 300 sec: 44209.1). Total num frames: 1529544704. Throughput: 0: 11104.7. Samples: 382459904. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:25,964][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:08:26,516][1653645] Updated weights for policy 0, policy_version 746880 (0.0012) [2024-06-15 21:08:26,994][1651596] Signal inference workers to stop experience collection... (38800 times) [2024-06-15 21:08:27,123][1653645] InferenceWorker_p0-w0: stopping experience collection (38800 times) [2024-06-15 21:08:27,250][1651596] Signal inference workers to resume experience collection... (38800 times) [2024-06-15 21:08:27,250][1653645] InferenceWorker_p0-w0: resuming experience collection (38800 times) [2024-06-15 21:08:27,904][1653645] Updated weights for policy 0, policy_version 746934 (0.0102) [2024-06-15 21:08:30,050][1653645] Updated weights for policy 0, policy_version 746998 (0.0011) [2024-06-15 21:08:30,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 1529872384. Throughput: 0: 11104.8. Samples: 382521344. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:30,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 21:08:35,665][1653645] Updated weights for policy 0, policy_version 747028 (0.0011) [2024-06-15 21:08:35,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44783.1, 300 sec: 43764.7). Total num frames: 1529937920. Throughput: 0: 11081.9. Samples: 382562816. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:08:37,462][1653645] Updated weights for policy 0, policy_version 747111 (0.0031) [2024-06-15 21:08:39,411][1653645] Updated weights for policy 0, policy_version 747195 (0.0012) [2024-06-15 21:08:40,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1530265600. Throughput: 0: 11070.6. Samples: 382618112. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:08:42,448][1653645] Updated weights for policy 0, policy_version 747263 (0.0012) [2024-06-15 21:08:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.9, 300 sec: 43986.8). Total num frames: 1530396672. Throughput: 0: 11047.8. Samples: 382687232. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:45,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 21:08:49,635][1653645] Updated weights for policy 0, policy_version 747344 (0.0012) [2024-06-15 21:08:50,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1530658816. Throughput: 0: 11320.9. Samples: 382724608. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 21:08:51,835][1653645] Updated weights for policy 0, policy_version 747429 (0.0049) [2024-06-15 21:08:54,202][1653645] Updated weights for policy 0, policy_version 747481 (0.0014) [2024-06-15 21:08:55,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 43690.6, 300 sec: 44431.1). Total num frames: 1530920960. Throughput: 0: 10968.1. Samples: 382776832. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:08:55,961][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:08:55,968][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000747520_1530920960.pth... [2024-06-15 21:08:56,025][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000742336_1520304128.pth [2024-06-15 21:09:00,130][1653645] Updated weights for policy 0, policy_version 747540 (0.0013) [2024-06-15 21:09:00,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1531019264. Throughput: 0: 11116.1. Samples: 382850048. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:09:02,198][1653645] Updated weights for policy 0, policy_version 747617 (0.0012) [2024-06-15 21:09:03,585][1653645] Updated weights for policy 0, policy_version 747666 (0.0020) [2024-06-15 21:09:04,703][1653645] Updated weights for policy 0, policy_version 747710 (0.0022) [2024-06-15 21:09:05,959][1648982] Fps is (10 sec: 42599.6, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1531346944. Throughput: 0: 10911.3. Samples: 382875136. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:05,960][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 21:09:06,909][1653645] Updated weights for policy 0, policy_version 747768 (0.0015) [2024-06-15 21:09:10,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1531445248. Throughput: 0: 10672.3. Samples: 382940160. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:09:13,394][1651596] Signal inference workers to stop experience collection... (38850 times) [2024-06-15 21:09:13,433][1653645] InferenceWorker_p0-w0: stopping experience collection (38850 times) [2024-06-15 21:09:13,676][1651596] Signal inference workers to resume experience collection... (38850 times) [2024-06-15 21:09:13,678][1653645] InferenceWorker_p0-w0: resuming experience collection (38850 times) [2024-06-15 21:09:14,261][1653645] Updated weights for policy 0, policy_version 747858 (0.0271) [2024-06-15 21:09:15,957][1648982] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 44098.0). Total num frames: 1531740160. Throughput: 0: 10535.9. Samples: 382995456. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:09:16,371][1653645] Updated weights for policy 0, policy_version 747939 (0.0010) [2024-06-15 21:09:19,646][1653645] Updated weights for policy 0, policy_version 748005 (0.0014) [2024-06-15 21:09:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1531969536. Throughput: 0: 10376.5. Samples: 383029760. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:09:24,412][1653645] Updated weights for policy 0, policy_version 748049 (0.0013) [2024-06-15 21:09:25,420][1653645] Updated weights for policy 0, policy_version 748096 (0.0013) [2024-06-15 21:09:25,958][1648982] Fps is (10 sec: 36042.9, 60 sec: 42598.1, 300 sec: 43653.6). Total num frames: 1532100608. Throughput: 0: 10763.3. Samples: 383102464. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:25,961][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 21:09:27,370][1653645] Updated weights for policy 0, policy_version 748161 (0.0138) [2024-06-15 21:09:30,753][1653645] Updated weights for policy 0, policy_version 748240 (0.0040) [2024-06-15 21:09:30,957][1648982] Fps is (10 sec: 42599.4, 60 sec: 42052.3, 300 sec: 44098.0). Total num frames: 1532395520. Throughput: 0: 10592.8. Samples: 383163904. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:30,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 21:09:35,472][1653645] Updated weights for policy 0, policy_version 748304 (0.0014) [2024-06-15 21:09:35,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 1532559360. Throughput: 0: 10615.5. Samples: 383202304. 
Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:35,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 21:09:37,077][1653645] Updated weights for policy 0, policy_version 748356 (0.0012) [2024-06-15 21:09:39,720][1653645] Updated weights for policy 0, policy_version 748448 (0.0104) [2024-06-15 21:09:40,958][1648982] Fps is (10 sec: 49149.5, 60 sec: 43690.4, 300 sec: 44097.9). Total num frames: 1532887040. Throughput: 0: 10877.1. Samples: 383266304. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:40,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 21:09:42,506][1653645] Updated weights for policy 0, policy_version 748497 (0.0012) [2024-06-15 21:09:45,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1533018112. Throughput: 0: 10979.5. Samples: 383344128. Policy #0 lag: (min: 15.0, avg: 86.1, max: 271.0) [2024-06-15 21:09:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:09:46,327][1653645] Updated weights for policy 0, policy_version 748548 (0.0015) [2024-06-15 21:09:48,505][1653645] Updated weights for policy 0, policy_version 748609 (0.0013) [2024-06-15 21:09:50,030][1653645] Updated weights for policy 0, policy_version 748673 (0.0146) [2024-06-15 21:09:50,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 45328.9, 300 sec: 44320.1). Total num frames: 1533378560. Throughput: 0: 11093.3. Samples: 383374336. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:09:50,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 21:09:51,149][1653645] Updated weights for policy 0, policy_version 748736 (0.0013) [2024-06-15 21:09:55,168][1653645] Updated weights for policy 0, policy_version 748788 (0.0013) [2024-06-15 21:09:55,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1533542400. Throughput: 0: 11104.7. Samples: 383439872. 
Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:09:55,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:09:58,088][1653645] Updated weights for policy 0, policy_version 748805 (0.0042) [2024-06-15 21:09:58,376][1651596] Signal inference workers to stop experience collection... (38900 times) [2024-06-15 21:09:58,437][1653645] InferenceWorker_p0-w0: stopping experience collection (38900 times) [2024-06-15 21:09:58,660][1651596] Signal inference workers to resume experience collection... (38900 times) [2024-06-15 21:09:58,660][1653645] InferenceWorker_p0-w0: resuming experience collection (38900 times) [2024-06-15 21:09:59,154][1653645] Updated weights for policy 0, policy_version 748864 (0.0012) [2024-06-15 21:10:00,959][1648982] Fps is (10 sec: 32768.8, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 1533706240. Throughput: 0: 11423.2. Samples: 383509504. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:00,960][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 21:10:01,779][1653645] Updated weights for policy 0, policy_version 748913 (0.0013) [2024-06-15 21:10:05,127][1653645] Updated weights for policy 0, policy_version 748993 (0.0013) [2024-06-15 21:10:05,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.8, 300 sec: 44209.1). Total num frames: 1534001152. Throughput: 0: 11298.2. Samples: 383538176. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:05,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 21:10:06,614][1653645] Updated weights for policy 0, policy_version 749056 (0.0012) [2024-06-15 21:10:10,966][1648982] Fps is (10 sec: 45837.0, 60 sec: 45322.8, 300 sec: 43874.6). Total num frames: 1534164992. Throughput: 0: 11296.1. Samples: 383610880. 
Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:10,967][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 21:10:12,661][1653645] Updated weights for policy 0, policy_version 749124 (0.0142) [2024-06-15 21:10:14,154][1653645] Updated weights for policy 0, policy_version 749187 (0.0052) [2024-06-15 21:10:15,524][1653645] Updated weights for policy 0, policy_version 749248 (0.0015) [2024-06-15 21:10:15,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45329.0, 300 sec: 44098.0). Total num frames: 1534459904. Throughput: 0: 11218.5. Samples: 383668736. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:10:18,745][1653645] Updated weights for policy 0, policy_version 749312 (0.0012) [2024-06-15 21:10:20,958][1648982] Fps is (10 sec: 42634.2, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1534590976. Throughput: 0: 11116.1. Samples: 383702528. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:10:22,421][1653645] Updated weights for policy 0, policy_version 749365 (0.0013) [2024-06-15 21:10:25,581][1653645] Updated weights for policy 0, policy_version 749424 (0.0100) [2024-06-15 21:10:26,008][1648982] Fps is (10 sec: 39123.9, 60 sec: 45836.9, 300 sec: 43979.4). Total num frames: 1534853120. Throughput: 0: 11296.9. Samples: 383775232. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:26,011][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 21:10:27,508][1653645] Updated weights for policy 0, policy_version 749497 (0.0032) [2024-06-15 21:10:29,852][1653645] Updated weights for policy 0, policy_version 749524 (0.0014) [2024-06-15 21:10:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 1535115264. Throughput: 0: 10888.6. Samples: 383834112. 
Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:10:33,158][1653645] Updated weights for policy 0, policy_version 749569 (0.0019) [2024-06-15 21:10:34,477][1653645] Updated weights for policy 0, policy_version 749630 (0.0013) [2024-06-15 21:10:35,958][1648982] Fps is (10 sec: 39521.0, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 1535246336. Throughput: 0: 11047.9. Samples: 383871488. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:35,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:10:39,104][1653645] Updated weights for policy 0, policy_version 749698 (0.0017) [2024-06-15 21:10:40,305][1653645] Updated weights for policy 0, policy_version 749752 (0.0014) [2024-06-15 21:10:40,958][1648982] Fps is (10 sec: 39320.4, 60 sec: 43690.7, 300 sec: 43986.8). Total num frames: 1535508480. Throughput: 0: 10831.6. Samples: 383927296. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:40,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 21:10:41,946][1651596] Signal inference workers to stop experience collection... (38950 times) [2024-06-15 21:10:41,994][1653645] InferenceWorker_p0-w0: stopping experience collection (38950 times) [2024-06-15 21:10:42,130][1651596] Signal inference workers to resume experience collection... (38950 times) [2024-06-15 21:10:42,146][1653645] InferenceWorker_p0-w0: resuming experience collection (38950 times) [2024-06-15 21:10:42,148][1653645] Updated weights for policy 0, policy_version 749808 (0.0010) [2024-06-15 21:10:45,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.9, 300 sec: 43653.6). Total num frames: 1535672320. Throughput: 0: 10934.1. Samples: 384001536. 
Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:10:46,072][1653645] Updated weights for policy 0, policy_version 749856 (0.0012) [2024-06-15 21:10:50,298][1653645] Updated weights for policy 0, policy_version 749922 (0.0012) [2024-06-15 21:10:50,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 42052.5, 300 sec: 43653.6). Total num frames: 1535901696. Throughput: 0: 10979.5. Samples: 384032256. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:50,962][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:10:52,327][1653645] Updated weights for policy 0, policy_version 750007 (0.0011) [2024-06-15 21:10:54,176][1653645] Updated weights for policy 0, policy_version 750076 (0.0013) [2024-06-15 21:10:55,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1536163840. Throughput: 0: 10697.1. Samples: 384092160. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:10:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:10:55,970][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000750080_1536163840.pth... [2024-06-15 21:10:56,019][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000744960_1525678080.pth [2024-06-15 21:10:56,024][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000750080_1536163840.pth [2024-06-15 21:10:58,411][1653645] Updated weights for policy 0, policy_version 750128 (0.0016) [2024-06-15 21:11:00,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 1536294912. Throughput: 0: 11025.0. Samples: 384164864. 
Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:11:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:11:01,558][1653645] Updated weights for policy 0, policy_version 750164 (0.0012) [2024-06-15 21:11:03,438][1653645] Updated weights for policy 0, policy_version 750241 (0.0017) [2024-06-15 21:11:05,408][1653645] Updated weights for policy 0, policy_version 750304 (0.0013) [2024-06-15 21:11:05,958][1648982] Fps is (10 sec: 49150.8, 60 sec: 44236.5, 300 sec: 44320.1). Total num frames: 1536655360. Throughput: 0: 10877.1. Samples: 384192000. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:11:05,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:11:08,717][1653645] Updated weights for policy 0, policy_version 750338 (0.0012) [2024-06-15 21:11:10,386][1653645] Updated weights for policy 0, policy_version 750400 (0.0033) [2024-06-15 21:11:10,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44243.0, 300 sec: 43764.7). Total num frames: 1536819200. Throughput: 0: 10912.1. Samples: 384265728. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:11:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:11:13,998][1653645] Updated weights for policy 0, policy_version 750448 (0.0010) [2024-06-15 21:11:15,715][1653645] Updated weights for policy 0, policy_version 750528 (0.0090) [2024-06-15 21:11:15,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.3, 300 sec: 43986.8). Total num frames: 1537081344. Throughput: 0: 10945.3. Samples: 384326656. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:11:15,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:11:17,839][1653645] Updated weights for policy 0, policy_version 750588 (0.0121) [2024-06-15 21:11:20,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 1537245184. Throughput: 0: 10877.1. Samples: 384360960. 
Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:11:20,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:11:21,810][1653645] Updated weights for policy 0, policy_version 750656 (0.0152) [2024-06-15 21:11:25,958][1648982] Fps is (10 sec: 26215.0, 60 sec: 41541.0, 300 sec: 43431.5). Total num frames: 1537343488. Throughput: 0: 11093.4. Samples: 384426496. Policy #0 lag: (min: 12.0, avg: 93.3, max: 268.0) [2024-06-15 21:11:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:11:27,724][1653645] Updated weights for policy 0, policy_version 750736 (0.0013) [2024-06-15 21:11:29,358][1651596] Signal inference workers to stop experience collection... (39000 times) [2024-06-15 21:11:29,406][1653645] InferenceWorker_p0-w0: stopping experience collection (39000 times) [2024-06-15 21:11:29,552][1651596] Signal inference workers to resume experience collection... (39000 times) [2024-06-15 21:11:29,553][1653645] InferenceWorker_p0-w0: resuming experience collection (39000 times) [2024-06-15 21:11:29,555][1653645] Updated weights for policy 0, policy_version 750800 (0.0047) [2024-06-15 21:11:30,857][1653645] Updated weights for policy 0, policy_version 750845 (0.0014) [2024-06-15 21:11:30,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 1537736704. Throughput: 0: 10683.7. Samples: 384482304. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:11:30,959][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:11:33,971][1653645] Updated weights for policy 0, policy_version 750912 (0.0183) [2024-06-15 21:11:35,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.6, 300 sec: 43654.4). Total num frames: 1537867776. Throughput: 0: 10797.5. Samples: 384518144. 
Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:11:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:11:39,134][1653645] Updated weights for policy 0, policy_version 750976 (0.0112) [2024-06-15 21:11:40,325][1653645] Updated weights for policy 0, policy_version 751033 (0.0012) [2024-06-15 21:11:40,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1538129920. Throughput: 0: 10888.5. Samples: 384582144. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:11:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:11:42,463][1653645] Updated weights for policy 0, policy_version 751076 (0.0013) [2024-06-15 21:11:45,396][1653645] Updated weights for policy 0, policy_version 751105 (0.0014) [2024-06-15 21:11:45,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44236.7, 300 sec: 43986.9). Total num frames: 1538326528. Throughput: 0: 10717.8. Samples: 384647168. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:11:45,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:11:46,376][1653645] Updated weights for policy 0, policy_version 751159 (0.0106) [2024-06-15 21:11:50,561][1653645] Updated weights for policy 0, policy_version 751220 (0.0013) [2024-06-15 21:11:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 1538523136. Throughput: 0: 11002.4. Samples: 384687104. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:11:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:11:52,149][1653645] Updated weights for policy 0, policy_version 751283 (0.0012) [2024-06-15 21:11:54,174][1653645] Updated weights for policy 0, policy_version 751352 (0.0044) [2024-06-15 21:11:55,963][1648982] Fps is (10 sec: 45850.1, 60 sec: 43686.6, 300 sec: 44320.1). Total num frames: 1538785280. Throughput: 0: 10716.5. Samples: 384748032. 
Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:11:55,964][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:11:57,241][1653645] Updated weights for policy 0, policy_version 751396 (0.0046) [2024-06-15 21:12:00,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1538916352. Throughput: 0: 11059.3. Samples: 384824320. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:12:01,381][1653645] Updated weights for policy 0, policy_version 751429 (0.0089) [2024-06-15 21:12:03,665][1653645] Updated weights for policy 0, policy_version 751520 (0.0089) [2024-06-15 21:12:05,764][1653645] Updated weights for policy 0, policy_version 751555 (0.0019) [2024-06-15 21:12:05,958][1648982] Fps is (10 sec: 39343.9, 60 sec: 42052.5, 300 sec: 43986.9). Total num frames: 1539178496. Throughput: 0: 10911.3. Samples: 384851968. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:12:08,308][1653645] Updated weights for policy 0, policy_version 751622 (0.0015) [2024-06-15 21:12:09,469][1653645] Updated weights for policy 0, policy_version 751676 (0.0086) [2024-06-15 21:12:10,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1539440640. Throughput: 0: 10763.4. Samples: 384910848. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:12:14,632][1653645] Updated weights for policy 0, policy_version 751717 (0.0030) [2024-06-15 21:12:15,817][1651596] Signal inference workers to stop experience collection... (39050 times) [2024-06-15 21:12:15,860][1653645] InferenceWorker_p0-w0: stopping experience collection (39050 times) [2024-06-15 21:12:15,960][1648982] Fps is (10 sec: 42598.1, 60 sec: 42052.5, 300 sec: 43653.6). 
Total num frames: 1539604480. Throughput: 0: 11059.2. Samples: 384979968. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:15,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:12:16,097][1651596] Signal inference workers to resume experience collection... (39050 times) [2024-06-15 21:12:16,097][1653645] InferenceWorker_p0-w0: resuming experience collection (39050 times) [2024-06-15 21:12:16,610][1653645] Updated weights for policy 0, policy_version 751804 (0.0144) [2024-06-15 21:12:20,509][1653645] Updated weights for policy 0, policy_version 751873 (0.0018) [2024-06-15 21:12:20,962][1648982] Fps is (10 sec: 42599.5, 60 sec: 43690.7, 300 sec: 44320.2). Total num frames: 1539866624. Throughput: 0: 10888.6. Samples: 385008128. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:20,962][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:12:21,763][1653645] Updated weights for policy 0, policy_version 751936 (0.0020) [2024-06-15 21:12:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.9, 300 sec: 43431.5). Total num frames: 1539997696. Throughput: 0: 11047.8. Samples: 385079296. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:12:27,363][1653645] Updated weights for policy 0, policy_version 752016 (0.0014) [2024-06-15 21:12:29,778][1653645] Updated weights for policy 0, policy_version 752069 (0.0015) [2024-06-15 21:12:30,883][1653645] Updated weights for policy 0, policy_version 752121 (0.0013) [2024-06-15 21:12:30,958][1648982] Fps is (10 sec: 45873.7, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1540325376. Throughput: 0: 10956.8. Samples: 385140224. 
Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:30,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:12:32,829][1653645] Updated weights for policy 0, policy_version 752166 (0.0014) [2024-06-15 21:12:35,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 1540489216. Throughput: 0: 10911.3. Samples: 385178112. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:35,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:12:37,523][1653645] Updated weights for policy 0, policy_version 752210 (0.0014) [2024-06-15 21:12:39,041][1653645] Updated weights for policy 0, policy_version 752273 (0.0103) [2024-06-15 21:12:40,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1540751360. Throughput: 0: 11049.2. Samples: 385245184. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:12:41,501][1653645] Updated weights for policy 0, policy_version 752336 (0.0012) [2024-06-15 21:12:42,653][1653645] Updated weights for policy 0, policy_version 752383 (0.0065) [2024-06-15 21:12:44,995][1653645] Updated weights for policy 0, policy_version 752442 (0.0013) [2024-06-15 21:12:45,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44783.0, 300 sec: 44097.9). Total num frames: 1541013504. Throughput: 0: 10854.4. Samples: 385312768. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:12:49,743][1653645] Updated weights for policy 0, policy_version 752496 (0.0014) [2024-06-15 21:12:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44236.8, 300 sec: 43653.7). Total num frames: 1541177344. Throughput: 0: 11104.7. Samples: 385351680. 
Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:50,958][1648982] Avg episode reward: [(0, '37.140')] [2024-06-15 21:12:52,016][1653645] Updated weights for policy 0, policy_version 752569 (0.0012) [2024-06-15 21:12:53,465][1653645] Updated weights for policy 0, policy_version 752608 (0.0012) [2024-06-15 21:12:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44240.9, 300 sec: 44209.0). Total num frames: 1541439488. Throughput: 0: 11059.2. Samples: 385408512. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:12:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:12:56,317][1653645] Updated weights for policy 0, policy_version 752673 (0.0013) [2024-06-15 21:12:56,706][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000752688_1541505024.pth... [2024-06-15 21:12:56,765][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000747520_1530920960.pth [2024-06-15 21:13:00,959][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1541537792. Throughput: 0: 11150.2. Samples: 385481728. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:13:00,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:13:01,207][1653645] Updated weights for policy 0, policy_version 752707 (0.0018) [2024-06-15 21:13:02,396][1653645] Updated weights for policy 0, policy_version 752753 (0.0017) [2024-06-15 21:13:02,883][1651596] Signal inference workers to stop experience collection... (39100 times) [2024-06-15 21:13:02,995][1653645] InferenceWorker_p0-w0: stopping experience collection (39100 times) [2024-06-15 21:13:03,104][1651596] Signal inference workers to resume experience collection... 
(39100 times) [2024-06-15 21:13:03,105][1653645] InferenceWorker_p0-w0: resuming experience collection (39100 times) [2024-06-15 21:13:04,111][1653645] Updated weights for policy 0, policy_version 752827 (0.0120) [2024-06-15 21:13:05,459][1653645] Updated weights for policy 0, policy_version 752866 (0.0012) [2024-06-15 21:13:05,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1541931008. Throughput: 0: 11195.7. Samples: 385511936. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:13:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:13:07,663][1653645] Updated weights for policy 0, policy_version 752916 (0.0037) [2024-06-15 21:13:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1542062080. Throughput: 0: 11138.8. Samples: 385580544. Policy #0 lag: (min: 33.0, avg: 177.9, max: 289.0) [2024-06-15 21:13:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:13:13,709][1653645] Updated weights for policy 0, policy_version 752993 (0.0030) [2024-06-15 21:13:15,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 1542258688. Throughput: 0: 11161.7. Samples: 385642496. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:13:16,152][1653645] Updated weights for policy 0, policy_version 753081 (0.0013) [2024-06-15 21:13:17,460][1653645] Updated weights for policy 0, policy_version 753124 (0.0038) [2024-06-15 21:13:19,288][1653645] Updated weights for policy 0, policy_version 753154 (0.0055) [2024-06-15 21:13:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45329.0, 300 sec: 44209.0). Total num frames: 1542586368. Throughput: 0: 11025.1. Samples: 385674240. 
Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:13:25,110][1653645] Updated weights for policy 0, policy_version 753217 (0.0088) [2024-06-15 21:13:25,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 1542651904. Throughput: 0: 11059.2. Samples: 385742848. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:13:26,847][1653645] Updated weights for policy 0, policy_version 753284 (0.0014) [2024-06-15 21:13:28,377][1653645] Updated weights for policy 0, policy_version 753347 (0.0010) [2024-06-15 21:13:29,500][1653645] Updated weights for policy 0, policy_version 753395 (0.0010) [2024-06-15 21:13:30,961][1648982] Fps is (10 sec: 39308.8, 60 sec: 44234.6, 300 sec: 44208.5). Total num frames: 1542979584. Throughput: 0: 11035.7. Samples: 385809408. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:30,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:13:31,753][1653645] Updated weights for policy 0, policy_version 753425 (0.0013) [2024-06-15 21:13:35,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1543110656. Throughput: 0: 11002.3. Samples: 385846784. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:35,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:13:36,357][1653645] Updated weights for policy 0, policy_version 753475 (0.0012) [2024-06-15 21:13:38,065][1653645] Updated weights for policy 0, policy_version 753540 (0.0070) [2024-06-15 21:13:39,571][1653645] Updated weights for policy 0, policy_version 753600 (0.0011) [2024-06-15 21:13:40,958][1648982] Fps is (10 sec: 49168.1, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1543471104. Throughput: 0: 11241.3. Samples: 385914368. 
Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:13:43,015][1653645] Updated weights for policy 0, policy_version 753665 (0.0013) [2024-06-15 21:13:45,959][1648982] Fps is (10 sec: 52424.4, 60 sec: 43689.9, 300 sec: 43986.7). Total num frames: 1543634944. Throughput: 0: 11081.7. Samples: 385980416. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:45,959][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:13:48,223][1651596] Signal inference workers to stop experience collection... (39150 times) [2024-06-15 21:13:48,280][1653645] InferenceWorker_p0-w0: stopping experience collection (39150 times) [2024-06-15 21:13:48,451][1651596] Signal inference workers to resume experience collection... (39150 times) [2024-06-15 21:13:48,451][1653645] InferenceWorker_p0-w0: resuming experience collection (39150 times) [2024-06-15 21:13:48,704][1653645] Updated weights for policy 0, policy_version 753749 (0.0012) [2024-06-15 21:13:50,563][1653645] Updated weights for policy 0, policy_version 753824 (0.0019) [2024-06-15 21:13:50,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 1543864320. Throughput: 0: 11252.6. Samples: 386018304. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:13:52,612][1653645] Updated weights for policy 0, policy_version 753904 (0.0011) [2024-06-15 21:13:55,440][1653645] Updated weights for policy 0, policy_version 753952 (0.0015) [2024-06-15 21:13:55,972][1648982] Fps is (10 sec: 52357.6, 60 sec: 45318.0, 300 sec: 44540.0). Total num frames: 1544159232. Throughput: 0: 11021.5. Samples: 386076672. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:13:55,973][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:14:00,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 44782.9, 300 sec: 43653.6). 
Total num frames: 1544224768. Throughput: 0: 11241.2. Samples: 386148352. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:14:00,972][1653645] Updated weights for policy 0, policy_version 754032 (0.0095) [2024-06-15 21:14:03,163][1653645] Updated weights for policy 0, policy_version 754099 (0.0011) [2024-06-15 21:14:04,551][1653645] Updated weights for policy 0, policy_version 754168 (0.0015) [2024-06-15 21:14:05,960][1648982] Fps is (10 sec: 39379.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1544552448. Throughput: 0: 11138.8. Samples: 386175488. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:05,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:14:07,645][1653645] Updated weights for policy 0, policy_version 754212 (0.0012) [2024-06-15 21:14:10,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1544683520. Throughput: 0: 11070.6. Samples: 386241024. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:14:12,855][1653645] Updated weights for policy 0, policy_version 754275 (0.0016) [2024-06-15 21:14:14,101][1653645] Updated weights for policy 0, policy_version 754327 (0.0012) [2024-06-15 21:14:15,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 1544978432. Throughput: 0: 11082.8. Samples: 386308096. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:14:16,288][1653645] Updated weights for policy 0, policy_version 754403 (0.0014) [2024-06-15 21:14:18,541][1653645] Updated weights for policy 0, policy_version 754464 (0.0013) [2024-06-15 21:14:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1545207808. Throughput: 0: 11138.9. 
Samples: 386348032. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:14:25,110][1653645] Updated weights for policy 0, policy_version 754546 (0.0014) [2024-06-15 21:14:25,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 45329.1, 300 sec: 43986.8). Total num frames: 1545371648. Throughput: 0: 11150.2. Samples: 386416128. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:14:26,435][1653645] Updated weights for policy 0, policy_version 754609 (0.0052) [2024-06-15 21:14:28,057][1653645] Updated weights for policy 0, policy_version 754663 (0.0012) [2024-06-15 21:14:30,247][1653645] Updated weights for policy 0, policy_version 754704 (0.0082) [2024-06-15 21:14:30,280][1651596] Signal inference workers to stop experience collection... (39200 times) [2024-06-15 21:14:30,343][1653645] InferenceWorker_p0-w0: stopping experience collection (39200 times) [2024-06-15 21:14:30,529][1651596] Signal inference workers to resume experience collection... (39200 times) [2024-06-15 21:14:30,530][1653645] InferenceWorker_p0-w0: resuming experience collection (39200 times) [2024-06-15 21:14:30,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45331.5, 300 sec: 44542.3). Total num frames: 1545699328. Throughput: 0: 11025.3. Samples: 386476544. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:14:31,197][1653645] Updated weights for policy 0, policy_version 754752 (0.0012) [2024-06-15 21:14:35,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44237.0, 300 sec: 43653.7). Total num frames: 1545764864. Throughput: 0: 11047.8. Samples: 386515456. 
Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:14:37,071][1653645] Updated weights for policy 0, policy_version 754821 (0.0013) [2024-06-15 21:14:38,363][1653645] Updated weights for policy 0, policy_version 754881 (0.0012) [2024-06-15 21:14:39,527][1653645] Updated weights for policy 0, policy_version 754938 (0.0037) [2024-06-15 21:14:40,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 44236.5, 300 sec: 44431.1). Total num frames: 1546125312. Throughput: 0: 11085.5. Samples: 386575360. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:14:42,978][1653645] Updated weights for policy 0, policy_version 754999 (0.0012) [2024-06-15 21:14:45,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43691.5, 300 sec: 43653.7). Total num frames: 1546256384. Throughput: 0: 11138.9. Samples: 386649600. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:14:47,445][1653645] Updated weights for policy 0, policy_version 755040 (0.0012) [2024-06-15 21:14:48,649][1653645] Updated weights for policy 0, policy_version 755104 (0.0014) [2024-06-15 21:14:50,166][1653645] Updated weights for policy 0, policy_version 755155 (0.0020) [2024-06-15 21:14:50,958][1648982] Fps is (10 sec: 49153.1, 60 sec: 45875.1, 300 sec: 44320.1). Total num frames: 1546616832. Throughput: 0: 11343.6. Samples: 386685952. Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:14:53,548][1653645] Updated weights for policy 0, policy_version 755204 (0.0014) [2024-06-15 21:14:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43701.3, 300 sec: 44320.1). Total num frames: 1546780672. Throughput: 0: 11298.1. Samples: 386749440. 
Policy #0 lag: (min: 52.0, avg: 120.3, max: 308.0) [2024-06-15 21:14:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:14:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000755264_1546780672.pth... [2024-06-15 21:14:56,007][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000750080_1536163840.pth [2024-06-15 21:14:58,403][1653645] Updated weights for policy 0, policy_version 755281 (0.0014) [2024-06-15 21:14:59,343][1653645] Updated weights for policy 0, policy_version 755332 (0.0013) [2024-06-15 21:15:00,605][1653645] Updated weights for policy 0, policy_version 755389 (0.0014) [2024-06-15 21:15:00,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 46967.6, 300 sec: 44209.0). Total num frames: 1547042816. Throughput: 0: 11400.5. Samples: 386821120. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:00,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:15:02,179][1653645] Updated weights for policy 0, policy_version 755440 (0.0015) [2024-06-15 21:15:05,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44782.9, 300 sec: 44321.4). Total num frames: 1547239424. Throughput: 0: 11207.1. Samples: 386852352. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:15:06,095][1653645] Updated weights for policy 0, policy_version 755492 (0.0011) [2024-06-15 21:15:09,752][1653645] Updated weights for policy 0, policy_version 755536 (0.0012) [2024-06-15 21:15:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1547436032. Throughput: 0: 11411.9. Samples: 386929664. 
Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:15:11,282][1653645] Updated weights for policy 0, policy_version 755600 (0.0089) [2024-06-15 21:15:13,150][1653645] Updated weights for policy 0, policy_version 755681 (0.0013) [2024-06-15 21:15:15,958][1648982] Fps is (10 sec: 45874.0, 60 sec: 45328.8, 300 sec: 44431.1). Total num frames: 1547698176. Throughput: 0: 11389.1. Samples: 386989056. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:15:16,937][1651596] Signal inference workers to stop experience collection... (39250 times) [2024-06-15 21:15:16,974][1653645] InferenceWorker_p0-w0: stopping experience collection (39250 times) [2024-06-15 21:15:17,296][1651596] Signal inference workers to resume experience collection... (39250 times) [2024-06-15 21:15:17,297][1653645] InferenceWorker_p0-w0: resuming experience collection (39250 times) [2024-06-15 21:15:17,678][1653645] Updated weights for policy 0, policy_version 755744 (0.0017) [2024-06-15 21:15:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.9, 300 sec: 44105.5). Total num frames: 1547862016. Throughput: 0: 11320.9. Samples: 387024896. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:15:21,239][1653645] Updated weights for policy 0, policy_version 755808 (0.0048) [2024-06-15 21:15:22,946][1653645] Updated weights for policy 0, policy_version 755872 (0.0037) [2024-06-15 21:15:25,277][1653645] Updated weights for policy 0, policy_version 755959 (0.0016) [2024-06-15 21:15:25,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 47513.6, 300 sec: 44431.2). Total num frames: 1548222464. Throughput: 0: 11434.7. Samples: 387089920. 
Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:15:30,376][1653645] Updated weights for policy 0, policy_version 756032 (0.0013) [2024-06-15 21:15:30,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1548353536. Throughput: 0: 11218.5. Samples: 387154432. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:15:33,755][1653645] Updated weights for policy 0, policy_version 756089 (0.0028) [2024-06-15 21:15:35,403][1653645] Updated weights for policy 0, policy_version 756150 (0.0012) [2024-06-15 21:15:35,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 47513.3, 300 sec: 44431.2). Total num frames: 1548615680. Throughput: 0: 11320.8. Samples: 387195392. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:35,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:15:37,115][1653645] Updated weights for policy 0, policy_version 756217 (0.0012) [2024-06-15 21:15:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.9, 300 sec: 44320.1). Total num frames: 1548746752. Throughput: 0: 11207.1. Samples: 387253760. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:15:42,047][1653645] Updated weights for policy 0, policy_version 756278 (0.0012) [2024-06-15 21:15:45,900][1653645] Updated weights for policy 0, policy_version 756322 (0.0013) [2024-06-15 21:15:45,958][1648982] Fps is (10 sec: 32768.9, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1548943360. Throughput: 0: 11298.1. Samples: 387329536. 
Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:45,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:15:47,460][1653645] Updated weights for policy 0, policy_version 756388 (0.0014) [2024-06-15 21:15:49,545][1653645] Updated weights for policy 0, policy_version 756471 (0.0015) [2024-06-15 21:15:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1549271040. Throughput: 0: 11161.6. Samples: 387354624. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:15:53,443][1653645] Updated weights for policy 0, policy_version 756539 (0.0013) [2024-06-15 21:15:55,957][1648982] Fps is (10 sec: 45875.9, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1549402112. Throughput: 0: 10979.6. Samples: 387423744. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:15:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:15:57,612][1653645] Updated weights for policy 0, policy_version 756578 (0.0014) [2024-06-15 21:15:59,296][1653645] Updated weights for policy 0, policy_version 756640 (0.0017) [2024-06-15 21:16:00,484][1651596] Signal inference workers to stop experience collection... (39300 times) [2024-06-15 21:16:00,592][1653645] InferenceWorker_p0-w0: stopping experience collection (39300 times) [2024-06-15 21:16:00,746][1651596] Signal inference workers to resume experience collection... (39300 times) [2024-06-15 21:16:00,747][1653645] InferenceWorker_p0-w0: resuming experience collection (39300 times) [2024-06-15 21:16:00,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.0, 300 sec: 44320.2). Total num frames: 1549729792. Throughput: 0: 11093.4. Samples: 387488256. 
Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:16:01,174][1653645] Updated weights for policy 0, policy_version 756720 (0.0014) [2024-06-15 21:16:05,333][1653645] Updated weights for policy 0, policy_version 756795 (0.0013) [2024-06-15 21:16:05,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1549926400. Throughput: 0: 11025.1. Samples: 387521024. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:16:10,280][1653645] Updated weights for policy 0, policy_version 756859 (0.0030) [2024-06-15 21:16:10,958][1648982] Fps is (10 sec: 32766.7, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1550057472. Throughput: 0: 11161.5. Samples: 387592192. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:16:12,569][1653645] Updated weights for policy 0, policy_version 756928 (0.0012) [2024-06-15 21:16:15,742][1653645] Updated weights for policy 0, policy_version 756995 (0.0112) [2024-06-15 21:16:15,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 44237.0, 300 sec: 44431.2). Total num frames: 1550352384. Throughput: 0: 10990.9. Samples: 387649024. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:16:16,837][1653645] Updated weights for policy 0, policy_version 757056 (0.0109) [2024-06-15 21:16:20,958][1648982] Fps is (10 sec: 39323.2, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 1550450688. Throughput: 0: 10888.6. Samples: 387685376. 
Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:16:22,334][1653645] Updated weights for policy 0, policy_version 757117 (0.0014) [2024-06-15 21:16:23,941][1653645] Updated weights for policy 0, policy_version 757172 (0.0102) [2024-06-15 21:16:25,207][1653645] Updated weights for policy 0, policy_version 757248 (0.0013) [2024-06-15 21:16:25,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1550843904. Throughput: 0: 11013.7. Samples: 387749376. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:16:27,962][1653645] Updated weights for policy 0, policy_version 757309 (0.0013) [2024-06-15 21:16:30,958][1648982] Fps is (10 sec: 52426.3, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1550974976. Throughput: 0: 10956.7. Samples: 387822592. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:30,959][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 21:16:34,424][1653645] Updated weights for policy 0, policy_version 757392 (0.0029) [2024-06-15 21:16:35,155][1653645] Updated weights for policy 0, policy_version 757435 (0.0019) [2024-06-15 21:16:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1551237120. Throughput: 0: 11161.6. Samples: 387856896. Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:16:37,361][1653645] Updated weights for policy 0, policy_version 757504 (0.0019) [2024-06-15 21:16:38,667][1653645] Updated weights for policy 0, policy_version 757552 (0.0012) [2024-06-15 21:16:40,958][1648982] Fps is (10 sec: 52430.8, 60 sec: 45875.2, 300 sec: 44653.4). Total num frames: 1551499264. Throughput: 0: 11150.2. Samples: 387925504. 
Policy #0 lag: (min: 13.0, avg: 88.6, max: 269.0) [2024-06-15 21:16:40,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:16:45,087][1653645] Updated weights for policy 0, policy_version 757625 (0.0015) [2024-06-15 21:16:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45329.0, 300 sec: 44542.3). Total num frames: 1551663104. Throughput: 0: 11218.5. Samples: 387993088. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:16:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:16:46,553][1653645] Updated weights for policy 0, policy_version 757680 (0.0013) [2024-06-15 21:16:47,717][1651596] Signal inference workers to stop experience collection... (39350 times) [2024-06-15 21:16:47,799][1653645] InferenceWorker_p0-w0: stopping experience collection (39350 times) [2024-06-15 21:16:47,956][1651596] Signal inference workers to resume experience collection... (39350 times) [2024-06-15 21:16:47,957][1653645] InferenceWorker_p0-w0: resuming experience collection (39350 times) [2024-06-15 21:16:48,946][1653645] Updated weights for policy 0, policy_version 757747 (0.0014) [2024-06-15 21:16:50,671][1653645] Updated weights for policy 0, policy_version 757819 (0.0038) [2024-06-15 21:16:50,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 44876.4). Total num frames: 1552023552. Throughput: 0: 11264.0. Samples: 388027904. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:16:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:16:55,958][1648982] Fps is (10 sec: 36043.5, 60 sec: 43690.3, 300 sec: 44431.1). Total num frames: 1552023552. Throughput: 0: 11059.2. Samples: 388089856. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:16:55,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:16:55,968][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000757824_1552023552.pth... 
[2024-06-15 21:16:56,017][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000752688_1541505024.pth [2024-06-15 21:16:57,548][1653645] Updated weights for policy 0, policy_version 757877 (0.0014) [2024-06-15 21:16:58,935][1653645] Updated weights for policy 0, policy_version 757951 (0.0014) [2024-06-15 21:17:00,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 43690.5, 300 sec: 44653.3). Total num frames: 1552351232. Throughput: 0: 11343.6. Samples: 388159488. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:00,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:17:01,412][1653645] Updated weights for policy 0, policy_version 758002 (0.0011) [2024-06-15 21:17:02,858][1653645] Updated weights for policy 0, policy_version 758073 (0.0013) [2024-06-15 21:17:05,962][1648982] Fps is (10 sec: 52406.8, 60 sec: 43687.3, 300 sec: 44430.5). Total num frames: 1552547840. Throughput: 0: 11183.2. Samples: 388188672. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:05,963][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:17:09,226][1653645] Updated weights for policy 0, policy_version 758144 (0.0013) [2024-06-15 21:17:10,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 45329.3, 300 sec: 44653.3). Total num frames: 1552777216. Throughput: 0: 11286.8. Samples: 388257280. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:17:10,965][1653645] Updated weights for policy 0, policy_version 758204 (0.0014) [2024-06-15 21:17:12,929][1653645] Updated weights for policy 0, policy_version 758241 (0.0013) [2024-06-15 21:17:14,541][1653645] Updated weights for policy 0, policy_version 758328 (0.0012) [2024-06-15 21:17:15,958][1648982] Fps is (10 sec: 52452.1, 60 sec: 45328.9, 300 sec: 44764.4). Total num frames: 1553072128. Throughput: 0: 11013.7. Samples: 388318208. 
Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:15,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:17:20,957][1648982] Fps is (10 sec: 32768.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1553104896. Throughput: 0: 11059.2. Samples: 388354560. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:17:21,900][1653645] Updated weights for policy 0, policy_version 758400 (0.0133) [2024-06-15 21:17:23,229][1653645] Updated weights for policy 0, policy_version 758458 (0.0012) [2024-06-15 21:17:25,568][1653645] Updated weights for policy 0, policy_version 758528 (0.0013) [2024-06-15 21:17:25,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1553465344. Throughput: 0: 10922.7. Samples: 388417024. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:17:27,057][1653645] Updated weights for policy 0, policy_version 758583 (0.0012) [2024-06-15 21:17:30,958][1648982] Fps is (10 sec: 49149.5, 60 sec: 43690.7, 300 sec: 44431.1). Total num frames: 1553596416. Throughput: 0: 10729.2. Samples: 388475904. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:17:33,640][1651596] Signal inference workers to stop experience collection... (39400 times) [2024-06-15 21:17:33,717][1653645] InferenceWorker_p0-w0: stopping experience collection (39400 times) [2024-06-15 21:17:33,941][1651596] Signal inference workers to resume experience collection... 
(39400 times) [2024-06-15 21:17:33,942][1653645] InferenceWorker_p0-w0: resuming experience collection (39400 times) [2024-06-15 21:17:34,231][1653645] Updated weights for policy 0, policy_version 758628 (0.0129) [2024-06-15 21:17:35,568][1653645] Updated weights for policy 0, policy_version 758688 (0.0012) [2024-06-15 21:17:35,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1553825792. Throughput: 0: 10774.7. Samples: 388512768. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:17:36,205][1653645] Updated weights for policy 0, policy_version 758718 (0.0018) [2024-06-15 21:17:37,727][1653645] Updated weights for policy 0, policy_version 758768 (0.0047) [2024-06-15 21:17:39,475][1653645] Updated weights for policy 0, policy_version 758846 (0.0089) [2024-06-15 21:17:40,958][1648982] Fps is (10 sec: 52430.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1554120704. Throughput: 0: 10661.0. Samples: 388569600. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:40,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:17:45,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42052.2, 300 sec: 44098.0). Total num frames: 1554186240. Throughput: 0: 10797.5. Samples: 388645376. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:45,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:17:47,071][1653645] Updated weights for policy 0, policy_version 758928 (0.0011) [2024-06-15 21:17:49,065][1653645] Updated weights for policy 0, policy_version 758978 (0.0018) [2024-06-15 21:17:50,801][1653645] Updated weights for policy 0, policy_version 759056 (0.0013) [2024-06-15 21:17:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42052.1, 300 sec: 44431.2). Total num frames: 1554546688. Throughput: 0: 10741.7. Samples: 388672000. 
Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:50,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:17:55,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43691.0, 300 sec: 44431.2). Total num frames: 1554644992. Throughput: 0: 10604.1. Samples: 388734464. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:17:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:17:57,055][1653645] Updated weights for policy 0, policy_version 759109 (0.0013) [2024-06-15 21:17:58,444][1653645] Updated weights for policy 0, policy_version 759168 (0.0011) [2024-06-15 21:17:59,906][1653645] Updated weights for policy 0, policy_version 759226 (0.0010) [2024-06-15 21:18:00,960][1648982] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 1554939904. Throughput: 0: 10831.7. Samples: 388805632. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:18:00,961][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:18:01,744][1653645] Updated weights for policy 0, policy_version 759280 (0.0012) [2024-06-15 21:18:05,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43694.0, 300 sec: 44431.2). Total num frames: 1555169280. Throughput: 0: 10604.1. Samples: 388831744. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:18:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:18:08,863][1653645] Updated weights for policy 0, policy_version 759362 (0.0013) [2024-06-15 21:18:10,457][1653645] Updated weights for policy 0, policy_version 759425 (0.0021) [2024-06-15 21:18:10,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 44320.1). Total num frames: 1555333120. Throughput: 0: 10922.7. Samples: 388908544. 
Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:18:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:18:13,692][1653645] Updated weights for policy 0, policy_version 759505 (0.0012) [2024-06-15 21:18:14,384][1651596] Signal inference workers to stop experience collection... (39450 times) [2024-06-15 21:18:14,429][1653645] InferenceWorker_p0-w0: stopping experience collection (39450 times) [2024-06-15 21:18:14,614][1651596] Signal inference workers to resume experience collection... (39450 times) [2024-06-15 21:18:14,620][1653645] InferenceWorker_p0-w0: resuming experience collection (39450 times) [2024-06-15 21:18:14,983][1653645] Updated weights for policy 0, policy_version 759568 (0.0013) [2024-06-15 21:18:15,958][1648982] Fps is (10 sec: 49150.2, 60 sec: 43144.4, 300 sec: 44320.1). Total num frames: 1555660800. Throughput: 0: 10911.3. Samples: 388966912. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:18:15,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:18:15,985][1653645] Updated weights for policy 0, policy_version 759612 (0.0038) [2024-06-15 21:18:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1555726336. Throughput: 0: 10911.3. Samples: 389003776. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:18:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:18:21,716][1653645] Updated weights for policy 0, policy_version 759680 (0.0011) [2024-06-15 21:18:22,806][1653645] Updated weights for policy 0, policy_version 759740 (0.0014) [2024-06-15 21:18:25,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43144.5, 300 sec: 44320.6). Total num frames: 1556054016. Throughput: 0: 11309.5. Samples: 389078528. 
Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:18:25,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:18:26,230][1653645] Updated weights for policy 0, policy_version 759809 (0.0013) [2024-06-15 21:18:30,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43691.0, 300 sec: 44431.2). Total num frames: 1556217856. Throughput: 0: 10899.9. Samples: 389135872. Policy #0 lag: (min: 15.0, avg: 86.5, max: 271.0) [2024-06-15 21:18:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:18:31,959][1653645] Updated weights for policy 0, policy_version 759873 (0.0015) [2024-06-15 21:18:32,872][1653645] Updated weights for policy 0, policy_version 759928 (0.0033) [2024-06-15 21:18:34,266][1653645] Updated weights for policy 0, policy_version 759991 (0.0107) [2024-06-15 21:18:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 1556480000. Throughput: 0: 11184.4. Samples: 389175296. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:18:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:18:36,720][1653645] Updated weights for policy 0, policy_version 760033 (0.0011) [2024-06-15 21:18:38,644][1653645] Updated weights for policy 0, policy_version 760112 (0.0134) [2024-06-15 21:18:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.8, 300 sec: 44431.4). Total num frames: 1556742144. Throughput: 0: 11161.6. Samples: 389236736. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:18:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:18:44,377][1653645] Updated weights for policy 0, policy_version 760182 (0.0075) [2024-06-15 21:18:45,216][1653645] Updated weights for policy 0, policy_version 760212 (0.0011) [2024-06-15 21:18:45,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 46421.4, 300 sec: 44431.2). Total num frames: 1556971520. Throughput: 0: 11207.2. Samples: 389309952. 
Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:18:45,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:18:46,044][1653645] Updated weights for policy 0, policy_version 760251 (0.0013) [2024-06-15 21:18:48,398][1653645] Updated weights for policy 0, policy_version 760304 (0.0013) [2024-06-15 21:18:50,524][1653645] Updated weights for policy 0, policy_version 760380 (0.0021) [2024-06-15 21:18:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45329.2, 300 sec: 44433.4). Total num frames: 1557266432. Throughput: 0: 11320.9. Samples: 389341184. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:18:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:18:55,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 44782.7, 300 sec: 44431.2). Total num frames: 1557331968. Throughput: 0: 11127.4. Samples: 389409280. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:18:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:18:56,282][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000760432_1557364736.pth... [2024-06-15 21:18:56,334][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000755264_1546780672.pth [2024-06-15 21:18:56,535][1653645] Updated weights for policy 0, policy_version 760442 (0.0020) [2024-06-15 21:18:58,119][1653645] Updated weights for policy 0, policy_version 760503 (0.0013) [2024-06-15 21:18:59,713][1651596] Signal inference workers to stop experience collection... (39500 times) [2024-06-15 21:18:59,766][1653645] InferenceWorker_p0-w0: stopping experience collection (39500 times) [2024-06-15 21:18:59,983][1651596] Signal inference workers to resume experience collection... 
(39500 times) [2024-06-15 21:18:59,984][1653645] InferenceWorker_p0-w0: resuming experience collection (39500 times) [2024-06-15 21:19:00,800][1653645] Updated weights for policy 0, policy_version 760551 (0.0015) [2024-06-15 21:19:00,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 1557594112. Throughput: 0: 11241.3. Samples: 389472768. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:19:02,600][1653645] Updated weights for policy 0, policy_version 760612 (0.0013) [2024-06-15 21:19:05,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1557790720. Throughput: 0: 10968.2. Samples: 389497344. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:19:07,913][1653645] Updated weights for policy 0, policy_version 760672 (0.0015) [2024-06-15 21:19:10,380][1653645] Updated weights for policy 0, policy_version 760752 (0.0108) [2024-06-15 21:19:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1558052864. Throughput: 0: 10899.9. Samples: 389569024. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:19:13,833][1653645] Updated weights for policy 0, policy_version 760817 (0.0013) [2024-06-15 21:19:15,391][1653645] Updated weights for policy 0, policy_version 760885 (0.0013) [2024-06-15 21:19:15,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1558315008. Throughput: 0: 10865.7. Samples: 389624832. 
Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:19:19,810][1653645] Updated weights for policy 0, policy_version 760929 (0.0014) [2024-06-15 21:19:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1558446080. Throughput: 0: 10934.1. Samples: 389667328. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:19:21,866][1653645] Updated weights for policy 0, policy_version 760981 (0.0014) [2024-06-15 21:19:22,864][1653645] Updated weights for policy 0, policy_version 761023 (0.0032) [2024-06-15 21:19:25,722][1653645] Updated weights for policy 0, policy_version 761072 (0.0014) [2024-06-15 21:19:25,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1558675456. Throughput: 0: 10990.9. Samples: 389731328. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:25,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:19:27,386][1653645] Updated weights for policy 0, policy_version 761151 (0.0011) [2024-06-15 21:19:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1558839296. Throughput: 0: 10945.4. Samples: 389802496. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:19:32,206][1653645] Updated weights for policy 0, policy_version 761213 (0.0019) [2024-06-15 21:19:34,420][1653645] Updated weights for policy 0, policy_version 761271 (0.0012) [2024-06-15 21:19:35,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1559101440. Throughput: 0: 10968.2. Samples: 389834752. 
Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:35,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:19:37,484][1653645] Updated weights for policy 0, policy_version 761331 (0.0013) [2024-06-15 21:19:39,059][1653645] Updated weights for policy 0, policy_version 761403 (0.0024) [2024-06-15 21:19:40,960][1648982] Fps is (10 sec: 52416.6, 60 sec: 43689.0, 300 sec: 44430.8). Total num frames: 1559363584. Throughput: 0: 10740.1. Samples: 389892608. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:40,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:19:44,565][1653645] Updated weights for policy 0, policy_version 761468 (0.0012) [2024-06-15 21:19:44,709][1651596] Signal inference workers to stop experience collection... (39550 times) [2024-06-15 21:19:44,771][1653645] InferenceWorker_p0-w0: stopping experience collection (39550 times) [2024-06-15 21:19:44,959][1651596] Signal inference workers to resume experience collection... (39550 times) [2024-06-15 21:19:44,960][1653645] InferenceWorker_p0-w0: resuming experience collection (39550 times) [2024-06-15 21:19:45,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1559592960. Throughput: 0: 10956.8. Samples: 389965824. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:45,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:19:46,112][1653645] Updated weights for policy 0, policy_version 761535 (0.0083) [2024-06-15 21:19:49,962][1653645] Updated weights for policy 0, policy_version 761616 (0.0063) [2024-06-15 21:19:50,958][1648982] Fps is (10 sec: 49163.6, 60 sec: 43144.6, 300 sec: 44320.1). Total num frames: 1559855104. Throughput: 0: 11127.5. Samples: 389998080. 
Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:19:51,119][1653645] Updated weights for policy 0, policy_version 761662 (0.0039) [2024-06-15 21:19:55,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44237.0, 300 sec: 43875.8). Total num frames: 1559986176. Throughput: 0: 10922.7. Samples: 390060544. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:19:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:19:56,084][1653645] Updated weights for policy 0, policy_version 761717 (0.0013) [2024-06-15 21:19:57,503][1653645] Updated weights for policy 0, policy_version 761745 (0.0011) [2024-06-15 21:19:58,307][1653645] Updated weights for policy 0, policy_version 761790 (0.0012) [2024-06-15 21:20:00,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 44236.6, 300 sec: 44097.9). Total num frames: 1560248320. Throughput: 0: 11207.1. Samples: 390129152. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:20:00,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 21:20:01,351][1653645] Updated weights for policy 0, policy_version 761861 (0.0134) [2024-06-15 21:20:05,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1560412160. Throughput: 0: 10865.8. Samples: 390156288. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:20:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:20:06,892][1653645] Updated weights for policy 0, policy_version 761921 (0.0114) [2024-06-15 21:20:08,248][1653645] Updated weights for policy 0, policy_version 761978 (0.0011) [2024-06-15 21:20:09,739][1653645] Updated weights for policy 0, policy_version 762043 (0.0013) [2024-06-15 21:20:10,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1560674304. Throughput: 0: 10991.0. Samples: 390225920. 
Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:20:10,959][1648982] Avg episode reward: [(0, '37.550')] [2024-06-15 21:20:13,052][1653645] Updated weights for policy 0, policy_version 762081 (0.0013) [2024-06-15 21:20:14,968][1653645] Updated weights for policy 0, policy_version 762147 (0.0013) [2024-06-15 21:20:15,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 1560936448. Throughput: 0: 10717.9. Samples: 390284800. Policy #0 lag: (min: 11.0, avg: 90.9, max: 267.0) [2024-06-15 21:20:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:20:20,366][1653645] Updated weights for policy 0, policy_version 762208 (0.0012) [2024-06-15 21:20:20,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 1561034752. Throughput: 0: 10820.3. Samples: 390321664. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:20:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:20:21,860][1653645] Updated weights for policy 0, policy_version 762280 (0.0015) [2024-06-15 21:20:24,607][1653645] Updated weights for policy 0, policy_version 762339 (0.0030) [2024-06-15 21:20:25,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44783.1, 300 sec: 44098.0). Total num frames: 1561362432. Throughput: 0: 11150.8. Samples: 390394368. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:20:25,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:20:26,604][1653645] Updated weights for policy 0, policy_version 762421 (0.0013) [2024-06-15 21:20:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1561460736. Throughput: 0: 10854.4. Samples: 390454272. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:20:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:20:31,245][1651596] Signal inference workers to stop experience collection... 
(39600 times) [2024-06-15 21:20:31,285][1653645] InferenceWorker_p0-w0: stopping experience collection (39600 times) [2024-06-15 21:20:31,435][1651596] Signal inference workers to resume experience collection... (39600 times) [2024-06-15 21:20:31,436][1653645] InferenceWorker_p0-w0: resuming experience collection (39600 times) [2024-06-15 21:20:32,191][1653645] Updated weights for policy 0, policy_version 762484 (0.0011) [2024-06-15 21:20:33,655][1653645] Updated weights for policy 0, policy_version 762533 (0.0013) [2024-06-15 21:20:35,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1561755648. Throughput: 0: 10945.4. Samples: 390490624. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:20:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:20:36,170][1653645] Updated weights for policy 0, policy_version 762596 (0.0015) [2024-06-15 21:20:37,985][1653645] Updated weights for policy 0, policy_version 762665 (0.0012) [2024-06-15 21:20:40,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43692.3, 300 sec: 44209.0). Total num frames: 1561985024. Throughput: 0: 11036.4. Samples: 390557184. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:20:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:20:42,956][1653645] Updated weights for policy 0, policy_version 762704 (0.0015) [2024-06-15 21:20:44,992][1653645] Updated weights for policy 0, policy_version 762770 (0.0012) [2024-06-15 21:20:45,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1562247168. Throughput: 0: 11047.9. Samples: 390626304. 
Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:20:45,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:20:47,028][1653645] Updated weights for policy 0, policy_version 762832 (0.0012) [2024-06-15 21:20:48,299][1653645] Updated weights for policy 0, policy_version 762880 (0.0094) [2024-06-15 21:20:50,222][1653645] Updated weights for policy 0, policy_version 762944 (0.0013) [2024-06-15 21:20:50,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 44236.6, 300 sec: 44431.1). Total num frames: 1562509312. Throughput: 0: 11138.8. Samples: 390657536. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:20:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:20:55,958][1648982] Fps is (10 sec: 32766.8, 60 sec: 43144.2, 300 sec: 43542.5). Total num frames: 1562574848. Throughput: 0: 11036.4. Samples: 390722560. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:20:55,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:20:56,349][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000763008_1562640384.pth... [2024-06-15 21:20:56,390][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000757824_1552023552.pth [2024-06-15 21:20:56,943][1653645] Updated weights for policy 0, policy_version 763009 (0.0012) [2024-06-15 21:20:58,322][1653645] Updated weights for policy 0, policy_version 763066 (0.0011) [2024-06-15 21:21:00,040][1653645] Updated weights for policy 0, policy_version 763120 (0.0012) [2024-06-15 21:21:00,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 1562902528. Throughput: 0: 11059.2. Samples: 390782464. 
Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:00,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:21:01,246][1653645] Updated weights for policy 0, policy_version 763155 (0.0012) [2024-06-15 21:21:02,083][1653645] Updated weights for policy 0, policy_version 763199 (0.0012) [2024-06-15 21:21:05,962][1648982] Fps is (10 sec: 45855.4, 60 sec: 43687.3, 300 sec: 43986.2). Total num frames: 1563033600. Throughput: 0: 11046.7. Samples: 390818816. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:05,963][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:21:08,251][1653645] Updated weights for policy 0, policy_version 763258 (0.0023) [2024-06-15 21:21:10,433][1653645] Updated weights for policy 0, policy_version 763323 (0.0013) [2024-06-15 21:21:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1563295744. Throughput: 0: 10956.7. Samples: 390887424. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:10,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:21:11,930][1653645] Updated weights for policy 0, policy_version 763384 (0.0121) [2024-06-15 21:21:13,594][1653645] Updated weights for policy 0, policy_version 763429 (0.0012) [2024-06-15 21:21:15,958][1648982] Fps is (10 sec: 52453.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1563557888. Throughput: 0: 10990.9. Samples: 390948864. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:21:19,145][1651596] Signal inference workers to stop experience collection... (39650 times) [2024-06-15 21:21:19,181][1653645] InferenceWorker_p0-w0: stopping experience collection (39650 times) [2024-06-15 21:21:19,504][1651596] Signal inference workers to resume experience collection... 
(39650 times) [2024-06-15 21:21:19,504][1653645] InferenceWorker_p0-w0: resuming experience collection (39650 times) [2024-06-15 21:21:19,665][1653645] Updated weights for policy 0, policy_version 763491 (0.0012) [2024-06-15 21:21:20,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 44236.8, 300 sec: 43542.6). Total num frames: 1563688960. Throughput: 0: 11002.3. Samples: 390985728. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:21:21,074][1653645] Updated weights for policy 0, policy_version 763521 (0.0016) [2024-06-15 21:21:22,797][1653645] Updated weights for policy 0, policy_version 763600 (0.0013) [2024-06-15 21:21:23,689][1653645] Updated weights for policy 0, policy_version 763642 (0.0024) [2024-06-15 21:21:25,663][1653645] Updated weights for policy 0, policy_version 763712 (0.0013) [2024-06-15 21:21:25,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1564082176. Throughput: 0: 10990.9. Samples: 391051776. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:21:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 43653.6). Total num frames: 1564114944. Throughput: 0: 11013.7. Samples: 391121920. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:21:33,337][1653645] Updated weights for policy 0, policy_version 763783 (0.0015) [2024-06-15 21:21:34,519][1653645] Updated weights for policy 0, policy_version 763838 (0.0014) [2024-06-15 21:21:35,745][1653645] Updated weights for policy 0, policy_version 763888 (0.0011) [2024-06-15 21:21:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 1564475392. Throughput: 0: 11025.1. Samples: 391153664. 
Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:21:37,577][1653645] Updated weights for policy 0, policy_version 763962 (0.0045) [2024-06-15 21:21:40,958][1648982] Fps is (10 sec: 49151.0, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 1564606464. Throughput: 0: 10911.3. Samples: 391213568. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:21:43,479][1653645] Updated weights for policy 0, policy_version 764028 (0.0013) [2024-06-15 21:21:45,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 1564803072. Throughput: 0: 11195.8. Samples: 391286272. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:21:46,371][1653645] Updated weights for policy 0, policy_version 764080 (0.0010) [2024-06-15 21:21:48,639][1653645] Updated weights for policy 0, policy_version 764161 (0.0127) [2024-06-15 21:21:50,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1565130752. Throughput: 0: 10866.9. Samples: 391307776. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:21:55,084][1653645] Updated weights for policy 0, policy_version 764227 (0.0011) [2024-06-15 21:21:55,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.9, 300 sec: 43542.6). Total num frames: 1565196288. Throughput: 0: 10865.8. Samples: 391376384. 
Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:21:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:21:56,249][1653645] Updated weights for policy 0, policy_version 764284 (0.0009) [2024-06-15 21:21:59,789][1653645] Updated weights for policy 0, policy_version 764369 (0.0105) [2024-06-15 21:22:00,905][1653645] Updated weights for policy 0, policy_version 764421 (0.0013) [2024-06-15 21:22:00,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.7, 300 sec: 43987.5). Total num frames: 1565523968. Throughput: 0: 10843.0. Samples: 391436800. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:22:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:22:01,521][1651596] Signal inference workers to stop experience collection... (39700 times) [2024-06-15 21:22:01,548][1653645] InferenceWorker_p0-w0: stopping experience collection (39700 times) [2024-06-15 21:22:01,687][1651596] Signal inference workers to resume experience collection... (39700 times) [2024-06-15 21:22:01,695][1653645] InferenceWorker_p0-w0: resuming experience collection (39700 times) [2024-06-15 21:22:05,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43694.1, 300 sec: 43653.6). Total num frames: 1565655040. Throughput: 0: 10808.9. Samples: 391472128. Policy #0 lag: (min: 15.0, avg: 103.7, max: 271.0) [2024-06-15 21:22:05,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:22:06,572][1653645] Updated weights for policy 0, policy_version 764496 (0.0014) [2024-06-15 21:22:07,810][1653645] Updated weights for policy 0, policy_version 764544 (0.0013) [2024-06-15 21:22:10,958][1648982] Fps is (10 sec: 36043.5, 60 sec: 43144.3, 300 sec: 43431.4). Total num frames: 1565884416. Throughput: 0: 10933.9. Samples: 391543808. 
Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:10,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:22:11,847][1653645] Updated weights for policy 0, policy_version 764627 (0.0012) [2024-06-15 21:22:12,957][1653645] Updated weights for policy 0, policy_version 764688 (0.0076) [2024-06-15 21:22:15,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1566179328. Throughput: 0: 10763.3. Samples: 391606272. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:15,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 21:22:17,904][1653645] Updated weights for policy 0, policy_version 764737 (0.0015) [2024-06-15 21:22:20,958][1648982] Fps is (10 sec: 42600.6, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1566310400. Throughput: 0: 10808.9. Samples: 391640064. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:22:21,408][1653645] Updated weights for policy 0, policy_version 764802 (0.0044) [2024-06-15 21:22:22,662][1653645] Updated weights for policy 0, policy_version 764858 (0.0014) [2024-06-15 21:22:23,834][1653645] Updated weights for policy 0, policy_version 764913 (0.0138) [2024-06-15 21:22:25,240][1653645] Updated weights for policy 0, policy_version 764976 (0.0012) [2024-06-15 21:22:25,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1566703616. Throughput: 0: 11082.0. Samples: 391712256. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:22:29,775][1653645] Updated weights for policy 0, policy_version 765024 (0.0016) [2024-06-15 21:22:30,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.0, 300 sec: 44098.0). Total num frames: 1566834688. Throughput: 0: 10831.7. Samples: 391773696. 
Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:22:33,275][1653645] Updated weights for policy 0, policy_version 765074 (0.0013) [2024-06-15 21:22:34,685][1653645] Updated weights for policy 0, policy_version 765136 (0.0013) [2024-06-15 21:22:35,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1567096832. Throughput: 0: 11298.1. Samples: 391816192. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:22:36,860][1653645] Updated weights for policy 0, policy_version 765232 (0.0011) [2024-06-15 21:22:40,961][1648982] Fps is (10 sec: 39309.8, 60 sec: 43688.6, 300 sec: 44208.6). Total num frames: 1567227904. Throughput: 0: 11047.1. Samples: 391873536. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:40,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:22:41,866][1653645] Updated weights for policy 0, policy_version 765266 (0.0014) [2024-06-15 21:22:45,586][1653645] Updated weights for policy 0, policy_version 765315 (0.0013) [2024-06-15 21:22:45,958][1648982] Fps is (10 sec: 29492.0, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 1567391744. Throughput: 0: 11207.2. Samples: 391941120. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:45,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:22:46,474][1653645] Updated weights for policy 0, policy_version 765364 (0.0013) [2024-06-15 21:22:48,089][1651596] Signal inference workers to stop experience collection... (39750 times) [2024-06-15 21:22:48,130][1653645] InferenceWorker_p0-w0: stopping experience collection (39750 times) [2024-06-15 21:22:48,362][1651596] Signal inference workers to resume experience collection... 
(39750 times) [2024-06-15 21:22:48,363][1653645] InferenceWorker_p0-w0: resuming experience collection (39750 times) [2024-06-15 21:22:48,365][1653645] Updated weights for policy 0, policy_version 765440 (0.0012) [2024-06-15 21:22:50,958][1648982] Fps is (10 sec: 52443.1, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1567752192. Throughput: 0: 11116.0. Samples: 391972352. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:50,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:22:53,732][1653645] Updated weights for policy 0, policy_version 765520 (0.0013) [2024-06-15 21:22:55,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 44782.9, 300 sec: 43875.8). Total num frames: 1567883264. Throughput: 0: 10945.5. Samples: 392036352. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:22:55,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:22:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000765568_1567883264.pth... [2024-06-15 21:22:56,053][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000760432_1557364736.pth [2024-06-15 21:22:57,870][1653645] Updated weights for policy 0, policy_version 765571 (0.0023) [2024-06-15 21:22:59,650][1653645] Updated weights for policy 0, policy_version 765648 (0.0033) [2024-06-15 21:23:00,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 1568145408. Throughput: 0: 10968.1. Samples: 392099840. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:00,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:23:01,848][1653645] Updated weights for policy 0, policy_version 765730 (0.0016) [2024-06-15 21:23:05,649][1653645] Updated weights for policy 0, policy_version 765792 (0.0018) [2024-06-15 21:23:05,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44782.8, 300 sec: 44097.9). Total num frames: 1568342016. Throughput: 0: 10865.7. 
Samples: 392129024. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:23:10,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43144.7, 300 sec: 43431.5). Total num frames: 1568473088. Throughput: 0: 10763.3. Samples: 392196608. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:10,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:23:11,322][1653645] Updated weights for policy 0, policy_version 765872 (0.0012) [2024-06-15 21:23:13,710][1653645] Updated weights for policy 0, policy_version 765952 (0.0190) [2024-06-15 21:23:15,222][1653645] Updated weights for policy 0, policy_version 766008 (0.0013) [2024-06-15 21:23:15,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 1568800768. Throughput: 0: 10604.1. Samples: 392250880. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:23:19,227][1653645] Updated weights for policy 0, policy_version 766073 (0.0015) [2024-06-15 21:23:20,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.5, 300 sec: 43653.6). Total num frames: 1568931840. Throughput: 0: 10467.5. Samples: 392287232. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:20,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:23:23,438][1653645] Updated weights for policy 0, policy_version 766113 (0.0012) [2024-06-15 21:23:24,127][1653645] Updated weights for policy 0, policy_version 766144 (0.0012) [2024-06-15 21:23:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 1569193984. Throughput: 0: 10684.4. Samples: 392354304. 
Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:23:26,145][1653645] Updated weights for policy 0, policy_version 766215 (0.0011) [2024-06-15 21:23:27,316][1653645] Updated weights for policy 0, policy_version 766265 (0.0010) [2024-06-15 21:23:30,276][1653645] Updated weights for policy 0, policy_version 766304 (0.0023) [2024-06-15 21:23:30,958][1648982] Fps is (10 sec: 52430.7, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1569456128. Throughput: 0: 10615.5. Samples: 392418816. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:23:34,493][1653645] Updated weights for policy 0, policy_version 766353 (0.0029) [2024-06-15 21:23:35,394][1653645] Updated weights for policy 0, policy_version 766398 (0.0013) [2024-06-15 21:23:35,898][1651596] Signal inference workers to stop experience collection... (39800 times) [2024-06-15 21:23:35,935][1653645] InferenceWorker_p0-w0: stopping experience collection (39800 times) [2024-06-15 21:23:35,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 43542.5). Total num frames: 1569587200. Throughput: 0: 10786.2. Samples: 392457728. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:23:36,129][1651596] Signal inference workers to resume experience collection... (39800 times) [2024-06-15 21:23:36,130][1653645] InferenceWorker_p0-w0: resuming experience collection (39800 times) [2024-06-15 21:23:37,062][1653645] Updated weights for policy 0, policy_version 766449 (0.0109) [2024-06-15 21:23:38,564][1653645] Updated weights for policy 0, policy_version 766512 (0.0015) [2024-06-15 21:23:40,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 43692.8, 300 sec: 43653.6). Total num frames: 1569849344. Throughput: 0: 10717.9. Samples: 392518656. 
Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:23:42,528][1653645] Updated weights for policy 0, policy_version 766576 (0.0013) [2024-06-15 21:23:45,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.6, 300 sec: 43209.3). Total num frames: 1570013184. Throughput: 0: 10945.5. Samples: 392592384. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:23:46,482][1653645] Updated weights for policy 0, policy_version 766640 (0.0013) [2024-06-15 21:23:47,984][1653645] Updated weights for policy 0, policy_version 766688 (0.0012) [2024-06-15 21:23:49,237][1653645] Updated weights for policy 0, policy_version 766736 (0.0016) [2024-06-15 21:23:50,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1570373632. Throughput: 0: 10968.2. Samples: 392622592. Policy #0 lag: (min: 15.0, avg: 101.6, max: 271.0) [2024-06-15 21:23:50,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:23:53,648][1653645] Updated weights for policy 0, policy_version 766816 (0.0018) [2024-06-15 21:23:55,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1570504704. Throughput: 0: 11002.4. Samples: 392691712. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:23:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:23:58,047][1653645] Updated weights for policy 0, policy_version 766883 (0.0013) [2024-06-15 21:23:59,395][1653645] Updated weights for policy 0, policy_version 766916 (0.0011) [2024-06-15 21:24:00,940][1653645] Updated weights for policy 0, policy_version 766994 (0.0107) [2024-06-15 21:24:00,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 1570799616. Throughput: 0: 11263.9. Samples: 392757760. 
Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:24:04,469][1653645] Updated weights for policy 0, policy_version 767045 (0.0016) [2024-06-15 21:24:05,695][1653645] Updated weights for policy 0, policy_version 767100 (0.0012) [2024-06-15 21:24:05,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 44783.1, 300 sec: 43986.9). Total num frames: 1571028992. Throughput: 0: 11207.2. Samples: 392791552. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:24:10,012][1653645] Updated weights for policy 0, policy_version 767164 (0.0011) [2024-06-15 21:24:10,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44782.9, 300 sec: 43542.5). Total num frames: 1571160064. Throughput: 0: 11252.6. Samples: 392860672. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:10,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:24:12,147][1653645] Updated weights for policy 0, policy_version 767216 (0.0012) [2024-06-15 21:24:13,704][1653645] Updated weights for policy 0, policy_version 767280 (0.0083) [2024-06-15 21:24:15,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1571422208. Throughput: 0: 11355.0. Samples: 392929792. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:24:16,992][1653645] Updated weights for policy 0, policy_version 767352 (0.0016) [2024-06-15 21:24:20,958][1648982] Fps is (10 sec: 42599.6, 60 sec: 44237.0, 300 sec: 43764.8). Total num frames: 1571586048. Throughput: 0: 11184.4. Samples: 392961024. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:24:20,974][1651596] Signal inference workers to stop experience collection... 
(39850 times) [2024-06-15 21:24:21,046][1653645] InferenceWorker_p0-w0: stopping experience collection (39850 times) [2024-06-15 21:24:21,270][1651596] Signal inference workers to resume experience collection... (39850 times) [2024-06-15 21:24:21,271][1653645] InferenceWorker_p0-w0: resuming experience collection (39850 times) [2024-06-15 21:24:21,868][1653645] Updated weights for policy 0, policy_version 767416 (0.0016) [2024-06-15 21:24:24,474][1653645] Updated weights for policy 0, policy_version 767480 (0.0013) [2024-06-15 21:24:25,285][1653645] Updated weights for policy 0, policy_version 767508 (0.0030) [2024-06-15 21:24:25,958][1648982] Fps is (10 sec: 49151.0, 60 sec: 45328.9, 300 sec: 44320.1). Total num frames: 1571913728. Throughput: 0: 11172.9. Samples: 393021440. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:25,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:24:26,094][1653645] Updated weights for policy 0, policy_version 767550 (0.0013) [2024-06-15 21:24:28,682][1653645] Updated weights for policy 0, policy_version 767611 (0.0016) [2024-06-15 21:24:30,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1572077568. Throughput: 0: 11264.0. Samples: 393099264. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:24:33,035][1653645] Updated weights for policy 0, policy_version 767675 (0.0012) [2024-06-15 21:24:35,589][1653645] Updated weights for policy 0, policy_version 767728 (0.0013) [2024-06-15 21:24:35,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 45329.2, 300 sec: 43876.1). Total num frames: 1572306944. Throughput: 0: 11332.3. Samples: 393132544. 
Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:24:37,417][1653645] Updated weights for policy 0, policy_version 767796 (0.0039) [2024-06-15 21:24:40,567][1653645] Updated weights for policy 0, policy_version 767847 (0.0011) [2024-06-15 21:24:40,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1572569088. Throughput: 0: 11184.4. Samples: 393195008. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:24:44,805][1653645] Updated weights for policy 0, policy_version 767920 (0.0013) [2024-06-15 21:24:45,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 45328.9, 300 sec: 43653.6). Total num frames: 1572732928. Throughput: 0: 11127.5. Samples: 393258496. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:45,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:24:47,612][1653645] Updated weights for policy 0, policy_version 767971 (0.0014) [2024-06-15 21:24:48,903][1653645] Updated weights for policy 0, policy_version 768019 (0.0012) [2024-06-15 21:24:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.8, 300 sec: 44097.9). Total num frames: 1572995072. Throughput: 0: 11070.6. Samples: 393289728. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:24:51,792][1653645] Updated weights for policy 0, policy_version 768065 (0.0016) [2024-06-15 21:24:52,964][1653645] Updated weights for policy 0, policy_version 768115 (0.0014) [2024-06-15 21:24:55,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.6, 300 sec: 43653.7). Total num frames: 1573126144. Throughput: 0: 11025.1. Samples: 393356800. 
Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:24:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:24:55,992][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000768128_1573126144.pth... [2024-06-15 21:24:56,076][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000763008_1562640384.pth [2024-06-15 21:24:56,957][1653645] Updated weights for policy 0, policy_version 768149 (0.0010) [2024-06-15 21:24:59,508][1653645] Updated weights for policy 0, policy_version 768213 (0.0013) [2024-06-15 21:25:00,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.9, 300 sec: 44098.0). Total num frames: 1573421056. Throughput: 0: 10831.7. Samples: 393417216. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:25:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:25:01,660][1653645] Updated weights for policy 0, policy_version 768304 (0.0106) [2024-06-15 21:25:04,732][1651596] Signal inference workers to stop experience collection... (39900 times) [2024-06-15 21:25:04,800][1653645] InferenceWorker_p0-w0: stopping experience collection (39900 times) [2024-06-15 21:25:05,006][1651596] Signal inference workers to resume experience collection... (39900 times) [2024-06-15 21:25:05,007][1653645] InferenceWorker_p0-w0: resuming experience collection (39900 times) [2024-06-15 21:25:05,010][1653645] Updated weights for policy 0, policy_version 768368 (0.0014) [2024-06-15 21:25:05,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1573650432. Throughput: 0: 10820.3. Samples: 393447936. 
Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:25:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:25:09,330][1653645] Updated weights for policy 0, policy_version 768401 (0.0011) [2024-06-15 21:25:10,289][1653645] Updated weights for policy 0, policy_version 768443 (0.0011) [2024-06-15 21:25:10,958][1648982] Fps is (10 sec: 36043.7, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1573781504. Throughput: 0: 10968.2. Samples: 393515008. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:25:10,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:25:12,449][1653645] Updated weights for policy 0, policy_version 768502 (0.0134) [2024-06-15 21:25:14,136][1653645] Updated weights for policy 0, policy_version 768576 (0.0081) [2024-06-15 21:25:15,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1574043648. Throughput: 0: 10570.0. Samples: 393574912. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:25:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:25:20,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 43431.4). Total num frames: 1574174720. Throughput: 0: 10592.6. Samples: 393609216. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:25:20,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:25:21,157][1653645] Updated weights for policy 0, policy_version 768646 (0.0012) [2024-06-15 21:25:22,392][1653645] Updated weights for policy 0, policy_version 768700 (0.0013) [2024-06-15 21:25:24,916][1653645] Updated weights for policy 0, policy_version 768767 (0.0011) [2024-06-15 21:25:25,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 44098.0). Total num frames: 1574469632. Throughput: 0: 10706.5. Samples: 393676800. 
Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:25:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:25:28,185][1653645] Updated weights for policy 0, policy_version 768836 (0.0012) [2024-06-15 21:25:30,958][1648982] Fps is (10 sec: 52430.4, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1574699008. Throughput: 0: 10604.1. Samples: 393735680. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:25:30,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:25:33,412][1653645] Updated weights for policy 0, policy_version 768900 (0.0012) [2024-06-15 21:25:35,480][1653645] Updated weights for policy 0, policy_version 768962 (0.0031) [2024-06-15 21:25:35,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 43653.6). Total num frames: 1574862848. Throughput: 0: 10854.4. Samples: 393778176. Policy #0 lag: (min: 14.0, avg: 122.2, max: 270.0) [2024-06-15 21:25:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:25:38,312][1653645] Updated weights for policy 0, policy_version 769040 (0.0013) [2024-06-15 21:25:40,100][1653645] Updated weights for policy 0, policy_version 769104 (0.0152) [2024-06-15 21:25:40,958][1648982] Fps is (10 sec: 49150.4, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 1575190528. Throughput: 0: 10524.4. Samples: 393830400. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:25:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:25:41,081][1653645] Updated weights for policy 0, policy_version 769146 (0.0061) [2024-06-15 21:25:45,328][1653645] Updated weights for policy 0, policy_version 769184 (0.0013) [2024-06-15 21:25:45,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1575354368. Throughput: 0: 10979.6. Samples: 393911296. 
Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:25:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:25:48,585][1653645] Updated weights for policy 0, policy_version 769268 (0.0013) [2024-06-15 21:25:50,951][1653645] Updated weights for policy 0, policy_version 769335 (0.0011) [2024-06-15 21:25:50,957][1648982] Fps is (10 sec: 39323.5, 60 sec: 43144.7, 300 sec: 44098.0). Total num frames: 1575583744. Throughput: 0: 10956.8. Samples: 393940992. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:25:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:25:51,679][1651596] Signal inference workers to stop experience collection... (39950 times) [2024-06-15 21:25:51,765][1653645] InferenceWorker_p0-w0: stopping experience collection (39950 times) [2024-06-15 21:25:51,947][1651596] Signal inference workers to resume experience collection... (39950 times) [2024-06-15 21:25:51,949][1653645] InferenceWorker_p0-w0: resuming experience collection (39950 times) [2024-06-15 21:25:51,951][1653645] Updated weights for policy 0, policy_version 769376 (0.0012) [2024-06-15 21:25:52,860][1653645] Updated weights for policy 0, policy_version 769407 (0.0015) [2024-06-15 21:25:55,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1575747584. Throughput: 0: 10888.5. Samples: 394004992. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:25:55,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:25:57,765][1653645] Updated weights for policy 0, policy_version 769472 (0.0088) [2024-06-15 21:26:00,480][1653645] Updated weights for policy 0, policy_version 769536 (0.0013) [2024-06-15 21:26:00,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 43987.6). Total num frames: 1576009728. Throughput: 0: 10968.2. Samples: 394068480. 
Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:26:02,974][1653645] Updated weights for policy 0, policy_version 769599 (0.0012) [2024-06-15 21:26:05,110][1653645] Updated weights for policy 0, policy_version 769661 (0.0013) [2024-06-15 21:26:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.4, 300 sec: 43986.9). Total num frames: 1576271872. Throughput: 0: 10968.2. Samples: 394102784. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:05,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:26:09,591][1653645] Updated weights for policy 0, policy_version 769712 (0.0015) [2024-06-15 21:26:10,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44237.0, 300 sec: 43653.6). Total num frames: 1576435712. Throughput: 0: 11002.3. Samples: 394171904. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:26:11,796][1653645] Updated weights for policy 0, policy_version 769784 (0.0125) [2024-06-15 21:26:14,385][1653645] Updated weights for policy 0, policy_version 769825 (0.0013) [2024-06-15 21:26:15,939][1653645] Updated weights for policy 0, policy_version 769872 (0.0011) [2024-06-15 21:26:15,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 44236.7, 300 sec: 44097.9). Total num frames: 1576697856. Throughput: 0: 11081.9. Samples: 394234368. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:26:17,102][1653645] Updated weights for policy 0, policy_version 769920 (0.0013) [2024-06-15 21:26:20,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44783.2, 300 sec: 43320.4). Total num frames: 1576861696. Throughput: 0: 10820.3. Samples: 394265088. 
Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:26:21,241][1653645] Updated weights for policy 0, policy_version 769977 (0.0015) [2024-06-15 21:26:23,828][1653645] Updated weights for policy 0, policy_version 770047 (0.0013) [2024-06-15 21:26:25,958][1648982] Fps is (10 sec: 36045.4, 60 sec: 43144.6, 300 sec: 43875.8). Total num frames: 1577058304. Throughput: 0: 11195.8. Samples: 394334208. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:26:27,656][1653645] Updated weights for policy 0, policy_version 770128 (0.0081) [2024-06-15 21:26:30,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 1577320448. Throughput: 0: 10877.1. Samples: 394400768. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:30,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:26:32,448][1653645] Updated weights for policy 0, policy_version 770192 (0.0013) [2024-06-15 21:26:34,707][1653645] Updated weights for policy 0, policy_version 770241 (0.0017) [2024-06-15 21:26:35,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44236.7, 300 sec: 43764.7). Total num frames: 1577517056. Throughput: 0: 10922.6. Samples: 394432512. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:26:36,334][1653645] Updated weights for policy 0, policy_version 770302 (0.0015) [2024-06-15 21:26:38,911][1653645] Updated weights for policy 0, policy_version 770357 (0.0012) [2024-06-15 21:26:39,516][1651596] Signal inference workers to stop experience collection... (40000 times) [2024-06-15 21:26:39,547][1653645] InferenceWorker_p0-w0: stopping experience collection (40000 times) [2024-06-15 21:26:39,813][1651596] Signal inference workers to resume experience collection... 
(40000 times) [2024-06-15 21:26:39,814][1653645] InferenceWorker_p0-w0: resuming experience collection (40000 times) [2024-06-15 21:26:40,380][1653645] Updated weights for policy 0, policy_version 770427 (0.0015) [2024-06-15 21:26:40,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 44237.1, 300 sec: 44209.1). Total num frames: 1577844736. Throughput: 0: 10945.5. Samples: 394497536. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:40,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:26:45,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 42052.3, 300 sec: 43209.3). Total num frames: 1577877504. Throughput: 0: 11047.8. Samples: 394565632. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:45,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:26:46,689][1653645] Updated weights for policy 0, policy_version 770489 (0.0101) [2024-06-15 21:26:48,496][1653645] Updated weights for policy 0, policy_version 770560 (0.0023) [2024-06-15 21:26:50,958][1648982] Fps is (10 sec: 32767.1, 60 sec: 43144.3, 300 sec: 43986.8). Total num frames: 1578172416. Throughput: 0: 10797.6. Samples: 394588672. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:26:51,648][1653645] Updated weights for policy 0, policy_version 770640 (0.0027) [2024-06-15 21:26:55,958][1648982] Fps is (10 sec: 49150.1, 60 sec: 43690.7, 300 sec: 43542.5). Total num frames: 1578369024. Throughput: 0: 10763.3. Samples: 394656256. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:26:55,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:26:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000770688_1578369024.pth... 
[2024-06-15 21:26:56,049][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000765568_1567883264.pth [2024-06-15 21:26:57,963][1653645] Updated weights for policy 0, policy_version 770709 (0.0015) [2024-06-15 21:26:58,749][1653645] Updated weights for policy 0, policy_version 770752 (0.0014) [2024-06-15 21:27:00,209][1653645] Updated weights for policy 0, policy_version 770804 (0.0024) [2024-06-15 21:27:00,962][1648982] Fps is (10 sec: 45855.4, 60 sec: 43687.4, 300 sec: 43986.2). Total num frames: 1578631168. Throughput: 0: 10978.5. Samples: 394728448. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:27:00,963][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:27:01,436][1653645] Updated weights for policy 0, policy_version 770848 (0.0034) [2024-06-15 21:27:04,167][1653645] Updated weights for policy 0, policy_version 770935 (0.0017) [2024-06-15 21:27:05,961][1648982] Fps is (10 sec: 52415.3, 60 sec: 43688.8, 300 sec: 44097.6). Total num frames: 1578893312. Throughput: 0: 11012.9. Samples: 394760704. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:27:05,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:27:09,628][1653645] Updated weights for policy 0, policy_version 770976 (0.0012) [2024-06-15 21:27:10,958][1648982] Fps is (10 sec: 39339.5, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 1579024384. Throughput: 0: 11002.3. Samples: 394829312. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:27:10,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:27:11,598][1653645] Updated weights for policy 0, policy_version 771040 (0.0019) [2024-06-15 21:27:14,065][1653645] Updated weights for policy 0, policy_version 771104 (0.0013) [2024-06-15 21:27:15,958][1648982] Fps is (10 sec: 42611.0, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1579319296. Throughput: 0: 10911.3. Samples: 394891776. 
Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:27:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:27:15,961][1653645] Updated weights for policy 0, policy_version 771168 (0.0025) [2024-06-15 21:27:20,737][1653645] Updated weights for policy 0, policy_version 771202 (0.0013) [2024-06-15 21:27:20,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 43209.3). Total num frames: 1579450368. Throughput: 0: 10899.9. Samples: 394923008. Policy #0 lag: (min: 15.0, avg: 137.0, max: 271.0) [2024-06-15 21:27:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:27:23,335][1653645] Updated weights for policy 0, policy_version 771280 (0.0042) [2024-06-15 21:27:25,293][1653645] Updated weights for policy 0, policy_version 771331 (0.0016) [2024-06-15 21:27:25,965][1648982] Fps is (10 sec: 42568.6, 60 sec: 44777.7, 300 sec: 43763.7). Total num frames: 1579745280. Throughput: 0: 11000.6. Samples: 394992640. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:27:25,965][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:27:26,580][1653645] Updated weights for policy 0, policy_version 771389 (0.0023) [2024-06-15 21:27:28,042][1651596] Signal inference workers to stop experience collection... (40050 times) [2024-06-15 21:27:28,091][1653645] InferenceWorker_p0-w0: stopping experience collection (40050 times) [2024-06-15 21:27:28,337][1651596] Signal inference workers to resume experience collection... (40050 times) [2024-06-15 21:27:28,338][1653645] InferenceWorker_p0-w0: resuming experience collection (40050 times) [2024-06-15 21:27:28,580][1653645] Updated weights for policy 0, policy_version 771451 (0.0012) [2024-06-15 21:27:30,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1579941888. Throughput: 0: 10877.1. Samples: 395055104. 
Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:27:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:27:33,649][1653645] Updated weights for policy 0, policy_version 771504 (0.0011) [2024-06-15 21:27:35,595][1653645] Updated weights for policy 0, policy_version 771552 (0.0015) [2024-06-15 21:27:35,960][1648982] Fps is (10 sec: 39348.8, 60 sec: 43690.7, 300 sec: 43765.2). Total num frames: 1580138496. Throughput: 0: 11104.7. Samples: 395088384. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:27:35,961][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:27:38,003][1653645] Updated weights for policy 0, policy_version 771601 (0.0024) [2024-06-15 21:27:38,824][1653645] Updated weights for policy 0, policy_version 771648 (0.0013) [2024-06-15 21:27:40,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43144.5, 300 sec: 44209.0). Total num frames: 1580433408. Throughput: 0: 11104.8. Samples: 395155968. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:27:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:27:41,089][1653645] Updated weights for policy 0, policy_version 771712 (0.0012) [2024-06-15 21:27:45,261][1653645] Updated weights for policy 0, policy_version 771776 (0.0014) [2024-06-15 21:27:45,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45329.1, 300 sec: 43542.6). Total num frames: 1580597248. Throughput: 0: 10923.8. Samples: 395219968. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:27:45,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:27:48,237][1653645] Updated weights for policy 0, policy_version 771840 (0.0012) [2024-06-15 21:27:50,956][1653645] Updated weights for policy 0, policy_version 771904 (0.0011) [2024-06-15 21:27:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44783.1, 300 sec: 43986.9). Total num frames: 1580859392. Throughput: 0: 10912.0. Samples: 395251712. 
Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:27:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:27:53,316][1653645] Updated weights for policy 0, policy_version 771967 (0.0012) [2024-06-15 21:27:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44237.0, 300 sec: 43653.7). Total num frames: 1581023232. Throughput: 0: 10831.6. Samples: 395316736. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:27:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:27:56,753][1653645] Updated weights for policy 0, policy_version 772032 (0.0101) [2024-06-15 21:27:59,133][1653645] Updated weights for policy 0, policy_version 772067 (0.0014) [2024-06-15 21:28:00,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43694.0, 300 sec: 43764.8). Total num frames: 1581252608. Throughput: 0: 11116.1. Samples: 395392000. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:28:01,547][1653645] Updated weights for policy 0, policy_version 772112 (0.0012) [2024-06-15 21:28:04,262][1653645] Updated weights for policy 0, policy_version 772193 (0.0012) [2024-06-15 21:28:05,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43692.7, 300 sec: 44209.1). Total num frames: 1581514752. Throughput: 0: 11093.3. Samples: 395422208. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:05,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:28:07,539][1653645] Updated weights for policy 0, policy_version 772258 (0.0013) [2024-06-15 21:28:08,099][1653645] Updated weights for policy 0, policy_version 772288 (0.0045) [2024-06-15 21:28:10,958][1648982] Fps is (10 sec: 52426.8, 60 sec: 45874.9, 300 sec: 43986.8). Total num frames: 1581776896. Throughput: 0: 11106.3. Samples: 395492352. 
Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:28:13,021][1653645] Updated weights for policy 0, policy_version 772357 (0.0015) [2024-06-15 21:28:14,350][1653645] Updated weights for policy 0, policy_version 772416 (0.0015) [2024-06-15 21:28:15,768][1653645] Updated weights for policy 0, policy_version 772476 (0.0017) [2024-06-15 21:28:15,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1582039040. Throughput: 0: 11241.2. Samples: 395560960. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:28:18,466][1651596] Signal inference workers to stop experience collection... (40100 times) [2024-06-15 21:28:18,515][1653645] InferenceWorker_p0-w0: stopping experience collection (40100 times) [2024-06-15 21:28:18,734][1651596] Signal inference workers to resume experience collection... (40100 times) [2024-06-15 21:28:18,734][1653645] InferenceWorker_p0-w0: resuming experience collection (40100 times) [2024-06-15 21:28:19,184][1653645] Updated weights for policy 0, policy_version 772528 (0.0014) [2024-06-15 21:28:20,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 1582170112. Throughput: 0: 11355.0. Samples: 395599360. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:28:22,039][1653645] Updated weights for policy 0, policy_version 772603 (0.0109) [2024-06-15 21:28:25,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 44242.0, 300 sec: 43875.8). Total num frames: 1582399488. Throughput: 0: 11298.2. Samples: 395664384. 
Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:28:26,122][1653645] Updated weights for policy 0, policy_version 772662 (0.0014) [2024-06-15 21:28:27,661][1653645] Updated weights for policy 0, policy_version 772725 (0.0011) [2024-06-15 21:28:30,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1582694400. Throughput: 0: 11366.4. Samples: 395731456. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:28:32,998][1653645] Updated weights for policy 0, policy_version 772806 (0.0098) [2024-06-15 21:28:35,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44782.9, 300 sec: 43986.9). Total num frames: 1582825472. Throughput: 0: 11411.9. Samples: 395765248. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:28:36,616][1653645] Updated weights for policy 0, policy_version 772867 (0.0012) [2024-06-15 21:28:37,961][1653645] Updated weights for policy 0, policy_version 772922 (0.0133) [2024-06-15 21:28:39,199][1653645] Updated weights for policy 0, policy_version 772984 (0.0009) [2024-06-15 21:28:40,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1583087616. Throughput: 0: 11343.6. Samples: 395827200. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:28:42,063][1653645] Updated weights for policy 0, policy_version 773032 (0.0012) [2024-06-15 21:28:45,305][1653645] Updated weights for policy 0, policy_version 773088 (0.0012) [2024-06-15 21:28:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 1583349760. Throughput: 0: 11218.5. Samples: 395896832. 
Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:45,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:28:49,238][1653645] Updated weights for policy 0, policy_version 773168 (0.0012) [2024-06-15 21:28:50,957][1648982] Fps is (10 sec: 45876.2, 60 sec: 44783.1, 300 sec: 44209.1). Total num frames: 1583546368. Throughput: 0: 11321.0. Samples: 395931648. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:28:51,173][1653645] Updated weights for policy 0, policy_version 773240 (0.0012) [2024-06-15 21:28:53,566][1653645] Updated weights for policy 0, policy_version 773281 (0.0011) [2024-06-15 21:28:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.0, 300 sec: 43875.8). Total num frames: 1583742976. Throughput: 0: 11355.1. Samples: 396003328. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:28:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:28:55,992][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000773312_1583742976.pth... [2024-06-15 21:28:56,110][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000768128_1573126144.pth [2024-06-15 21:28:56,288][1653645] Updated weights for policy 0, policy_version 773315 (0.0012) [2024-06-15 21:29:00,050][1653645] Updated weights for policy 0, policy_version 773380 (0.0016) [2024-06-15 21:29:00,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 1583939584. Throughput: 0: 11343.6. Samples: 396071424. 
Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:29:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:29:01,125][1653645] Updated weights for policy 0, policy_version 773428 (0.0015) [2024-06-15 21:29:02,746][1653645] Updated weights for policy 0, policy_version 773504 (0.0013) [2024-06-15 21:29:03,595][1651596] Signal inference workers to stop experience collection... (40150 times) [2024-06-15 21:29:03,684][1653645] InferenceWorker_p0-w0: stopping experience collection (40150 times) [2024-06-15 21:29:03,860][1651596] Signal inference workers to resume experience collection... (40150 times) [2024-06-15 21:29:03,861][1653645] InferenceWorker_p0-w0: resuming experience collection (40150 times) [2024-06-15 21:29:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45875.3, 300 sec: 44431.2). Total num frames: 1584267264. Throughput: 0: 11207.1. Samples: 396103680. Policy #0 lag: (min: 31.0, avg: 134.7, max: 287.0) [2024-06-15 21:29:05,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 21:29:07,958][1653645] Updated weights for policy 0, policy_version 773584 (0.0012) [2024-06-15 21:29:09,026][1653645] Updated weights for policy 0, policy_version 773632 (0.0013) [2024-06-15 21:29:10,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 44783.1, 300 sec: 44209.0). Total num frames: 1584463872. Throughput: 0: 11457.4. Samples: 396179968. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:29:12,567][1653645] Updated weights for policy 0, policy_version 773718 (0.0013) [2024-06-15 21:29:13,319][1653645] Updated weights for policy 0, policy_version 773753 (0.0012) [2024-06-15 21:29:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.0, 300 sec: 44653.3). Total num frames: 1584758784. Throughput: 0: 11411.9. Samples: 396244992. 
Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:15,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:29:16,018][1653645] Updated weights for policy 0, policy_version 773816 (0.0101) [2024-06-15 21:29:19,904][1653645] Updated weights for policy 0, policy_version 773861 (0.0018) [2024-06-15 21:29:20,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 45875.3, 300 sec: 44098.0). Total num frames: 1584922624. Throughput: 0: 11582.6. Samples: 396286464. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:29:22,845][1653645] Updated weights for policy 0, policy_version 773920 (0.0014) [2024-06-15 21:29:25,160][1653645] Updated weights for policy 0, policy_version 774016 (0.0122) [2024-06-15 21:29:25,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 1585184768. Throughput: 0: 11446.1. Samples: 396342272. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:29:28,276][1653645] Updated weights for policy 0, policy_version 774072 (0.0054) [2024-06-15 21:29:30,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1585315840. Throughput: 0: 11616.7. Samples: 396419584. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:29:32,243][1653645] Updated weights for policy 0, policy_version 774137 (0.0128) [2024-06-15 21:29:35,854][1653645] Updated weights for policy 0, policy_version 774224 (0.0130) [2024-06-15 21:29:35,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 46421.3, 300 sec: 44209.0). Total num frames: 1585610752. Throughput: 0: 11582.5. Samples: 396452864. 
Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:29:38,436][1653645] Updated weights for policy 0, policy_version 774288 (0.0017) [2024-06-15 21:29:40,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 45875.0, 300 sec: 44431.2). Total num frames: 1585840128. Throughput: 0: 11172.9. Samples: 396506112. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:40,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:29:43,653][1653645] Updated weights for policy 0, policy_version 774342 (0.0014) [2024-06-15 21:29:44,714][1653645] Updated weights for policy 0, policy_version 774400 (0.0043) [2024-06-15 21:29:45,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1586003968. Throughput: 0: 11491.6. Samples: 396588544. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:29:47,258][1653645] Updated weights for policy 0, policy_version 774480 (0.0112) [2024-06-15 21:29:47,841][1651596] Signal inference workers to stop experience collection... (40200 times) [2024-06-15 21:29:47,987][1653645] InferenceWorker_p0-w0: stopping experience collection (40200 times) [2024-06-15 21:29:48,222][1651596] Signal inference workers to resume experience collection... (40200 times) [2024-06-15 21:29:48,223][1653645] InferenceWorker_p0-w0: resuming experience collection (40200 times) [2024-06-15 21:29:49,839][1653645] Updated weights for policy 0, policy_version 774530 (0.0035) [2024-06-15 21:29:50,958][1648982] Fps is (10 sec: 49153.3, 60 sec: 46421.2, 300 sec: 44764.4). Total num frames: 1586331648. Throughput: 0: 11264.0. Samples: 396610560. 
Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:29:51,065][1653645] Updated weights for policy 0, policy_version 774586 (0.0014) [2024-06-15 21:29:55,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1586364416. Throughput: 0: 11229.9. Samples: 396685312. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:29:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:29:57,183][1653645] Updated weights for policy 0, policy_version 774657 (0.0015) [2024-06-15 21:29:59,232][1653645] Updated weights for policy 0, policy_version 774736 (0.0108) [2024-06-15 21:30:00,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 46967.5, 300 sec: 44431.2). Total num frames: 1586757632. Throughput: 0: 10899.9. Samples: 396735488. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:30:02,528][1653645] Updated weights for policy 0, policy_version 774791 (0.0014) [2024-06-15 21:30:03,739][1653645] Updated weights for policy 0, policy_version 774845 (0.0022) [2024-06-15 21:30:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1586888704. Throughput: 0: 10683.7. Samples: 396767232. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:30:10,329][1653645] Updated weights for policy 0, policy_version 774901 (0.0012) [2024-06-15 21:30:10,959][1648982] Fps is (10 sec: 26214.3, 60 sec: 42598.5, 300 sec: 43986.9). Total num frames: 1587019776. Throughput: 0: 11116.1. Samples: 396842496. 
Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:10,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:30:12,024][1653645] Updated weights for policy 0, policy_version 774976 (0.0115) [2024-06-15 21:30:13,425][1653645] Updated weights for policy 0, policy_version 775036 (0.0029) [2024-06-15 21:30:15,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 44653.4). Total num frames: 1587347456. Throughput: 0: 10626.8. Samples: 396897792. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:30:16,376][1653645] Updated weights for policy 0, policy_version 775100 (0.0125) [2024-06-15 21:30:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 43875.8). Total num frames: 1587412992. Throughput: 0: 10638.2. Samples: 396931584. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:30:22,353][1653645] Updated weights for policy 0, policy_version 775154 (0.0014) [2024-06-15 21:30:24,412][1653645] Updated weights for policy 0, policy_version 775248 (0.0013) [2024-06-15 21:30:25,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1587806208. Throughput: 0: 10877.2. Samples: 396995584. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:25,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:30:27,271][1653645] Updated weights for policy 0, policy_version 775314 (0.0013) [2024-06-15 21:30:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44320.1). Total num frames: 1587937280. Throughput: 0: 10478.9. Samples: 397060096. 
Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:30:34,497][1653645] Updated weights for policy 0, policy_version 775392 (0.0013) [2024-06-15 21:30:35,399][1651596] Signal inference workers to stop experience collection... (40250 times) [2024-06-15 21:30:35,440][1653645] InferenceWorker_p0-w0: stopping experience collection (40250 times) [2024-06-15 21:30:35,773][1651596] Signal inference workers to resume experience collection... (40250 times) [2024-06-15 21:30:35,782][1653645] InferenceWorker_p0-w0: resuming experience collection (40250 times) [2024-06-15 21:30:35,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 41506.1, 300 sec: 43764.8). Total num frames: 1588101120. Throughput: 0: 10888.5. Samples: 397100544. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:35,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:30:37,632][1653645] Updated weights for policy 0, policy_version 775520 (0.0013) [2024-06-15 21:30:40,417][1653645] Updated weights for policy 0, policy_version 775592 (0.0014) [2024-06-15 21:30:40,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 43144.7, 300 sec: 44320.1). Total num frames: 1588428800. Throughput: 0: 10251.4. Samples: 397146624. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:30:45,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 40959.9, 300 sec: 43653.6). Total num frames: 1588461568. Throughput: 0: 10831.6. Samples: 397222912. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:45,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:30:46,450][1653645] Updated weights for policy 0, policy_version 775650 (0.0012) [2024-06-15 21:30:48,361][1653645] Updated weights for policy 0, policy_version 775735 (0.0013) [2024-06-15 21:30:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 44431.2). 
Total num frames: 1588854784. Throughput: 0: 10706.5. Samples: 397249024. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:50,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:30:51,780][1653645] Updated weights for policy 0, policy_version 775809 (0.0095) [2024-06-15 21:30:55,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1588985856. Throughput: 0: 10467.5. Samples: 397313536. Policy #0 lag: (min: 59.0, avg: 164.9, max: 315.0) [2024-06-15 21:30:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:30:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000775872_1588985856.pth... [2024-06-15 21:30:56,040][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000770688_1578369024.pth [2024-06-15 21:30:56,044][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000775872_1588985856.pth [2024-06-15 21:30:57,601][1653645] Updated weights for policy 0, policy_version 775875 (0.0012) [2024-06-15 21:30:59,014][1653645] Updated weights for policy 0, policy_version 775936 (0.0014) [2024-06-15 21:31:00,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 44098.0). Total num frames: 1589280768. Throughput: 0: 10877.1. Samples: 397387264. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:31:01,126][1653645] Updated weights for policy 0, policy_version 776018 (0.0013) [2024-06-15 21:31:03,823][1653645] Updated weights for policy 0, policy_version 776096 (0.0014) [2024-06-15 21:31:05,958][1648982] Fps is (10 sec: 52430.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1589510144. Throughput: 0: 10695.1. Samples: 397412864. 
Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:31:09,431][1653645] Updated weights for policy 0, policy_version 776154 (0.0011) [2024-06-15 21:31:10,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1589673984. Throughput: 0: 11013.7. Samples: 397491200. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:31:11,214][1653645] Updated weights for policy 0, policy_version 776224 (0.0013) [2024-06-15 21:31:13,599][1653645] Updated weights for policy 0, policy_version 776317 (0.0013) [2024-06-15 21:31:15,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 1589968896. Throughput: 0: 10854.3. Samples: 397548544. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:15,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:31:16,291][1653645] Updated weights for policy 0, policy_version 776378 (0.0013) [2024-06-15 21:31:20,144][1651596] Signal inference workers to stop experience collection... (40300 times) [2024-06-15 21:31:20,202][1653645] InferenceWorker_p0-w0: stopping experience collection (40300 times) [2024-06-15 21:31:20,350][1651596] Signal inference workers to resume experience collection... (40300 times) [2024-06-15 21:31:20,351][1653645] InferenceWorker_p0-w0: resuming experience collection (40300 times) [2024-06-15 21:31:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 1590099968. Throughput: 0: 10820.3. Samples: 397587456. 
Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:31:21,100][1653645] Updated weights for policy 0, policy_version 776438 (0.0012) [2024-06-15 21:31:23,162][1653645] Updated weights for policy 0, policy_version 776483 (0.0013) [2024-06-15 21:31:25,292][1653645] Updated weights for policy 0, policy_version 776567 (0.0089) [2024-06-15 21:31:25,957][1648982] Fps is (10 sec: 45877.2, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1590427648. Throughput: 0: 11298.2. Samples: 397655040. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:31:27,197][1653645] Updated weights for policy 0, policy_version 776608 (0.0012) [2024-06-15 21:31:30,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1590558720. Throughput: 0: 11173.0. Samples: 397725696. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:31:31,651][1653645] Updated weights for policy 0, policy_version 776656 (0.0023) [2024-06-15 21:31:32,718][1653645] Updated weights for policy 0, policy_version 776704 (0.0013) [2024-06-15 21:31:35,409][1653645] Updated weights for policy 0, policy_version 776787 (0.0013) [2024-06-15 21:31:35,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 46421.4, 300 sec: 44209.0). Total num frames: 1590886400. Throughput: 0: 11434.7. Samples: 397763584. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:31:38,718][1653645] Updated weights for policy 0, policy_version 776833 (0.0012) [2024-06-15 21:31:40,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 44236.7, 300 sec: 44764.4). Total num frames: 1591083008. Throughput: 0: 11195.8. Samples: 397817344. 
Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:31:44,952][1653645] Updated weights for policy 0, policy_version 776912 (0.0015) [2024-06-15 21:31:45,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 45329.2, 300 sec: 44098.0). Total num frames: 1591181312. Throughput: 0: 11150.2. Samples: 397889024. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:31:46,794][1653645] Updated weights for policy 0, policy_version 776978 (0.0012) [2024-06-15 21:31:48,667][1653645] Updated weights for policy 0, policy_version 777045 (0.0022) [2024-06-15 21:31:50,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1591476224. Throughput: 0: 11104.7. Samples: 397912576. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:31:51,200][1653645] Updated weights for policy 0, policy_version 777094 (0.0012) [2024-06-15 21:31:52,202][1653645] Updated weights for policy 0, policy_version 777148 (0.0013) [2024-06-15 21:31:55,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 43690.7, 300 sec: 43987.5). Total num frames: 1591607296. Throughput: 0: 10899.8. Samples: 397981696. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:31:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:31:58,041][1653645] Updated weights for policy 0, policy_version 777202 (0.0018) [2024-06-15 21:31:59,702][1653645] Updated weights for policy 0, policy_version 777268 (0.0099) [2024-06-15 21:32:00,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44236.9, 300 sec: 44209.5). Total num frames: 1591934976. Throughput: 0: 11070.7. Samples: 398046720. 
Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:32:01,071][1651596] Signal inference workers to stop experience collection... (40350 times) [2024-06-15 21:32:01,150][1653645] InferenceWorker_p0-w0: stopping experience collection (40350 times) [2024-06-15 21:32:01,153][1653645] Updated weights for policy 0, policy_version 777314 (0.0012) [2024-06-15 21:32:01,372][1651596] Signal inference workers to resume experience collection... (40350 times) [2024-06-15 21:32:01,373][1653645] InferenceWorker_p0-w0: resuming experience collection (40350 times) [2024-06-15 21:32:03,864][1653645] Updated weights for policy 0, policy_version 777399 (0.0015) [2024-06-15 21:32:05,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1592131584. Throughput: 0: 10808.9. Samples: 398073856. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:32:09,476][1653645] Updated weights for policy 0, policy_version 777456 (0.0104) [2024-06-15 21:32:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 44236.8, 300 sec: 44097.9). Total num frames: 1592328192. Throughput: 0: 11093.3. Samples: 398154240. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:32:11,470][1653645] Updated weights for policy 0, policy_version 777536 (0.0134) [2024-06-15 21:32:12,939][1653645] Updated weights for policy 0, policy_version 777595 (0.0019) [2024-06-15 21:32:15,960][1648982] Fps is (10 sec: 49151.8, 60 sec: 44237.0, 300 sec: 44653.3). Total num frames: 1592623104. Throughput: 0: 10672.3. Samples: 398205952. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:15,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:32:20,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 42598.3, 300 sec: 43765.7). 
Total num frames: 1592655872. Throughput: 0: 10547.2. Samples: 398238208. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:32:21,074][1653645] Updated weights for policy 0, policy_version 777668 (0.0039) [2024-06-15 21:32:24,082][1653645] Updated weights for policy 0, policy_version 777781 (0.0013) [2024-06-15 21:32:25,547][1653645] Updated weights for policy 0, policy_version 777856 (0.0125) [2024-06-15 21:32:25,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1593049088. Throughput: 0: 10820.3. Samples: 398304256. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:32:27,906][1653645] Updated weights for policy 0, policy_version 777914 (0.0015) [2024-06-15 21:32:30,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1593180160. Throughput: 0: 10604.1. Samples: 398366208. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:32:34,351][1653645] Updated weights for policy 0, policy_version 777978 (0.0014) [2024-06-15 21:32:35,814][1653645] Updated weights for policy 0, policy_version 778048 (0.0080) [2024-06-15 21:32:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 44098.0). Total num frames: 1593442304. Throughput: 0: 11093.4. Samples: 398411776. Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:32:37,015][1653645] Updated weights for policy 0, policy_version 778106 (0.0015) [2024-06-15 21:32:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1593704448. Throughput: 0: 10843.1. Samples: 398469632. 
Policy #0 lag: (min: 15.0, avg: 72.1, max: 271.0) [2024-06-15 21:32:40,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:32:45,366][1653645] Updated weights for policy 0, policy_version 778208 (0.0014) [2024-06-15 21:32:45,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1593802752. Throughput: 0: 11116.1. Samples: 398546944. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:32:45,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:32:46,787][1653645] Updated weights for policy 0, policy_version 778261 (0.0015) [2024-06-15 21:32:47,077][1651596] Signal inference workers to stop experience collection... (40400 times) [2024-06-15 21:32:47,166][1653645] InferenceWorker_p0-w0: stopping experience collection (40400 times) [2024-06-15 21:32:47,287][1651596] Signal inference workers to resume experience collection... (40400 times) [2024-06-15 21:32:47,288][1653645] InferenceWorker_p0-w0: resuming experience collection (40400 times) [2024-06-15 21:32:48,518][1653645] Updated weights for policy 0, policy_version 778339 (0.0014) [2024-06-15 21:32:49,792][1653645] Updated weights for policy 0, policy_version 778401 (0.0012) [2024-06-15 21:32:50,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1594228736. Throughput: 0: 11195.7. Samples: 398577664. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:32:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:32:55,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1594228736. Throughput: 0: 11059.2. Samples: 398651904. 
Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:32:55,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 21:32:56,284][1653645] Updated weights for policy 0, policy_version 778449 (0.0013) [2024-06-15 21:32:56,543][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000778464_1594294272.pth... [2024-06-15 21:32:56,671][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000773312_1583742976.pth [2024-06-15 21:32:58,046][1653645] Updated weights for policy 0, policy_version 778529 (0.0010) [2024-06-15 21:32:59,291][1653645] Updated weights for policy 0, policy_version 778596 (0.0014) [2024-06-15 21:33:00,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 46421.3, 300 sec: 44764.4). Total num frames: 1594720256. Throughput: 0: 11366.4. Samples: 398717440. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:33:01,036][1653645] Updated weights for policy 0, policy_version 778680 (0.0014) [2024-06-15 21:33:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1594753024. Throughput: 0: 11446.0. Samples: 398753280. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:33:08,167][1653645] Updated weights for policy 0, policy_version 778739 (0.0012) [2024-06-15 21:33:09,512][1653645] Updated weights for policy 0, policy_version 778813 (0.0013) [2024-06-15 21:33:10,838][1653645] Updated weights for policy 0, policy_version 778864 (0.0015) [2024-06-15 21:33:10,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 46421.2, 300 sec: 44320.1). Total num frames: 1595113472. Throughput: 0: 11593.9. Samples: 398825984. 
Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:33:12,940][1653645] Updated weights for policy 0, policy_version 778942 (0.0014) [2024-06-15 21:33:15,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1595277312. Throughput: 0: 11628.1. Samples: 398889472. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:33:19,604][1653645] Updated weights for policy 0, policy_version 778992 (0.0013) [2024-06-15 21:33:20,930][1653645] Updated weights for policy 0, policy_version 779056 (0.0012) [2024-06-15 21:33:20,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 47513.6, 300 sec: 44431.2). Total num frames: 1595506688. Throughput: 0: 11525.7. Samples: 398930432. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:33:22,075][1653645] Updated weights for policy 0, policy_version 779104 (0.0059) [2024-06-15 21:33:23,535][1651596] Signal inference workers to stop experience collection... (40450 times) [2024-06-15 21:33:23,588][1653645] InferenceWorker_p0-w0: stopping experience collection (40450 times) [2024-06-15 21:33:23,759][1651596] Signal inference workers to resume experience collection... (40450 times) [2024-06-15 21:33:23,767][1653645] InferenceWorker_p0-w0: resuming experience collection (40450 times) [2024-06-15 21:33:23,769][1653645] Updated weights for policy 0, policy_version 779168 (0.0012) [2024-06-15 21:33:25,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1595801600. Throughput: 0: 11616.7. Samples: 398992384. 
Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:33:29,308][1653645] Updated weights for policy 0, policy_version 779206 (0.0015) [2024-06-15 21:33:30,915][1653645] Updated weights for policy 0, policy_version 779296 (0.0013) [2024-06-15 21:33:30,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 46967.5, 300 sec: 44653.4). Total num frames: 1595998208. Throughput: 0: 11650.9. Samples: 399071232. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:33:33,132][1653645] Updated weights for policy 0, policy_version 779360 (0.0017) [2024-06-15 21:33:35,397][1653645] Updated weights for policy 0, policy_version 779427 (0.0013) [2024-06-15 21:33:35,957][1648982] Fps is (10 sec: 49153.5, 60 sec: 47513.7, 300 sec: 44764.4). Total num frames: 1596293120. Throughput: 0: 11650.9. Samples: 399101952. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:35,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:33:40,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1596325888. Throughput: 0: 11673.6. Samples: 399177216. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:33:41,164][1653645] Updated weights for policy 0, policy_version 779472 (0.0144) [2024-06-15 21:33:42,853][1653645] Updated weights for policy 0, policy_version 779552 (0.0156) [2024-06-15 21:33:45,186][1653645] Updated weights for policy 0, policy_version 779620 (0.0012) [2024-06-15 21:33:45,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 48605.8, 300 sec: 44653.3). Total num frames: 1596719104. Throughput: 0: 11468.8. Samples: 399233536. 
Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:45,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:33:46,580][1653645] Updated weights for policy 0, policy_version 779664 (0.0013) [2024-06-15 21:33:47,774][1653645] Updated weights for policy 0, policy_version 779710 (0.0011) [2024-06-15 21:33:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1596850176. Throughput: 0: 11355.0. Samples: 399264256. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:33:53,367][1653645] Updated weights for policy 0, policy_version 779762 (0.0012) [2024-06-15 21:33:54,769][1653645] Updated weights for policy 0, policy_version 779835 (0.0012) [2024-06-15 21:33:55,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 48605.9, 300 sec: 44764.4). Total num frames: 1597145088. Throughput: 0: 11548.5. Samples: 399345664. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:33:55,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:33:56,805][1653645] Updated weights for policy 0, policy_version 779902 (0.0013) [2024-06-15 21:33:59,008][1653645] Updated weights for policy 0, policy_version 779968 (0.0012) [2024-06-15 21:34:00,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1597374464. Throughput: 0: 11571.2. Samples: 399410176. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:34:00,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:34:04,964][1653645] Updated weights for policy 0, policy_version 780048 (0.0013) [2024-06-15 21:34:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 47513.6, 300 sec: 44542.3). Total num frames: 1597603840. Throughput: 0: 11685.0. Samples: 399456256. 
Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:34:05,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:34:06,374][1653645] Updated weights for policy 0, policy_version 780096 (0.0012) [2024-06-15 21:34:07,859][1651596] Signal inference workers to stop experience collection... (40500 times) [2024-06-15 21:34:07,906][1653645] InferenceWorker_p0-w0: stopping experience collection (40500 times) [2024-06-15 21:34:08,212][1651596] Signal inference workers to resume experience collection... (40500 times) [2024-06-15 21:34:08,213][1653645] InferenceWorker_p0-w0: resuming experience collection (40500 times) [2024-06-15 21:34:08,515][1653645] Updated weights for policy 0, policy_version 780155 (0.0013) [2024-06-15 21:34:10,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1597865984. Throughput: 0: 11502.9. Samples: 399510016. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:34:10,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:34:11,088][1653645] Updated weights for policy 0, policy_version 780224 (0.0015) [2024-06-15 21:34:15,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 44782.8, 300 sec: 44209.0). Total num frames: 1597964288. Throughput: 0: 11480.1. Samples: 399587840. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:34:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:34:16,490][1653645] Updated weights for policy 0, policy_version 780275 (0.0122) [2024-06-15 21:34:17,769][1653645] Updated weights for policy 0, policy_version 780325 (0.0013) [2024-06-15 21:34:19,018][1653645] Updated weights for policy 0, policy_version 780368 (0.0022) [2024-06-15 21:34:20,958][1648982] Fps is (10 sec: 42599.7, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 1598291968. Throughput: 0: 11389.1. Samples: 399614464. 
Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:34:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:34:21,269][1653645] Updated weights for policy 0, policy_version 780419 (0.0016) [2024-06-15 21:34:22,830][1653645] Updated weights for policy 0, policy_version 780480 (0.0155) [2024-06-15 21:34:25,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1598423040. Throughput: 0: 11116.1. Samples: 399677440. Policy #0 lag: (min: 15.0, avg: 80.5, max: 271.0) [2024-06-15 21:34:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:34:27,963][1653645] Updated weights for policy 0, policy_version 780530 (0.0013) [2024-06-15 21:34:29,607][1653645] Updated weights for policy 0, policy_version 780592 (0.0031) [2024-06-15 21:34:30,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1598717952. Throughput: 0: 11434.7. Samples: 399748096. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:34:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:34:31,343][1653645] Updated weights for policy 0, policy_version 780640 (0.0136) [2024-06-15 21:34:34,234][1653645] Updated weights for policy 0, policy_version 780707 (0.0039) [2024-06-15 21:34:35,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44236.7, 300 sec: 44431.2). Total num frames: 1598947328. Throughput: 0: 11423.3. Samples: 399778304. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:34:35,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:34:39,795][1653645] Updated weights for policy 0, policy_version 780785 (0.0081) [2024-06-15 21:34:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 1599111168. Throughput: 0: 11241.2. Samples: 399851520. 
Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:34:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:34:42,848][1653645] Updated weights for policy 0, policy_version 780880 (0.0020) [2024-06-15 21:34:44,222][1653645] Updated weights for policy 0, policy_version 780926 (0.0057) [2024-06-15 21:34:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 1599373312. Throughput: 0: 11047.8. Samples: 399907328. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:34:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:34:46,759][1653645] Updated weights for policy 0, policy_version 780987 (0.0013) [2024-06-15 21:34:50,959][1648982] Fps is (10 sec: 39317.8, 60 sec: 44236.1, 300 sec: 44542.1). Total num frames: 1599504384. Throughput: 0: 10808.6. Samples: 399942656. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:34:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:34:51,773][1653645] Updated weights for policy 0, policy_version 781044 (0.0012) [2024-06-15 21:34:53,450][1653645] Updated weights for policy 0, policy_version 781114 (0.0013) [2024-06-15 21:34:54,613][1651596] Signal inference workers to stop experience collection... (40550 times) [2024-06-15 21:34:54,650][1653645] InferenceWorker_p0-w0: stopping experience collection (40550 times) [2024-06-15 21:34:54,970][1651596] Signal inference workers to resume experience collection... (40550 times) [2024-06-15 21:34:54,972][1653645] InferenceWorker_p0-w0: resuming experience collection (40550 times) [2024-06-15 21:34:55,965][1648982] Fps is (10 sec: 45841.2, 60 sec: 44777.4, 300 sec: 44319.0). Total num frames: 1599832064. Throughput: 0: 11159.8. Samples: 400012288. 
Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:34:55,966][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:34:56,082][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000781184_1599864832.pth... [2024-06-15 21:34:56,098][1653645] Updated weights for policy 0, policy_version 781184 (0.0018) [2024-06-15 21:34:56,171][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000775872_1588985856.pth [2024-06-15 21:35:00,958][1648982] Fps is (10 sec: 49157.1, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1599995904. Throughput: 0: 10706.6. Samples: 400069632. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:35:02,901][1653645] Updated weights for policy 0, policy_version 781249 (0.0019) [2024-06-15 21:35:04,605][1653645] Updated weights for policy 0, policy_version 781320 (0.0012) [2024-06-15 21:35:05,926][1653645] Updated weights for policy 0, policy_version 781372 (0.0016) [2024-06-15 21:35:05,960][1648982] Fps is (10 sec: 39340.9, 60 sec: 43688.8, 300 sec: 44764.0). Total num frames: 1600225280. Throughput: 0: 11035.8. Samples: 400111104. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:05,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:35:09,044][1653645] Updated weights for policy 0, policy_version 781441 (0.0012) [2024-06-15 21:35:10,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44237.0, 300 sec: 44653.3). Total num frames: 1600520192. Throughput: 0: 10865.8. Samples: 400166400. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:35:15,170][1653645] Updated weights for policy 0, policy_version 781520 (0.0012) [2024-06-15 21:35:15,958][1648982] Fps is (10 sec: 36054.4, 60 sec: 43690.9, 300 sec: 44653.4). Total num frames: 1600585728. Throughput: 0: 11002.3. 
Samples: 400243200. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:35:16,849][1653645] Updated weights for policy 0, policy_version 781584 (0.0015) [2024-06-15 21:35:18,300][1653645] Updated weights for policy 0, policy_version 781632 (0.0012) [2024-06-15 21:35:20,461][1653645] Updated weights for policy 0, policy_version 781696 (0.0013) [2024-06-15 21:35:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1600946176. Throughput: 0: 10888.5. Samples: 400268288. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:35:21,706][1653645] Updated weights for policy 0, policy_version 781754 (0.0012) [2024-06-15 21:35:25,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1601044480. Throughput: 0: 10865.8. Samples: 400340480. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:35:27,741][1653645] Updated weights for policy 0, policy_version 781795 (0.0012) [2024-06-15 21:35:30,150][1653645] Updated weights for policy 0, policy_version 781877 (0.0013) [2024-06-15 21:35:30,960][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.5, 300 sec: 44764.4). Total num frames: 1601306624. Throughput: 0: 10911.3. Samples: 400398336. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:30,961][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:35:32,814][1653645] Updated weights for policy 0, policy_version 781943 (0.0013) [2024-06-15 21:35:34,359][1653645] Updated weights for policy 0, policy_version 782012 (0.0014) [2024-06-15 21:35:35,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.8, 300 sec: 44542.3). Total num frames: 1601568768. Throughput: 0: 10775.0. Samples: 400427520. 
Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:35:40,716][1651596] Signal inference workers to stop experience collection... (40600 times) [2024-06-15 21:35:40,775][1653645] InferenceWorker_p0-w0: stopping experience collection (40600 times) [2024-06-15 21:35:40,781][1653645] Updated weights for policy 0, policy_version 782068 (0.0012) [2024-06-15 21:35:40,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 42598.3, 300 sec: 44764.4). Total num frames: 1601667072. Throughput: 0: 10844.8. Samples: 400500224. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:35:41,022][1651596] Signal inference workers to resume experience collection... (40600 times) [2024-06-15 21:35:41,022][1653645] InferenceWorker_p0-w0: resuming experience collection (40600 times) [2024-06-15 21:35:42,545][1653645] Updated weights for policy 0, policy_version 782141 (0.0013) [2024-06-15 21:35:45,167][1653645] Updated weights for policy 0, policy_version 782208 (0.0013) [2024-06-15 21:35:45,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1601994752. Throughput: 0: 10877.1. Samples: 400559104. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:45,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:35:46,721][1653645] Updated weights for policy 0, policy_version 782266 (0.0026) [2024-06-15 21:35:50,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43145.2, 300 sec: 44431.2). Total num frames: 1602093056. Throughput: 0: 10650.2. Samples: 400590336. 
Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:35:52,448][1653645] Updated weights for policy 0, policy_version 782310 (0.0017) [2024-06-15 21:35:54,063][1653645] Updated weights for policy 0, policy_version 782385 (0.0013) [2024-06-15 21:35:55,958][1648982] Fps is (10 sec: 39319.7, 60 sec: 42603.3, 300 sec: 44431.1). Total num frames: 1602387968. Throughput: 0: 10979.4. Samples: 400660480. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:35:55,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:35:56,393][1653645] Updated weights for policy 0, policy_version 782438 (0.0013) [2024-06-15 21:35:58,297][1653645] Updated weights for policy 0, policy_version 782516 (0.0147) [2024-06-15 21:36:00,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1602617344. Throughput: 0: 10729.2. Samples: 400726016. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:36:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:36:04,357][1653645] Updated weights for policy 0, policy_version 782580 (0.0110) [2024-06-15 21:36:05,958][1648982] Fps is (10 sec: 49153.9, 60 sec: 44238.6, 300 sec: 44764.4). Total num frames: 1602879488. Throughput: 0: 11025.0. Samples: 400764416. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:36:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:36:07,592][1653645] Updated weights for policy 0, policy_version 782672 (0.0014) [2024-06-15 21:36:09,237][1653645] Updated weights for policy 0, policy_version 782740 (0.0012) [2024-06-15 21:36:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44653.4). Total num frames: 1603141632. Throughput: 0: 10729.3. Samples: 400823296. 
Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:36:10,958][1648982] Avg episode reward: [(0, '36.930')] [2024-06-15 21:36:14,891][1653645] Updated weights for policy 0, policy_version 782788 (0.0023) [2024-06-15 21:36:15,958][1648982] Fps is (10 sec: 36044.3, 60 sec: 44236.5, 300 sec: 44542.2). Total num frames: 1603239936. Throughput: 0: 11104.7. Samples: 400898048. Policy #0 lag: (min: 15.0, avg: 142.1, max: 271.0) [2024-06-15 21:36:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:36:16,337][1653645] Updated weights for policy 0, policy_version 782864 (0.0091) [2024-06-15 21:36:17,273][1653645] Updated weights for policy 0, policy_version 782912 (0.0108) [2024-06-15 21:36:20,178][1651596] Signal inference workers to stop experience collection... (40650 times) [2024-06-15 21:36:20,229][1653645] InferenceWorker_p0-w0: stopping experience collection (40650 times) [2024-06-15 21:36:20,408][1651596] Signal inference workers to resume experience collection... (40650 times) [2024-06-15 21:36:20,410][1653645] InferenceWorker_p0-w0: resuming experience collection (40650 times) [2024-06-15 21:36:20,416][1653645] Updated weights for policy 0, policy_version 782992 (0.0013) [2024-06-15 21:36:20,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1603600384. Throughput: 0: 11275.3. Samples: 400934912. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:36:20,958][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 21:36:21,784][1653645] Updated weights for policy 0, policy_version 783040 (0.0013) [2024-06-15 21:36:25,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1603665920. Throughput: 0: 11059.2. Samples: 400997888. 
Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:36:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:36:28,056][1653645] Updated weights for policy 0, policy_version 783108 (0.0018) [2024-06-15 21:36:29,471][1653645] Updated weights for policy 0, policy_version 783168 (0.0013) [2024-06-15 21:36:30,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1603928064. Throughput: 0: 11082.0. Samples: 401057792. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:36:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:36:33,271][1653645] Updated weights for policy 0, policy_version 783235 (0.0014) [2024-06-15 21:36:34,591][1653645] Updated weights for policy 0, policy_version 783296 (0.0016) [2024-06-15 21:36:35,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.4, 300 sec: 44431.2). Total num frames: 1604190208. Throughput: 0: 11036.4. Samples: 401086976. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:36:35,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:36:40,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 1604354048. Throughput: 0: 10956.9. Samples: 401153536. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:36:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:36:40,992][1653645] Updated weights for policy 0, policy_version 783378 (0.0012) [2024-06-15 21:36:41,888][1653645] Updated weights for policy 0, policy_version 783424 (0.0012) [2024-06-15 21:36:44,714][1653645] Updated weights for policy 0, policy_version 783476 (0.0014) [2024-06-15 21:36:45,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1604648960. Throughput: 0: 10979.5. Samples: 401220096. 
Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:36:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:36:46,331][1653645] Updated weights for policy 0, policy_version 783552 (0.0139) [2024-06-15 21:36:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 44653.4). Total num frames: 1604780032. Throughput: 0: 10843.0. Samples: 401252352. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:36:50,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:36:51,584][1653645] Updated weights for policy 0, policy_version 783618 (0.0012) [2024-06-15 21:36:52,951][1653645] Updated weights for policy 0, policy_version 783677 (0.0012) [2024-06-15 21:36:55,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 43144.8, 300 sec: 44209.0). Total num frames: 1604976640. Throughput: 0: 11070.5. Samples: 401321472. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:36:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:36:56,363][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000783712_1605042176.pth... [2024-06-15 21:36:56,524][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000778464_1594294272.pth [2024-06-15 21:36:57,610][1653645] Updated weights for policy 0, policy_version 783760 (0.0014) [2024-06-15 21:37:00,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1605238784. Throughput: 0: 10774.8. Samples: 401382912. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:37:02,344][1653645] Updated weights for policy 0, policy_version 783809 (0.0013) [2024-06-15 21:37:03,720][1653645] Updated weights for policy 0, policy_version 783859 (0.0013) [2024-06-15 21:37:04,949][1651596] Signal inference workers to stop experience collection... 
(40700 times) [2024-06-15 21:37:04,990][1653645] InferenceWorker_p0-w0: stopping experience collection (40700 times) [2024-06-15 21:37:05,207][1651596] Signal inference workers to resume experience collection... (40700 times) [2024-06-15 21:37:05,208][1653645] InferenceWorker_p0-w0: resuming experience collection (40700 times) [2024-06-15 21:37:05,433][1653645] Updated weights for policy 0, policy_version 783930 (0.0012) [2024-06-15 21:37:05,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.8, 300 sec: 44653.3). Total num frames: 1605500928. Throughput: 0: 10774.8. Samples: 401419776. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:37:09,092][1653645] Updated weights for policy 0, policy_version 783984 (0.0016) [2024-06-15 21:37:10,784][1653645] Updated weights for policy 0, policy_version 784048 (0.0164) [2024-06-15 21:37:10,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 1605730304. Throughput: 0: 10774.8. Samples: 401482752. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:37:14,940][1653645] Updated weights for policy 0, policy_version 784096 (0.0041) [2024-06-15 21:37:15,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44237.0, 300 sec: 44875.5). Total num frames: 1605894144. Throughput: 0: 10899.9. Samples: 401548288. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:37:16,667][1653645] Updated weights for policy 0, policy_version 784146 (0.0013) [2024-06-15 21:37:20,185][1653645] Updated weights for policy 0, policy_version 784200 (0.0014) [2024-06-15 21:37:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 44320.1). Total num frames: 1606123520. Throughput: 0: 10956.9. Samples: 401580032. 
Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:37:21,881][1653645] Updated weights for policy 0, policy_version 784272 (0.0031) [2024-06-15 21:37:25,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1606287360. Throughput: 0: 10934.0. Samples: 401645568. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:37:25,979][1653645] Updated weights for policy 0, policy_version 784329 (0.0010) [2024-06-15 21:37:28,854][1653645] Updated weights for policy 0, policy_version 784404 (0.0022) [2024-06-15 21:37:29,734][1653645] Updated weights for policy 0, policy_version 784447 (0.0012) [2024-06-15 21:37:30,958][1648982] Fps is (10 sec: 42596.8, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1606549504. Throughput: 0: 10899.8. Samples: 401710592. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:37:33,727][1653645] Updated weights for policy 0, policy_version 784513 (0.0014) [2024-06-15 21:37:35,012][1653645] Updated weights for policy 0, policy_version 784575 (0.0015) [2024-06-15 21:37:35,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1606811648. Throughput: 0: 10911.3. Samples: 401743360. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:37:39,206][1653645] Updated weights for policy 0, policy_version 784633 (0.0013) [2024-06-15 21:37:40,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 1606975488. Throughput: 0: 10786.1. Samples: 401806848. 
Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:40,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:37:41,595][1653645] Updated weights for policy 0, policy_version 784695 (0.0135) [2024-06-15 21:37:44,795][1653645] Updated weights for policy 0, policy_version 784743 (0.0017) [2024-06-15 21:37:45,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 44098.0). Total num frames: 1607237632. Throughput: 0: 10956.8. Samples: 401875968. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:37:46,719][1653645] Updated weights for policy 0, policy_version 784825 (0.0087) [2024-06-15 21:37:50,918][1653645] Updated weights for policy 0, policy_version 784890 (0.0012) [2024-06-15 21:37:50,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 44236.9, 300 sec: 44764.4). Total num frames: 1607434240. Throughput: 0: 10797.5. Samples: 401905664. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:50,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:37:52,056][1651596] Signal inference workers to stop experience collection... (40750 times) [2024-06-15 21:37:52,118][1653645] InferenceWorker_p0-w0: stopping experience collection (40750 times) [2024-06-15 21:37:52,343][1651596] Signal inference workers to resume experience collection... (40750 times) [2024-06-15 21:37:52,346][1653645] InferenceWorker_p0-w0: resuming experience collection (40750 times) [2024-06-15 21:37:53,043][1653645] Updated weights for policy 0, policy_version 784946 (0.0013) [2024-06-15 21:37:55,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.8, 300 sec: 43653.6). Total num frames: 1607598080. Throughput: 0: 10956.8. Samples: 401975808. 
Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:37:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:37:56,701][1653645] Updated weights for policy 0, policy_version 785008 (0.0012) [2024-06-15 21:37:58,489][1653645] Updated weights for policy 0, policy_version 785080 (0.0015) [2024-06-15 21:38:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1607860224. Throughput: 0: 11025.1. Samples: 402044416. Policy #0 lag: (min: 79.0, avg: 143.9, max: 323.0) [2024-06-15 21:38:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:38:02,136][1653645] Updated weights for policy 0, policy_version 785128 (0.0012) [2024-06-15 21:38:04,157][1653645] Updated weights for policy 0, policy_version 785153 (0.0021) [2024-06-15 21:38:05,482][1653645] Updated weights for policy 0, policy_version 785212 (0.0013) [2024-06-15 21:38:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1608122368. Throughput: 0: 10990.9. Samples: 402074624. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:38:08,360][1653645] Updated weights for policy 0, policy_version 785251 (0.0011) [2024-06-15 21:38:10,158][1653645] Updated weights for policy 0, policy_version 785328 (0.0013) [2024-06-15 21:38:10,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1608384512. Throughput: 0: 11104.7. Samples: 402145280. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:38:14,030][1653645] Updated weights for policy 0, policy_version 785398 (0.0015) [2024-06-15 21:38:15,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 1608515584. Throughput: 0: 11013.8. Samples: 402206208. 
Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:15,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:38:16,936][1653645] Updated weights for policy 0, policy_version 785460 (0.0023) [2024-06-15 21:38:20,858][1653645] Updated weights for policy 0, policy_version 785530 (0.0152) [2024-06-15 21:38:20,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1608777728. Throughput: 0: 11036.5. Samples: 402240000. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:38:22,526][1653645] Updated weights for policy 0, policy_version 785596 (0.0012) [2024-06-15 21:38:25,958][1648982] Fps is (10 sec: 45874.2, 60 sec: 44782.8, 300 sec: 43986.8). Total num frames: 1608974336. Throughput: 0: 11081.9. Samples: 402305536. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:38:26,579][1653645] Updated weights for policy 0, policy_version 785660 (0.0016) [2024-06-15 21:38:29,576][1653645] Updated weights for policy 0, policy_version 785713 (0.0032) [2024-06-15 21:38:30,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.9, 300 sec: 43653.6). Total num frames: 1609170944. Throughput: 0: 11002.3. Samples: 402371072. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:38:32,439][1653645] Updated weights for policy 0, policy_version 785788 (0.0121) [2024-06-15 21:38:34,048][1653645] Updated weights for policy 0, policy_version 785845 (0.0010) [2024-06-15 21:38:35,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1609433088. Throughput: 0: 11002.3. Samples: 402400768. 
Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:38:37,130][1653645] Updated weights for policy 0, policy_version 785888 (0.0013) [2024-06-15 21:38:37,275][1651596] Signal inference workers to stop experience collection... (40800 times) [2024-06-15 21:38:37,346][1653645] InferenceWorker_p0-w0: stopping experience collection (40800 times) [2024-06-15 21:38:37,560][1651596] Signal inference workers to resume experience collection... (40800 times) [2024-06-15 21:38:37,560][1653645] InferenceWorker_p0-w0: resuming experience collection (40800 times) [2024-06-15 21:38:40,351][1653645] Updated weights for policy 0, policy_version 785968 (0.0017) [2024-06-15 21:38:40,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.2, 300 sec: 43986.9). Total num frames: 1609695232. Throughput: 0: 11161.6. Samples: 402478080. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:40,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:38:42,900][1653645] Updated weights for policy 0, policy_version 786004 (0.0013) [2024-06-15 21:38:44,512][1653645] Updated weights for policy 0, policy_version 786064 (0.0039) [2024-06-15 21:38:45,808][1653645] Updated weights for policy 0, policy_version 786112 (0.0015) [2024-06-15 21:38:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1609957376. Throughput: 0: 10922.7. Samples: 402535936. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:45,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:38:49,815][1653645] Updated weights for policy 0, policy_version 786173 (0.0020) [2024-06-15 21:38:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 1610088448. Throughput: 0: 11207.1. Samples: 402578944. 
Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:38:54,779][1653645] Updated weights for policy 0, policy_version 786245 (0.0020) [2024-06-15 21:38:55,958][1648982] Fps is (10 sec: 36042.6, 60 sec: 45328.6, 300 sec: 43875.7). Total num frames: 1610317824. Throughput: 0: 11013.5. Samples: 402640896. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:38:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:38:56,410][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000786304_1610350592.pth... [2024-06-15 21:38:56,411][1653645] Updated weights for policy 0, policy_version 786304 (0.0012) [2024-06-15 21:38:56,554][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000781184_1599864832.pth [2024-06-15 21:38:58,007][1653645] Updated weights for policy 0, policy_version 786367 (0.0012) [2024-06-15 21:39:00,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 1610514432. Throughput: 0: 11150.2. Samples: 402707968. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:39:01,721][1653645] Updated weights for policy 0, policy_version 786423 (0.0012) [2024-06-15 21:39:04,578][1653645] Updated weights for policy 0, policy_version 786452 (0.0017) [2024-06-15 21:39:05,958][1648982] Fps is (10 sec: 42601.4, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 1610743808. Throughput: 0: 11184.4. Samples: 402743296. 
Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:39:06,397][1653645] Updated weights for policy 0, policy_version 786500 (0.0012) [2024-06-15 21:39:08,340][1653645] Updated weights for policy 0, policy_version 786576 (0.0270) [2024-06-15 21:39:10,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1611005952. Throughput: 0: 10979.6. Samples: 402799616. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:39:11,938][1653645] Updated weights for policy 0, policy_version 786640 (0.0012) [2024-06-15 21:39:15,881][1653645] Updated weights for policy 0, policy_version 786704 (0.0014) [2024-06-15 21:39:15,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.9, 300 sec: 43653.6). Total num frames: 1611169792. Throughput: 0: 11309.5. Samples: 402880000. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:39:18,277][1653645] Updated weights for policy 0, policy_version 786756 (0.0015) [2024-06-15 21:39:19,346][1653645] Updated weights for policy 0, policy_version 786804 (0.0011) [2024-06-15 21:39:20,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 1611497472. Throughput: 0: 11264.0. Samples: 402907648. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:20,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:39:20,974][1653645] Updated weights for policy 0, policy_version 786874 (0.0022) [2024-06-15 21:39:22,984][1651596] Signal inference workers to stop experience collection... (40850 times) [2024-06-15 21:39:23,036][1653645] InferenceWorker_p0-w0: stopping experience collection (40850 times) [2024-06-15 21:39:23,308][1651596] Signal inference workers to resume experience collection... 
(40850 times) [2024-06-15 21:39:23,309][1653645] InferenceWorker_p0-w0: resuming experience collection (40850 times) [2024-06-15 21:39:23,914][1653645] Updated weights for policy 0, policy_version 786918 (0.0012) [2024-06-15 21:39:25,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 44783.2, 300 sec: 43875.8). Total num frames: 1611661312. Throughput: 0: 11093.3. Samples: 402977280. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:39:28,124][1653645] Updated weights for policy 0, policy_version 786976 (0.0022) [2024-06-15 21:39:30,690][1653645] Updated weights for policy 0, policy_version 787040 (0.0037) [2024-06-15 21:39:30,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44782.8, 300 sec: 43764.7). Total num frames: 1611857920. Throughput: 0: 11286.7. Samples: 403043840. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:39:32,729][1653645] Updated weights for policy 0, policy_version 787120 (0.0103) [2024-06-15 21:39:35,870][1653645] Updated weights for policy 0, policy_version 787184 (0.0025) [2024-06-15 21:39:35,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 1612152832. Throughput: 0: 10934.0. Samples: 403070976. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:39:40,018][1653645] Updated weights for policy 0, policy_version 787232 (0.0015) [2024-06-15 21:39:40,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1612316672. Throughput: 0: 11241.4. Samples: 403146752. 
Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:40,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:39:42,363][1653645] Updated weights for policy 0, policy_version 787296 (0.0013) [2024-06-15 21:39:44,455][1653645] Updated weights for policy 0, policy_version 787362 (0.0012) [2024-06-15 21:39:45,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.6, 300 sec: 44320.2). Total num frames: 1612578816. Throughput: 0: 11013.7. Samples: 403203584. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:39:47,562][1653645] Updated weights for policy 0, policy_version 787408 (0.0013) [2024-06-15 21:39:50,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.6, 300 sec: 43654.7). Total num frames: 1612709888. Throughput: 0: 11047.8. Samples: 403240448. Policy #0 lag: (min: 54.0, avg: 169.2, max: 310.0) [2024-06-15 21:39:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:39:51,553][1653645] Updated weights for policy 0, policy_version 787472 (0.0013) [2024-06-15 21:39:54,606][1653645] Updated weights for policy 0, policy_version 787537 (0.0012) [2024-06-15 21:39:55,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44237.2, 300 sec: 43986.9). Total num frames: 1612972032. Throughput: 0: 11195.8. Samples: 403303424. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:39:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:39:56,841][1653645] Updated weights for policy 0, policy_version 787632 (0.0141) [2024-06-15 21:40:00,761][1653645] Updated weights for policy 0, policy_version 787680 (0.0022) [2024-06-15 21:40:00,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44236.7, 300 sec: 43876.1). Total num frames: 1613168640. Throughput: 0: 10831.6. Samples: 403367424. 
Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:40:04,607][1653645] Updated weights for policy 0, policy_version 787744 (0.0012) [2024-06-15 21:40:05,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1613365248. Throughput: 0: 10979.6. Samples: 403401728. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:40:06,999][1653645] Updated weights for policy 0, policy_version 787824 (0.0113) [2024-06-15 21:40:08,024][1653645] Updated weights for policy 0, policy_version 787859 (0.0012) [2024-06-15 21:40:08,336][1651596] Signal inference workers to stop experience collection... (40900 times) [2024-06-15 21:40:08,423][1653645] InferenceWorker_p0-w0: stopping experience collection (40900 times) [2024-06-15 21:40:08,549][1651596] Signal inference workers to resume experience collection... (40900 times) [2024-06-15 21:40:08,550][1653645] InferenceWorker_p0-w0: resuming experience collection (40900 times) [2024-06-15 21:40:08,852][1653645] Updated weights for policy 0, policy_version 787903 (0.0013) [2024-06-15 21:40:10,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1613627392. Throughput: 0: 10843.0. Samples: 403465216. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:40:12,945][1653645] Updated weights for policy 0, policy_version 787957 (0.0015) [2024-06-15 21:40:15,253][1653645] Updated weights for policy 0, policy_version 787988 (0.0012) [2024-06-15 21:40:15,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 1613856768. Throughput: 0: 11150.3. Samples: 403545600. 
Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:40:17,495][1653645] Updated weights for policy 0, policy_version 788086 (0.0173) [2024-06-15 21:40:19,744][1653645] Updated weights for policy 0, policy_version 788144 (0.0014) [2024-06-15 21:40:20,959][1648982] Fps is (10 sec: 52430.1, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1614151680. Throughput: 0: 11104.7. Samples: 403570688. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:20,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:40:24,696][1653645] Updated weights for policy 0, policy_version 788222 (0.0011) [2024-06-15 21:40:25,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1614282752. Throughput: 0: 11082.0. Samples: 403645440. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:40:27,052][1653645] Updated weights for policy 0, policy_version 788272 (0.0013) [2024-06-15 21:40:28,810][1653645] Updated weights for policy 0, policy_version 788350 (0.0015) [2024-06-15 21:40:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45329.2, 300 sec: 44097.9). Total num frames: 1614577664. Throughput: 0: 11218.5. Samples: 403708416. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:40:31,645][1653645] Updated weights for policy 0, policy_version 788404 (0.0025) [2024-06-15 21:40:35,958][1648982] Fps is (10 sec: 49150.7, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1614774272. Throughput: 0: 11184.3. Samples: 403743744. 
Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:35,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:40:36,182][1653645] Updated weights for policy 0, policy_version 788480 (0.0016) [2024-06-15 21:40:38,616][1653645] Updated weights for policy 0, policy_version 788543 (0.0013) [2024-06-15 21:40:39,785][1653645] Updated weights for policy 0, policy_version 788580 (0.0013) [2024-06-15 21:40:40,958][1648982] Fps is (10 sec: 49150.8, 60 sec: 45875.0, 300 sec: 44320.1). Total num frames: 1615069184. Throughput: 0: 11275.3. Samples: 403810816. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:40,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:40:42,336][1653645] Updated weights for policy 0, policy_version 788626 (0.0015) [2024-06-15 21:40:45,958][1648982] Fps is (10 sec: 42599.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1615200256. Throughput: 0: 11468.9. Samples: 403883520. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:40:46,708][1653645] Updated weights for policy 0, policy_version 788673 (0.0012) [2024-06-15 21:40:47,978][1653645] Updated weights for policy 0, policy_version 788736 (0.0016) [2024-06-15 21:40:50,816][1653645] Updated weights for policy 0, policy_version 788805 (0.0014) [2024-06-15 21:40:50,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 45875.2, 300 sec: 44320.2). Total num frames: 1615462400. Throughput: 0: 11446.0. Samples: 403916800. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:50,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:40:51,948][1653645] Updated weights for policy 0, policy_version 788858 (0.0013) [2024-06-15 21:40:53,779][1651596] Signal inference workers to stop experience collection... 
(40950 times) [2024-06-15 21:40:53,811][1653645] InferenceWorker_p0-w0: stopping experience collection (40950 times) [2024-06-15 21:40:54,061][1651596] Signal inference workers to resume experience collection... (40950 times) [2024-06-15 21:40:54,062][1653645] InferenceWorker_p0-w0: resuming experience collection (40950 times) [2024-06-15 21:40:54,725][1653645] Updated weights for policy 0, policy_version 788919 (0.0013) [2024-06-15 21:40:55,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1615724544. Throughput: 0: 11434.7. Samples: 403979776. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:40:55,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 21:40:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000788928_1615724544.pth... [2024-06-15 21:40:56,026][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000783712_1605042176.pth [2024-06-15 21:40:59,243][1653645] Updated weights for policy 0, policy_version 788977 (0.0012) [2024-06-15 21:41:00,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1615855616. Throughput: 0: 11195.7. Samples: 404049408. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:41:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:41:02,511][1653645] Updated weights for policy 0, policy_version 789056 (0.0013) [2024-06-15 21:41:03,562][1653645] Updated weights for policy 0, policy_version 789092 (0.0016) [2024-06-15 21:41:04,109][1653645] Updated weights for policy 0, policy_version 789118 (0.0010) [2024-06-15 21:41:05,960][1648982] Fps is (10 sec: 49152.6, 60 sec: 47513.5, 300 sec: 44320.1). Total num frames: 1616216064. Throughput: 0: 11332.3. Samples: 404080640. 
Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:41:05,960][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 21:41:06,123][1653645] Updated weights for policy 0, policy_version 789175 (0.0012) [2024-06-15 21:41:10,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44783.1, 300 sec: 44320.1). Total num frames: 1616314368. Throughput: 0: 11275.4. Samples: 404152832. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:41:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:41:11,293][1653645] Updated weights for policy 0, policy_version 789239 (0.0213) [2024-06-15 21:41:13,905][1653645] Updated weights for policy 0, policy_version 789296 (0.0011) [2024-06-15 21:41:15,807][1653645] Updated weights for policy 0, policy_version 789370 (0.0012) [2024-06-15 21:41:15,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 46421.3, 300 sec: 44209.0). Total num frames: 1616642048. Throughput: 0: 11207.1. Samples: 404212736. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:41:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:41:18,014][1653645] Updated weights for policy 0, policy_version 789433 (0.0011) [2024-06-15 21:41:20,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1616773120. Throughput: 0: 11161.7. Samples: 404246016. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:41:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:41:22,918][1653645] Updated weights for policy 0, policy_version 789502 (0.0022) [2024-06-15 21:41:25,958][1648982] Fps is (10 sec: 29491.1, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1616936960. Throughput: 0: 11127.5. Samples: 404311552. 
Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:41:25,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:41:26,851][1653645] Updated weights for policy 0, policy_version 789555 (0.0012) [2024-06-15 21:41:28,239][1653645] Updated weights for policy 0, policy_version 789616 (0.0010) [2024-06-15 21:41:29,944][1653645] Updated weights for policy 0, policy_version 789688 (0.0011) [2024-06-15 21:41:30,958][1648982] Fps is (10 sec: 52426.7, 60 sec: 45328.9, 300 sec: 44431.2). Total num frames: 1617297408. Throughput: 0: 10956.7. Samples: 404376576. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:41:30,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:41:34,830][1653645] Updated weights for policy 0, policy_version 789733 (0.0013) [2024-06-15 21:41:35,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 44237.0, 300 sec: 44320.1). Total num frames: 1617428480. Throughput: 0: 11002.3. Samples: 404411904. Policy #0 lag: (min: 10.0, avg: 103.2, max: 266.0) [2024-06-15 21:41:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:41:37,722][1653645] Updated weights for policy 0, policy_version 789779 (0.0028) [2024-06-15 21:41:39,093][1651596] Signal inference workers to stop experience collection... (41000 times) [2024-06-15 21:41:39,159][1653645] InferenceWorker_p0-w0: stopping experience collection (41000 times) [2024-06-15 21:41:39,372][1651596] Signal inference workers to resume experience collection... (41000 times) [2024-06-15 21:41:39,382][1653645] InferenceWorker_p0-w0: resuming experience collection (41000 times) [2024-06-15 21:41:39,384][1653645] Updated weights for policy 0, policy_version 789856 (0.0012) [2024-06-15 21:41:40,959][1648982] Fps is (10 sec: 45876.4, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 1617756160. Throughput: 0: 11070.6. Samples: 404477952. 
Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:41:40,961][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:41:41,000][1653645] Updated weights for policy 0, policy_version 789922 (0.0014) [2024-06-15 21:41:45,713][1653645] Updated weights for policy 0, policy_version 789957 (0.0035) [2024-06-15 21:41:45,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 1617854464. Throughput: 0: 11070.6. Samples: 404547584. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:41:45,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:41:46,921][1653645] Updated weights for policy 0, policy_version 790016 (0.0030) [2024-06-15 21:41:50,591][1653645] Updated weights for policy 0, policy_version 790083 (0.0013) [2024-06-15 21:41:50,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 44542.3). Total num frames: 1618116608. Throughput: 0: 11138.9. Samples: 404581888. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:41:50,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:41:52,364][1653645] Updated weights for policy 0, policy_version 790160 (0.0011) [2024-06-15 21:41:55,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1618345984. Throughput: 0: 11002.3. Samples: 404647936. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:41:55,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:41:57,187][1653645] Updated weights for policy 0, policy_version 790210 (0.0034) [2024-06-15 21:42:00,236][1653645] Updated weights for policy 0, policy_version 790273 (0.0021) [2024-06-15 21:42:00,962][1648982] Fps is (10 sec: 42579.5, 60 sec: 44779.7, 300 sec: 44208.4). Total num frames: 1618542592. Throughput: 0: 11262.9. Samples: 404719616. 
Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:00,963][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:42:03,202][1653645] Updated weights for policy 0, policy_version 790375 (0.0101) [2024-06-15 21:42:04,485][1653645] Updated weights for policy 0, policy_version 790434 (0.0018) [2024-06-15 21:42:05,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 44236.7, 300 sec: 44542.2). Total num frames: 1618870272. Throughput: 0: 11081.9. Samples: 404744704. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:42:09,879][1653645] Updated weights for policy 0, policy_version 790480 (0.0014) [2024-06-15 21:42:10,958][1648982] Fps is (10 sec: 42617.4, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1618968576. Throughput: 0: 11138.8. Samples: 404812800. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:42:12,864][1653645] Updated weights for policy 0, policy_version 790544 (0.0016) [2024-06-15 21:42:13,950][1653645] Updated weights for policy 0, policy_version 790592 (0.0030) [2024-06-15 21:42:15,707][1653645] Updated weights for policy 0, policy_version 790646 (0.0012) [2024-06-15 21:42:15,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1619263488. Throughput: 0: 11036.5. Samples: 404873216. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:42:17,074][1653645] Updated weights for policy 0, policy_version 790713 (0.0013) [2024-06-15 21:42:20,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1619394560. Throughput: 0: 10990.9. Samples: 404906496. 
Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:42:22,670][1653645] Updated weights for policy 0, policy_version 790768 (0.0013) [2024-06-15 21:42:25,355][1651596] Signal inference workers to stop experience collection... (41050 times) [2024-06-15 21:42:25,453][1653645] InferenceWorker_p0-w0: stopping experience collection (41050 times) [2024-06-15 21:42:25,612][1651596] Signal inference workers to resume experience collection... (41050 times) [2024-06-15 21:42:25,613][1653645] InferenceWorker_p0-w0: resuming experience collection (41050 times) [2024-06-15 21:42:25,615][1653645] Updated weights for policy 0, policy_version 790832 (0.0012) [2024-06-15 21:42:25,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44782.9, 300 sec: 44320.2). Total num frames: 1619623936. Throughput: 0: 11025.1. Samples: 404974080. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:25,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 21:42:26,770][1653645] Updated weights for policy 0, policy_version 790865 (0.0011) [2024-06-15 21:42:27,955][1653645] Updated weights for policy 0, policy_version 790914 (0.0012) [2024-06-15 21:42:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1619918848. Throughput: 0: 10945.4. Samples: 405040128. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:42:33,281][1653645] Updated weights for policy 0, policy_version 790981 (0.0012) [2024-06-15 21:42:34,454][1653645] Updated weights for policy 0, policy_version 791038 (0.0012) [2024-06-15 21:42:35,958][1648982] Fps is (10 sec: 42597.2, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 1620049920. Throughput: 0: 10934.0. Samples: 405073920. 
Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:35,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:42:37,770][1653645] Updated weights for policy 0, policy_version 791088 (0.0012) [2024-06-15 21:42:38,919][1653645] Updated weights for policy 0, policy_version 791144 (0.0110) [2024-06-15 21:42:39,344][1653645] Updated weights for policy 0, policy_version 791168 (0.0018) [2024-06-15 21:42:40,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1620410368. Throughput: 0: 10968.2. Samples: 405141504. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:42:41,061][1653645] Updated weights for policy 0, policy_version 791230 (0.0013) [2024-06-15 21:42:45,958][1648982] Fps is (10 sec: 49153.6, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1620541440. Throughput: 0: 10935.1. Samples: 405211648. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:42:46,087][1653645] Updated weights for policy 0, policy_version 791295 (0.0014) [2024-06-15 21:42:48,635][1653645] Updated weights for policy 0, policy_version 791344 (0.0014) [2024-06-15 21:42:50,284][1653645] Updated weights for policy 0, policy_version 791418 (0.0015) [2024-06-15 21:42:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1620836352. Throughput: 0: 11127.5. Samples: 405245440. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:50,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:42:52,836][1653645] Updated weights for policy 0, policy_version 791478 (0.0012) [2024-06-15 21:42:55,958][1648982] Fps is (10 sec: 42596.6, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1620967424. Throughput: 0: 11150.1. Samples: 405314560. 
Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:42:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:42:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000791488_1620967424.pth... [2024-06-15 21:42:56,169][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000786304_1610350592.pth [2024-06-15 21:42:57,440][1653645] Updated weights for policy 0, policy_version 791545 (0.0146) [2024-06-15 21:42:59,986][1653645] Updated weights for policy 0, policy_version 791613 (0.0105) [2024-06-15 21:43:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44786.2, 300 sec: 44431.2). Total num frames: 1621229568. Throughput: 0: 11275.4. Samples: 405380608. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:43:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:43:02,009][1653645] Updated weights for policy 0, policy_version 791671 (0.0011) [2024-06-15 21:43:03,760][1653645] Updated weights for policy 0, policy_version 791712 (0.0014) [2024-06-15 21:43:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1621491712. Throughput: 0: 11389.1. Samples: 405419008. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:43:05,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:43:06,916][1653645] Updated weights for policy 0, policy_version 791746 (0.0012) [2024-06-15 21:43:08,225][1653645] Updated weights for policy 0, policy_version 791799 (0.0012) [2024-06-15 21:43:10,932][1653645] Updated weights for policy 0, policy_version 791840 (0.0057) [2024-06-15 21:43:10,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45329.1, 300 sec: 44653.4). Total num frames: 1621688320. Throughput: 0: 11400.5. Samples: 405487104. 
Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:43:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:43:11,019][1651596] Signal inference workers to stop experience collection... (41100 times) [2024-06-15 21:43:11,069][1653645] InferenceWorker_p0-w0: stopping experience collection (41100 times) [2024-06-15 21:43:11,278][1651596] Signal inference workers to resume experience collection... (41100 times) [2024-06-15 21:43:11,279][1653645] InferenceWorker_p0-w0: resuming experience collection (41100 times) [2024-06-15 21:43:13,737][1653645] Updated weights for policy 0, policy_version 791920 (0.0013) [2024-06-15 21:43:14,936][1653645] Updated weights for policy 0, policy_version 791952 (0.0011) [2024-06-15 21:43:15,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1621983232. Throughput: 0: 11343.6. Samples: 405550592. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:43:15,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:43:16,191][1653645] Updated weights for policy 0, policy_version 791999 (0.0016) [2024-06-15 21:43:19,960][1653645] Updated weights for policy 0, policy_version 792058 (0.0149) [2024-06-15 21:43:20,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 45875.2, 300 sec: 44653.4). Total num frames: 1622147072. Throughput: 0: 11400.6. Samples: 405586944. Policy #0 lag: (min: 18.0, avg: 107.4, max: 274.0) [2024-06-15 21:43:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:43:23,668][1653645] Updated weights for policy 0, policy_version 792126 (0.0013) [2024-06-15 21:43:25,855][1653645] Updated weights for policy 0, policy_version 792185 (0.0013) [2024-06-15 21:43:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 45875.1, 300 sec: 44764.4). Total num frames: 1622376448. Throughput: 0: 11286.7. Samples: 405649408. 
Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:43:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:43:28,172][1653645] Updated weights for policy 0, policy_version 792254 (0.0013) [2024-06-15 21:43:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1622540288. Throughput: 0: 11150.2. Samples: 405713408. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:43:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:43:33,091][1653645] Updated weights for policy 0, policy_version 792314 (0.0014) [2024-06-15 21:43:34,447][1653645] Updated weights for policy 0, policy_version 792352 (0.0013) [2024-06-15 21:43:35,823][1653645] Updated weights for policy 0, policy_version 792400 (0.0012) [2024-06-15 21:43:35,960][1648982] Fps is (10 sec: 45875.3, 60 sec: 46421.5, 300 sec: 44542.3). Total num frames: 1622835200. Throughput: 0: 11207.1. Samples: 405749760. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:43:35,963][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 21:43:38,946][1653645] Updated weights for policy 0, policy_version 792464 (0.0059) [2024-06-15 21:43:39,796][1653645] Updated weights for policy 0, policy_version 792507 (0.0021) [2024-06-15 21:43:40,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1623064576. Throughput: 0: 11082.1. Samples: 405813248. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:43:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:43:44,367][1653645] Updated weights for policy 0, policy_version 792572 (0.0012) [2024-06-15 21:43:45,895][1653645] Updated weights for policy 0, policy_version 792624 (0.0013) [2024-06-15 21:43:45,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1623293952. Throughput: 0: 11298.1. Samples: 405889024. 
Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:43:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:43:47,371][1653645] Updated weights for policy 0, policy_version 792662 (0.0015) [2024-06-15 21:43:49,688][1653645] Updated weights for policy 0, policy_version 792706 (0.0018) [2024-06-15 21:43:50,955][1653645] Updated weights for policy 0, policy_version 792763 (0.0013) [2024-06-15 21:43:50,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 45329.0, 300 sec: 44875.6). Total num frames: 1623556096. Throughput: 0: 11150.2. Samples: 405920768. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:43:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:43:55,834][1653645] Updated weights for policy 0, policy_version 792827 (0.0012) [2024-06-15 21:43:55,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45875.5, 300 sec: 44764.4). Total num frames: 1623719936. Throughput: 0: 11252.6. Samples: 405993472. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:43:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:43:57,190][1653645] Updated weights for policy 0, policy_version 792865 (0.0010) [2024-06-15 21:43:57,976][1651596] Signal inference workers to stop experience collection... (41150 times) [2024-06-15 21:43:58,007][1653645] InferenceWorker_p0-w0: stopping experience collection (41150 times) [2024-06-15 21:43:58,336][1651596] Signal inference workers to resume experience collection... (41150 times) [2024-06-15 21:43:58,346][1653645] InferenceWorker_p0-w0: resuming experience collection (41150 times) [2024-06-15 21:43:58,574][1653645] Updated weights for policy 0, policy_version 792919 (0.0142) [2024-06-15 21:44:00,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1623982080. Throughput: 0: 11366.4. Samples: 406062080. 
Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:44:01,880][1653645] Updated weights for policy 0, policy_version 792992 (0.0012) [2024-06-15 21:44:05,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1624113152. Throughput: 0: 11195.7. Samples: 406090752. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:05,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:44:06,614][1653645] Updated weights for policy 0, policy_version 793041 (0.0013) [2024-06-15 21:44:08,652][1653645] Updated weights for policy 0, policy_version 793108 (0.0013) [2024-06-15 21:44:09,960][1653645] Updated weights for policy 0, policy_version 793153 (0.0014) [2024-06-15 21:44:10,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45875.1, 300 sec: 44986.6). Total num frames: 1624440832. Throughput: 0: 11332.3. Samples: 406159360. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:44:11,399][1653645] Updated weights for policy 0, policy_version 793215 (0.0015) [2024-06-15 21:44:15,184][1653645] Updated weights for policy 0, policy_version 793280 (0.0014) [2024-06-15 21:44:15,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 1624637440. Throughput: 0: 11286.7. Samples: 406221312. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:15,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:44:20,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1624768512. Throughput: 0: 11286.8. Samples: 406257664. 
Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:44:21,533][1653645] Updated weights for policy 0, policy_version 793362 (0.0012) [2024-06-15 21:44:22,894][1653645] Updated weights for policy 0, policy_version 793429 (0.0012) [2024-06-15 21:44:25,582][1653645] Updated weights for policy 0, policy_version 793479 (0.0134) [2024-06-15 21:44:25,960][1648982] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 44764.4). Total num frames: 1625063424. Throughput: 0: 11275.4. Samples: 406320640. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:25,961][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 21:44:26,752][1653645] Updated weights for policy 0, policy_version 793536 (0.0012) [2024-06-15 21:44:30,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1625260032. Throughput: 0: 11082.0. Samples: 406387712. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:44:31,085][1653645] Updated weights for policy 0, policy_version 793598 (0.0012) [2024-06-15 21:44:34,590][1653645] Updated weights for policy 0, policy_version 793667 (0.0012) [2024-06-15 21:44:35,964][1648982] Fps is (10 sec: 49119.2, 60 sec: 45324.0, 300 sec: 44874.5). Total num frames: 1625554944. Throughput: 0: 11216.8. Samples: 406425600. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:35,965][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:44:36,003][1653645] Updated weights for policy 0, policy_version 793728 (0.0012) [2024-06-15 21:44:38,796][1653645] Updated weights for policy 0, policy_version 793779 (0.0033) [2024-06-15 21:44:40,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 43690.4, 300 sec: 44431.2). Total num frames: 1625686016. Throughput: 0: 10956.7. Samples: 406486528. 
Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:40,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:44:41,694][1653645] Updated weights for policy 0, policy_version 793812 (0.0013) [2024-06-15 21:44:44,741][1653645] Updated weights for policy 0, policy_version 793857 (0.0022) [2024-06-15 21:44:45,935][1653645] Updated weights for policy 0, policy_version 793904 (0.0011) [2024-06-15 21:44:45,958][1648982] Fps is (10 sec: 36069.1, 60 sec: 43690.7, 300 sec: 44764.4). Total num frames: 1625915392. Throughput: 0: 10956.8. Samples: 406555136. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:44:46,270][1651596] Signal inference workers to stop experience collection... (41200 times) [2024-06-15 21:44:46,304][1653645] InferenceWorker_p0-w0: stopping experience collection (41200 times) [2024-06-15 21:44:46,384][1651596] Signal inference workers to resume experience collection... (41200 times) [2024-06-15 21:44:46,385][1653645] InferenceWorker_p0-w0: resuming experience collection (41200 times) [2024-06-15 21:44:46,972][1653645] Updated weights for policy 0, policy_version 793952 (0.0013) [2024-06-15 21:44:49,946][1653645] Updated weights for policy 0, policy_version 793987 (0.0014) [2024-06-15 21:44:50,958][1648982] Fps is (10 sec: 49153.5, 60 sec: 43690.8, 300 sec: 44764.4). Total num frames: 1626177536. Throughput: 0: 11036.5. Samples: 406587392. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:44:51,033][1653645] Updated weights for policy 0, policy_version 794041 (0.0013) [2024-06-15 21:44:54,232][1653645] Updated weights for policy 0, policy_version 794101 (0.0064) [2024-06-15 21:44:55,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 1626341376. Throughput: 0: 10934.0. Samples: 406651392. 
Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:44:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:44:56,419][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000794128_1626374144.pth... [2024-06-15 21:44:56,538][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000788928_1615724544.pth [2024-06-15 21:44:57,271][1653645] Updated weights for policy 0, policy_version 794167 (0.0013) [2024-06-15 21:44:58,765][1653645] Updated weights for policy 0, policy_version 794211 (0.0013) [2024-06-15 21:45:00,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1626603520. Throughput: 0: 11184.4. Samples: 406724608. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:45:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:45:01,981][1653645] Updated weights for policy 0, policy_version 794258 (0.0012) [2024-06-15 21:45:05,071][1653645] Updated weights for policy 0, policy_version 794320 (0.0014) [2024-06-15 21:45:05,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 45329.1, 300 sec: 44764.5). Total num frames: 1626832896. Throughput: 0: 11025.1. Samples: 406753792. Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:45:05,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:45:07,978][1653645] Updated weights for policy 0, policy_version 794379 (0.0014) [2024-06-15 21:45:10,252][1653645] Updated weights for policy 0, policy_version 794464 (0.0011) [2024-06-15 21:45:10,958][1648982] Fps is (10 sec: 49148.5, 60 sec: 44236.3, 300 sec: 44875.4). Total num frames: 1627095040. Throughput: 0: 11138.7. Samples: 406821888. 
Policy #0 lag: (min: 15.0, avg: 130.6, max: 271.0) [2024-06-15 21:45:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:45:13,346][1653645] Updated weights for policy 0, policy_version 794512 (0.0012) [2024-06-15 21:45:15,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1627258880. Throughput: 0: 11093.3. Samples: 406886912. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:45:16,961][1653645] Updated weights for policy 0, policy_version 794584 (0.0013) [2024-06-15 21:45:20,434][1653645] Updated weights for policy 0, policy_version 794656 (0.0013) [2024-06-15 21:45:20,959][1648982] Fps is (10 sec: 39324.6, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1627488256. Throughput: 0: 10935.7. Samples: 406917632. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:20,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:45:22,365][1653645] Updated weights for policy 0, policy_version 794708 (0.0021) [2024-06-15 21:45:25,669][1653645] Updated weights for policy 0, policy_version 794755 (0.0011) [2024-06-15 21:45:25,959][1648982] Fps is (10 sec: 42593.7, 60 sec: 43689.9, 300 sec: 44431.0). Total num frames: 1627684864. Throughput: 0: 11115.9. Samples: 406986752. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:45:26,906][1653645] Updated weights for policy 0, policy_version 794816 (0.0129) [2024-06-15 21:45:29,561][1653645] Updated weights for policy 0, policy_version 794880 (0.0015) [2024-06-15 21:45:30,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 1627914240. Throughput: 0: 11013.7. Samples: 407050752. 
Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:45:33,810][1653645] Updated weights for policy 0, policy_version 794944 (0.0021) [2024-06-15 21:45:33,954][1651596] Signal inference workers to stop experience collection... (41250 times) [2024-06-15 21:45:33,996][1653645] InferenceWorker_p0-w0: stopping experience collection (41250 times) [2024-06-15 21:45:34,264][1651596] Signal inference workers to resume experience collection... (41250 times) [2024-06-15 21:45:34,265][1653645] InferenceWorker_p0-w0: resuming experience collection (41250 times) [2024-06-15 21:45:35,274][1653645] Updated weights for policy 0, policy_version 795002 (0.0011) [2024-06-15 21:45:35,958][1648982] Fps is (10 sec: 49157.3, 60 sec: 43695.6, 300 sec: 44431.2). Total num frames: 1628176384. Throughput: 0: 11013.7. Samples: 407083008. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:45:37,888][1653645] Updated weights for policy 0, policy_version 795056 (0.0010) [2024-06-15 21:45:39,813][1653645] Updated weights for policy 0, policy_version 795088 (0.0140) [2024-06-15 21:45:40,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1628438528. Throughput: 0: 11252.6. Samples: 407157760. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:45:44,235][1653645] Updated weights for policy 0, policy_version 795137 (0.0084) [2024-06-15 21:45:45,639][1653645] Updated weights for policy 0, policy_version 795200 (0.0013) [2024-06-15 21:45:45,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1628569600. Throughput: 0: 11093.3. Samples: 407223808. 
Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:45:46,822][1653645] Updated weights for policy 0, policy_version 795257 (0.0010) [2024-06-15 21:45:49,101][1653645] Updated weights for policy 0, policy_version 795312 (0.0015) [2024-06-15 21:45:50,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1628831744. Throughput: 0: 11184.3. Samples: 407257088. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:45:52,323][1653645] Updated weights for policy 0, policy_version 795376 (0.0012) [2024-06-15 21:45:55,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 43690.5, 300 sec: 44431.1). Total num frames: 1628962816. Throughput: 0: 11173.1. Samples: 407324672. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:45:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:45:57,753][1653645] Updated weights for policy 0, policy_version 795461 (0.0014) [2024-06-15 21:45:58,978][1653645] Updated weights for policy 0, policy_version 795508 (0.0139) [2024-06-15 21:46:00,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 1629290496. Throughput: 0: 11082.0. Samples: 407385600. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:46:01,337][1653645] Updated weights for policy 0, policy_version 795568 (0.0011) [2024-06-15 21:46:03,354][1653645] Updated weights for policy 0, policy_version 795616 (0.0012) [2024-06-15 21:46:05,958][1648982] Fps is (10 sec: 52430.9, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 1629487104. Throughput: 0: 11104.7. Samples: 407417344. 
Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:46:08,444][1653645] Updated weights for policy 0, policy_version 795650 (0.0049) [2024-06-15 21:46:10,841][1653645] Updated weights for policy 0, policy_version 795744 (0.0013) [2024-06-15 21:46:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43145.0, 300 sec: 44209.0). Total num frames: 1629683712. Throughput: 0: 11070.8. Samples: 407484928. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:10,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:46:13,416][1653645] Updated weights for policy 0, policy_version 795812 (0.0015) [2024-06-15 21:46:15,675][1653645] Updated weights for policy 0, policy_version 795888 (0.0014) [2024-06-15 21:46:15,958][1648982] Fps is (10 sec: 49151.7, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1629978624. Throughput: 0: 10888.6. Samples: 407540736. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:46:20,958][1648982] Fps is (10 sec: 32767.2, 60 sec: 42052.0, 300 sec: 44320.1). Total num frames: 1630011392. Throughput: 0: 10945.4. Samples: 407575552. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:20,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 21:46:22,135][1651596] Signal inference workers to stop experience collection... (41300 times) [2024-06-15 21:46:22,310][1651596] Signal inference workers to resume experience collection... 
(41300 times) [2024-06-15 21:46:22,322][1653645] InferenceWorker_p0-w0: stopping experience collection (41300 times) [2024-06-15 21:46:22,336][1653645] Updated weights for policy 0, policy_version 795952 (0.0032) [2024-06-15 21:46:22,348][1653645] InferenceWorker_p0-w0: resuming experience collection (41300 times) [2024-06-15 21:46:23,777][1653645] Updated weights for policy 0, policy_version 796016 (0.0139) [2024-06-15 21:46:25,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44783.8, 300 sec: 44320.2). Total num frames: 1630371840. Throughput: 0: 10808.9. Samples: 407644160. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:46:26,105][1653645] Updated weights for policy 0, policy_version 796096 (0.0012) [2024-06-15 21:46:27,950][1653645] Updated weights for policy 0, policy_version 796159 (0.0013) [2024-06-15 21:46:30,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1630535680. Throughput: 0: 10661.0. Samples: 407703552. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:30,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:46:35,739][1653645] Updated weights for policy 0, policy_version 796211 (0.0012) [2024-06-15 21:46:35,958][1648982] Fps is (10 sec: 26214.1, 60 sec: 40960.0, 300 sec: 43653.6). Total num frames: 1630633984. Throughput: 0: 10752.0. Samples: 407740928. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:46:37,599][1653645] Updated weights for policy 0, policy_version 796284 (0.0103) [2024-06-15 21:46:39,220][1653645] Updated weights for policy 0, policy_version 796352 (0.0017) [2024-06-15 21:46:40,729][1653645] Updated weights for policy 0, policy_version 796415 (0.0013) [2024-06-15 21:46:40,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.8, 300 sec: 44764.4). Total num frames: 1631059968. 
Throughput: 0: 10444.9. Samples: 407794688. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:40,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:46:45,958][1648982] Fps is (10 sec: 42596.9, 60 sec: 41505.9, 300 sec: 43875.7). Total num frames: 1631059968. Throughput: 0: 10683.6. Samples: 407866368. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:45,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:46:48,213][1653645] Updated weights for policy 0, policy_version 796465 (0.0014) [2024-06-15 21:46:50,152][1653645] Updated weights for policy 0, policy_version 796548 (0.0014) [2024-06-15 21:46:50,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1631387648. Throughput: 0: 10672.3. Samples: 407897600. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:50,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:46:51,462][1653645] Updated weights for policy 0, policy_version 796605 (0.0013) [2024-06-15 21:46:52,624][1653645] Updated weights for policy 0, policy_version 796642 (0.0012) [2024-06-15 21:46:55,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.8, 300 sec: 44209.7). Total num frames: 1631584256. Throughput: 0: 10626.8. Samples: 407963136. Policy #0 lag: (min: 15.0, avg: 129.0, max: 271.0) [2024-06-15 21:46:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:46:55,993][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000796672_1631584256.pth... [2024-06-15 21:46:56,090][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000791488_1620967424.pth [2024-06-15 21:46:59,261][1653645] Updated weights for policy 0, policy_version 796674 (0.0013) [2024-06-15 21:47:00,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 40413.9, 300 sec: 43542.6). Total num frames: 1631715328. Throughput: 0: 10888.5. Samples: 408030720. 
Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:47:01,448][1653645] Updated weights for policy 0, policy_version 796768 (0.0088) [2024-06-15 21:47:02,748][1653645] Updated weights for policy 0, policy_version 796819 (0.0012) [2024-06-15 21:47:03,039][1651596] Signal inference workers to stop experience collection... (41350 times) [2024-06-15 21:47:03,078][1653645] InferenceWorker_p0-w0: stopping experience collection (41350 times) [2024-06-15 21:47:03,222][1651596] Signal inference workers to resume experience collection... (41350 times) [2024-06-15 21:47:03,223][1653645] InferenceWorker_p0-w0: resuming experience collection (41350 times) [2024-06-15 21:47:03,723][1653645] Updated weights for policy 0, policy_version 796868 (0.0012) [2024-06-15 21:47:04,830][1653645] Updated weights for policy 0, policy_version 796923 (0.0014) [2024-06-15 21:47:05,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.6, 300 sec: 44542.3). Total num frames: 1632108544. Throughput: 0: 10752.1. Samples: 408059392. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:47:10,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 40960.0, 300 sec: 43653.6). Total num frames: 1632141312. Throughput: 0: 10934.0. Samples: 408136192. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:47:11,432][1653645] Updated weights for policy 0, policy_version 796962 (0.0014) [2024-06-15 21:47:12,907][1653645] Updated weights for policy 0, policy_version 797024 (0.0012) [2024-06-15 21:47:14,976][1653645] Updated weights for policy 0, policy_version 797114 (0.0033) [2024-06-15 21:47:15,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 44542.3). Total num frames: 1632534528. Throughput: 0: 10786.1. Samples: 408188928. 
Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:15,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:47:16,550][1653645] Updated weights for policy 0, policy_version 797176 (0.0013) [2024-06-15 21:47:20,958][1648982] Fps is (10 sec: 49152.7, 60 sec: 43690.9, 300 sec: 44098.0). Total num frames: 1632632832. Throughput: 0: 10797.5. Samples: 408226816. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:47:22,662][1653645] Updated weights for policy 0, policy_version 797205 (0.0012) [2024-06-15 21:47:24,167][1653645] Updated weights for policy 0, policy_version 797264 (0.0068) [2024-06-15 21:47:25,302][1653645] Updated weights for policy 0, policy_version 797312 (0.0012) [2024-06-15 21:47:25,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 44098.0). Total num frames: 1632927744. Throughput: 0: 11082.0. Samples: 408293376. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:47:26,761][1653645] Updated weights for policy 0, policy_version 797376 (0.0013) [2024-06-15 21:47:28,117][1653645] Updated weights for policy 0, policy_version 797431 (0.0023) [2024-06-15 21:47:30,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1633157120. Throughput: 0: 11082.0. Samples: 408365056. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:30,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 21:47:35,120][1653645] Updated weights for policy 0, policy_version 797488 (0.0012) [2024-06-15 21:47:35,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 44783.0, 300 sec: 43764.7). Total num frames: 1633320960. Throughput: 0: 11195.7. Samples: 408401408. 
Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:47:36,975][1653645] Updated weights for policy 0, policy_version 797568 (0.0078) [2024-06-15 21:47:38,448][1653645] Updated weights for policy 0, policy_version 797622 (0.0014) [2024-06-15 21:47:39,988][1653645] Updated weights for policy 0, policy_version 797689 (0.0011) [2024-06-15 21:47:40,966][1648982] Fps is (10 sec: 52387.8, 60 sec: 43684.9, 300 sec: 44541.1). Total num frames: 1633681408. Throughput: 0: 10909.4. Samples: 408454144. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:40,966][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:47:45,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 44237.1, 300 sec: 43653.6). Total num frames: 1633714176. Throughput: 0: 11116.1. Samples: 408530944. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:47:46,670][1653645] Updated weights for policy 0, policy_version 797744 (0.0013) [2024-06-15 21:47:47,687][1651596] Signal inference workers to stop experience collection... (41400 times) [2024-06-15 21:47:47,771][1653645] InferenceWorker_p0-w0: stopping experience collection (41400 times) [2024-06-15 21:47:47,934][1651596] Signal inference workers to resume experience collection... (41400 times) [2024-06-15 21:47:47,934][1653645] InferenceWorker_p0-w0: resuming experience collection (41400 times) [2024-06-15 21:47:48,820][1653645] Updated weights for policy 0, policy_version 797810 (0.0014) [2024-06-15 21:47:50,332][1653645] Updated weights for policy 0, policy_version 797872 (0.0019) [2024-06-15 21:47:50,958][1648982] Fps is (10 sec: 39351.6, 60 sec: 44782.7, 300 sec: 44431.2). Total num frames: 1634074624. Throughput: 0: 11104.6. Samples: 408559104. 
Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:50,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:47:51,641][1653645] Updated weights for policy 0, policy_version 797926 (0.0016) [2024-06-15 21:47:52,264][1653645] Updated weights for policy 0, policy_version 797951 (0.0013) [2024-06-15 21:47:55,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1634205696. Throughput: 0: 10899.9. Samples: 408626688. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:47:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:47:59,747][1653645] Updated weights for policy 0, policy_version 798034 (0.0016) [2024-06-15 21:48:00,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 1634467840. Throughput: 0: 11195.7. Samples: 408692736. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:00,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 21:48:01,236][1653645] Updated weights for policy 0, policy_version 798096 (0.0020) [2024-06-15 21:48:03,475][1653645] Updated weights for policy 0, policy_version 798178 (0.0013) [2024-06-15 21:48:05,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1634729984. Throughput: 0: 10877.1. Samples: 408716288. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:48:09,657][1653645] Updated weights for policy 0, policy_version 798224 (0.0012) [2024-06-15 21:48:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.1, 300 sec: 43653.6). Total num frames: 1634861056. Throughput: 0: 11172.9. Samples: 408796160. 
Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:48:11,848][1653645] Updated weights for policy 0, policy_version 798288 (0.0017) [2024-06-15 21:48:14,252][1653645] Updated weights for policy 0, policy_version 798374 (0.0012) [2024-06-15 21:48:15,958][1648982] Fps is (10 sec: 45872.5, 60 sec: 44236.4, 300 sec: 44208.9). Total num frames: 1635188736. Throughput: 0: 10569.8. Samples: 408840704. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:15,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:48:20,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 1635254272. Throughput: 0: 10478.9. Samples: 408872960. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:20,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:48:22,803][1653645] Updated weights for policy 0, policy_version 798465 (0.0017) [2024-06-15 21:48:24,092][1653645] Updated weights for policy 0, policy_version 798519 (0.0012) [2024-06-15 21:48:25,827][1653645] Updated weights for policy 0, policy_version 798576 (0.0084) [2024-06-15 21:48:25,958][1648982] Fps is (10 sec: 29492.1, 60 sec: 42598.1, 300 sec: 43875.7). Total num frames: 1635483648. Throughput: 0: 10844.9. Samples: 408942080. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:25,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:48:27,165][1653645] Updated weights for policy 0, policy_version 798640 (0.0019) [2024-06-15 21:48:27,629][1651596] Signal inference workers to stop experience collection... (41450 times) [2024-06-15 21:48:27,658][1653645] InferenceWorker_p0-w0: stopping experience collection (41450 times) [2024-06-15 21:48:27,858][1651596] Signal inference workers to resume experience collection... 
(41450 times) [2024-06-15 21:48:27,858][1653645] InferenceWorker_p0-w0: resuming experience collection (41450 times) [2024-06-15 21:48:28,938][1653645] Updated weights for policy 0, policy_version 798720 (0.0012) [2024-06-15 21:48:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 1635778560. Throughput: 0: 10376.6. Samples: 408997888. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:48:35,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 42052.0, 300 sec: 43320.4). Total num frames: 1635844096. Throughput: 0: 10615.5. Samples: 409036800. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:35,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:48:36,959][1653645] Updated weights for policy 0, policy_version 798800 (0.0095) [2024-06-15 21:48:38,847][1653645] Updated weights for policy 0, policy_version 798880 (0.0014) [2024-06-15 21:48:40,776][1653645] Updated weights for policy 0, policy_version 798948 (0.0167) [2024-06-15 21:48:40,957][1648982] Fps is (10 sec: 49152.3, 60 sec: 43150.3, 300 sec: 43986.9). Total num frames: 1636270080. Throughput: 0: 10353.8. Samples: 409092608. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:48:45,958][1648982] Fps is (10 sec: 45876.8, 60 sec: 43144.6, 300 sec: 43209.4). Total num frames: 1636302848. Throughput: 0: 10444.8. Samples: 409162752. Policy #0 lag: (min: 47.0, avg: 94.5, max: 287.0) [2024-06-15 21:48:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:48:47,620][1653645] Updated weights for policy 0, policy_version 798995 (0.0022) [2024-06-15 21:48:49,115][1653645] Updated weights for policy 0, policy_version 799056 (0.0013) [2024-06-15 21:48:50,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 42052.5, 300 sec: 43653.7). Total num frames: 1636597760. 
Throughput: 0: 10797.5. Samples: 409202176. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:48:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:48:51,175][1653645] Updated weights for policy 0, policy_version 799136 (0.0012) [2024-06-15 21:48:52,377][1653645] Updated weights for policy 0, policy_version 799184 (0.0018) [2024-06-15 21:48:53,269][1653645] Updated weights for policy 0, policy_version 799230 (0.0082) [2024-06-15 21:48:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1636827136. Throughput: 0: 10467.6. Samples: 409267200. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:48:55,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:48:55,962][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000799232_1636827136.pth... [2024-06-15 21:48:56,060][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000794128_1626374144.pth [2024-06-15 21:48:59,316][1653645] Updated weights for policy 0, policy_version 799280 (0.0011) [2024-06-15 21:49:00,819][1653645] Updated weights for policy 0, policy_version 799329 (0.0013) [2024-06-15 21:49:00,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 43764.7). Total num frames: 1637023744. Throughput: 0: 11207.3. Samples: 409345024. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:49:02,349][1653645] Updated weights for policy 0, policy_version 799392 (0.0084) [2024-06-15 21:49:04,116][1653645] Updated weights for policy 0, policy_version 799458 (0.0013) [2024-06-15 21:49:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1637351424. Throughput: 0: 11093.3. Samples: 409372160. 
Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:49:09,924][1653645] Updated weights for policy 0, policy_version 799523 (0.0020) [2024-06-15 21:49:10,503][1653645] Updated weights for policy 0, policy_version 799552 (0.0013) [2024-06-15 21:49:10,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1637482496. Throughput: 0: 11116.2. Samples: 409442304. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:49:13,385][1651596] Signal inference workers to stop experience collection... (41500 times) [2024-06-15 21:49:13,422][1653645] InferenceWorker_p0-w0: stopping experience collection (41500 times) [2024-06-15 21:49:13,635][1651596] Signal inference workers to resume experience collection... (41500 times) [2024-06-15 21:49:13,636][1653645] InferenceWorker_p0-w0: resuming experience collection (41500 times) [2024-06-15 21:49:14,071][1653645] Updated weights for policy 0, policy_version 799632 (0.0014) [2024-06-15 21:49:15,429][1653645] Updated weights for policy 0, policy_version 799696 (0.0013) [2024-06-15 21:49:15,958][1648982] Fps is (10 sec: 45872.8, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1637810176. Throughput: 0: 11241.1. Samples: 409503744. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:15,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:49:16,512][1653645] Updated weights for policy 0, policy_version 799741 (0.0014) [2024-06-15 21:49:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 1637875712. Throughput: 0: 11127.5. Samples: 409537536. 
Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:49:22,216][1653645] Updated weights for policy 0, policy_version 799792 (0.0015) [2024-06-15 21:49:24,350][1653645] Updated weights for policy 0, policy_version 799827 (0.0026) [2024-06-15 21:49:25,958][1648982] Fps is (10 sec: 36046.5, 60 sec: 44783.1, 300 sec: 43764.7). Total num frames: 1638170624. Throughput: 0: 11525.6. Samples: 409611264. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:25,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:49:26,124][1653645] Updated weights for policy 0, policy_version 799907 (0.0014) [2024-06-15 21:49:27,990][1653645] Updated weights for policy 0, policy_version 799991 (0.0013) [2024-06-15 21:49:30,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43543.6). Total num frames: 1638400000. Throughput: 0: 11229.9. Samples: 409668096. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:49:33,871][1653645] Updated weights for policy 0, policy_version 800033 (0.0013) [2024-06-15 21:49:35,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44783.2, 300 sec: 43542.6). Total num frames: 1638531072. Throughput: 0: 11195.7. Samples: 409705984. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:49:37,128][1653645] Updated weights for policy 0, policy_version 800112 (0.0011) [2024-06-15 21:49:38,614][1653645] Updated weights for policy 0, policy_version 800176 (0.0015) [2024-06-15 21:49:40,796][1653645] Updated weights for policy 0, policy_version 800251 (0.0014) [2024-06-15 21:49:40,958][1648982] Fps is (10 sec: 52426.6, 60 sec: 44236.4, 300 sec: 44097.9). Total num frames: 1638924288. Throughput: 0: 11059.1. Samples: 409764864. 
Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:49:45,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 1638957056. Throughput: 0: 10786.2. Samples: 409830400. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:49:46,742][1653645] Updated weights for policy 0, policy_version 800313 (0.0013) [2024-06-15 21:49:50,249][1653645] Updated weights for policy 0, policy_version 800384 (0.0067) [2024-06-15 21:49:50,958][1648982] Fps is (10 sec: 29492.5, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 1639219200. Throughput: 0: 10979.6. Samples: 409866240. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:49:52,326][1653645] Updated weights for policy 0, policy_version 800464 (0.0028) [2024-06-15 21:49:55,958][1648982] Fps is (10 sec: 49151.1, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1639448576. Throughput: 0: 10535.8. Samples: 409916416. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:49:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:49:58,126][1651596] Signal inference workers to stop experience collection... (41550 times) [2024-06-15 21:49:58,196][1653645] Updated weights for policy 0, policy_version 800513 (0.0013) [2024-06-15 21:49:58,232][1653645] InferenceWorker_p0-w0: stopping experience collection (41550 times) [2024-06-15 21:49:58,499][1651596] Signal inference workers to resume experience collection... (41550 times) [2024-06-15 21:49:58,500][1653645] InferenceWorker_p0-w0: resuming experience collection (41550 times) [2024-06-15 21:49:59,332][1653645] Updated weights for policy 0, policy_version 800570 (0.0019) [2024-06-15 21:50:00,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 42598.3, 300 sec: 43209.3). 
Total num frames: 1639579648. Throughput: 0: 10809.0. Samples: 409990144. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:50:00,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:50:02,605][1653645] Updated weights for policy 0, policy_version 800640 (0.0012) [2024-06-15 21:50:03,919][1653645] Updated weights for policy 0, policy_version 800704 (0.0016) [2024-06-15 21:50:05,205][1653645] Updated weights for policy 0, policy_version 800761 (0.0012) [2024-06-15 21:50:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.7, 300 sec: 43653.8). Total num frames: 1639972864. Throughput: 0: 10695.1. Samples: 410018816. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:50:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:50:10,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 1640038400. Throughput: 0: 10581.3. Samples: 410087424. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:50:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:50:11,456][1653645] Updated weights for policy 0, policy_version 800823 (0.0024) [2024-06-15 21:50:14,146][1653645] Updated weights for policy 0, policy_version 800896 (0.0082) [2024-06-15 21:50:15,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 42598.8, 300 sec: 43653.6). Total num frames: 1640366080. Throughput: 0: 10581.3. Samples: 410144256. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:50:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:50:16,528][1653645] Updated weights for policy 0, policy_version 800966 (0.0013) [2024-06-15 21:50:20,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 43431.6). Total num frames: 1640497152. Throughput: 0: 10433.4. Samples: 410175488. 
Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:50:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:50:23,555][1653645] Updated weights for policy 0, policy_version 801042 (0.0012) [2024-06-15 21:50:24,498][1653645] Updated weights for policy 0, policy_version 801088 (0.0012) [2024-06-15 21:50:25,958][1648982] Fps is (10 sec: 32767.5, 60 sec: 42052.2, 300 sec: 43320.4). Total num frames: 1640693760. Throughput: 0: 10683.8. Samples: 410245632. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:50:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:50:27,685][1653645] Updated weights for policy 0, policy_version 801200 (0.0195) [2024-06-15 21:50:28,079][1653645] Updated weights for policy 0, policy_version 801213 (0.0010) [2024-06-15 21:50:29,982][1653645] Updated weights for policy 0, policy_version 801275 (0.0016) [2024-06-15 21:50:30,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1641021440. Throughput: 0: 10490.2. Samples: 410302464. Policy #0 lag: (min: 0.0, avg: 62.3, max: 256.0) [2024-06-15 21:50:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:50:35,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1641086976. Throughput: 0: 10615.5. Samples: 410343936. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:50:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:50:36,149][1653645] Updated weights for policy 0, policy_version 801333 (0.0013) [2024-06-15 21:50:38,816][1653645] Updated weights for policy 0, policy_version 801408 (0.0015) [2024-06-15 21:50:40,650][1651596] Signal inference workers to stop experience collection... (41600 times) [2024-06-15 21:50:40,675][1651596] Signal inference workers to resume experience collection... 
(41600 times) [2024-06-15 21:50:40,682][1653645] Updated weights for policy 0, policy_version 801472 (0.0013) [2024-06-15 21:50:40,705][1653645] InferenceWorker_p0-w0: stopping experience collection (41600 times) [2024-06-15 21:50:40,706][1653645] InferenceWorker_p0-w0: resuming experience collection (41600 times) [2024-06-15 21:50:40,960][1648982] Fps is (10 sec: 39322.0, 60 sec: 41506.3, 300 sec: 43542.6). Total num frames: 1641414656. Throughput: 0: 10786.1. Samples: 410401792. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:50:40,961][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:50:41,839][1653645] Updated weights for policy 0, policy_version 801520 (0.0014) [2024-06-15 21:50:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 43098.3). Total num frames: 1641545728. Throughput: 0: 10592.8. Samples: 410466816. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:50:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:50:47,391][1653645] Updated weights for policy 0, policy_version 801560 (0.0018) [2024-06-15 21:50:50,613][1653645] Updated weights for policy 0, policy_version 801619 (0.0012) [2024-06-15 21:50:50,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 42052.2, 300 sec: 43320.4). Total num frames: 1641742336. Throughput: 0: 10740.6. Samples: 410502144. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:50:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:50:52,889][1653645] Updated weights for policy 0, policy_version 801712 (0.0114) [2024-06-15 21:50:54,440][1653645] Updated weights for policy 0, policy_version 801776 (0.0112) [2024-06-15 21:50:55,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.6, 300 sec: 43320.4). Total num frames: 1642070016. Throughput: 0: 10467.5. Samples: 410558464. 
Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:50:55,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:50:55,987][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000801792_1642070016.pth... [2024-06-15 21:50:56,033][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000796672_1631584256.pth [2024-06-15 21:50:56,071][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000801792_1642070016.pth [2024-06-15 21:51:00,076][1653645] Updated weights for policy 0, policy_version 801824 (0.0013) [2024-06-15 21:51:00,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.8, 300 sec: 43098.2). Total num frames: 1642201088. Throughput: 0: 10740.6. Samples: 410627584. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:51:02,849][1653645] Updated weights for policy 0, policy_version 801875 (0.0013) [2024-06-15 21:51:04,210][1653645] Updated weights for policy 0, policy_version 801936 (0.0012) [2024-06-15 21:51:05,477][1653645] Updated weights for policy 0, policy_version 801986 (0.0012) [2024-06-15 21:51:05,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 1642496000. Throughput: 0: 10843.0. Samples: 410663424. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:05,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:51:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1642594304. Throughput: 0: 10615.5. Samples: 410723328. 
Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:10,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:51:11,652][1653645] Updated weights for policy 0, policy_version 802064 (0.0013) [2024-06-15 21:51:14,145][1653645] Updated weights for policy 0, policy_version 802114 (0.0012) [2024-06-15 21:51:15,359][1653645] Updated weights for policy 0, policy_version 802176 (0.0014) [2024-06-15 21:51:15,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 43764.7). Total num frames: 1642921984. Throughput: 0: 10854.4. Samples: 410790912. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:51:16,535][1653645] Updated weights for policy 0, policy_version 802236 (0.0013) [2024-06-15 21:51:18,338][1653645] Updated weights for policy 0, policy_version 802296 (0.0014) [2024-06-15 21:51:20,958][1648982] Fps is (10 sec: 52426.4, 60 sec: 43690.3, 300 sec: 43209.2). Total num frames: 1643118592. Throughput: 0: 10569.8. Samples: 410819584. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:20,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:51:23,208][1653645] Updated weights for policy 0, policy_version 802368 (0.0014) [2024-06-15 21:51:25,960][1648982] Fps is (10 sec: 32768.5, 60 sec: 42598.5, 300 sec: 43098.3). Total num frames: 1643249664. Throughput: 0: 10911.3. Samples: 410892800. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:25,961][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:51:26,915][1651596] Signal inference workers to stop experience collection... (41650 times) [2024-06-15 21:51:26,948][1653645] InferenceWorker_p0-w0: stopping experience collection (41650 times) [2024-06-15 21:51:27,127][1651596] Signal inference workers to resume experience collection... 
(41650 times) [2024-06-15 21:51:27,128][1653645] InferenceWorker_p0-w0: resuming experience collection (41650 times) [2024-06-15 21:51:27,911][1653645] Updated weights for policy 0, policy_version 802448 (0.0012) [2024-06-15 21:51:29,319][1653645] Updated weights for policy 0, policy_version 802503 (0.0012) [2024-06-15 21:51:30,472][1653645] Updated weights for policy 0, policy_version 802560 (0.0013) [2024-06-15 21:51:30,958][1648982] Fps is (10 sec: 52431.2, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1643642880. Throughput: 0: 10797.5. Samples: 410952704. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:51:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44782.9, 300 sec: 43098.2). Total num frames: 1643773952. Throughput: 0: 10922.7. Samples: 410993664. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:51:38,152][1653645] Updated weights for policy 0, policy_version 802628 (0.0013) [2024-06-15 21:51:39,575][1653645] Updated weights for policy 0, policy_version 802704 (0.0101) [2024-06-15 21:51:40,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1644036096. Throughput: 0: 11195.7. Samples: 411062272. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:51:41,342][1653645] Updated weights for policy 0, policy_version 802768 (0.0012) [2024-06-15 21:51:45,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 1644232704. Throughput: 0: 11173.0. Samples: 411130368. 
Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:51:46,275][1653645] Updated weights for policy 0, policy_version 802864 (0.0012) [2024-06-15 21:51:49,760][1653645] Updated weights for policy 0, policy_version 802896 (0.0014) [2024-06-15 21:51:50,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 44782.4, 300 sec: 43542.5). Total num frames: 1644429312. Throughput: 0: 11081.8. Samples: 411162112. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:50,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:51:51,700][1653645] Updated weights for policy 0, policy_version 802977 (0.0012) [2024-06-15 21:51:52,965][1653645] Updated weights for policy 0, policy_version 803028 (0.0013) [2024-06-15 21:51:54,058][1653645] Updated weights for policy 0, policy_version 803072 (0.0030) [2024-06-15 21:51:55,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1644691456. Throughput: 0: 11229.9. Samples: 411228672. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:51:55,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:51:57,660][1653645] Updated weights for policy 0, policy_version 803136 (0.0013) [2024-06-15 21:52:00,958][1648982] Fps is (10 sec: 39324.3, 60 sec: 43690.7, 300 sec: 43098.2). Total num frames: 1644822528. Throughput: 0: 11400.6. Samples: 411303936. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:52:00,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:52:02,534][1653645] Updated weights for policy 0, policy_version 803200 (0.0013) [2024-06-15 21:52:03,914][1653645] Updated weights for policy 0, policy_version 803255 (0.0016) [2024-06-15 21:52:05,310][1653645] Updated weights for policy 0, policy_version 803312 (0.0014) [2024-06-15 21:52:05,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1645215744. 
Throughput: 0: 11457.6. Samples: 411335168. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:52:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:52:07,560][1651596] Signal inference workers to stop experience collection... (41700 times) [2024-06-15 21:52:07,650][1653645] InferenceWorker_p0-w0: stopping experience collection (41700 times) [2024-06-15 21:52:07,866][1651596] Signal inference workers to resume experience collection... (41700 times) [2024-06-15 21:52:07,867][1653645] InferenceWorker_p0-w0: resuming experience collection (41700 times) [2024-06-15 21:52:08,824][1653645] Updated weights for policy 0, policy_version 803387 (0.0015) [2024-06-15 21:52:10,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 45875.2, 300 sec: 43431.5). Total num frames: 1645346816. Throughput: 0: 11286.8. Samples: 411400704. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:52:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:52:14,395][1653645] Updated weights for policy 0, policy_version 803443 (0.0120) [2024-06-15 21:52:15,720][1653645] Updated weights for policy 0, policy_version 803504 (0.0012) [2024-06-15 21:52:15,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44237.0, 300 sec: 43875.8). Total num frames: 1645576192. Throughput: 0: 11423.3. Samples: 411466752. Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:52:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 21:52:17,094][1653645] Updated weights for policy 0, policy_version 803553 (0.0011) [2024-06-15 21:52:19,980][1653645] Updated weights for policy 0, policy_version 803605 (0.0011) [2024-06-15 21:52:20,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45875.7, 300 sec: 43875.8). Total num frames: 1645871104. Throughput: 0: 11264.0. Samples: 411500544. 
Policy #0 lag: (min: 15.0, avg: 89.2, max: 271.0) [2024-06-15 21:52:20,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:52:24,758][1653645] Updated weights for policy 0, policy_version 803665 (0.0018) [2024-06-15 21:52:25,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 45875.2, 300 sec: 43542.6). Total num frames: 1646002176. Throughput: 0: 11457.5. Samples: 411577856. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:52:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:52:26,457][1653645] Updated weights for policy 0, policy_version 803732 (0.0012) [2024-06-15 21:52:28,794][1653645] Updated weights for policy 0, policy_version 803831 (0.0100) [2024-06-15 21:52:30,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1646264320. Throughput: 0: 11150.2. Samples: 411632128. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:52:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:52:32,147][1653645] Updated weights for policy 0, policy_version 803858 (0.0012) [2024-06-15 21:52:35,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.7, 300 sec: 43099.4). Total num frames: 1646395392. Throughput: 0: 11275.6. Samples: 411669504. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:52:35,958][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 21:52:36,519][1653645] Updated weights for policy 0, policy_version 803905 (0.0012) [2024-06-15 21:52:38,952][1653645] Updated weights for policy 0, policy_version 804003 (0.0014) [2024-06-15 21:52:40,499][1653645] Updated weights for policy 0, policy_version 804065 (0.0014) [2024-06-15 21:52:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 45329.3, 300 sec: 44209.0). Total num frames: 1646755840. Throughput: 0: 11195.7. Samples: 411732480. 
Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:52:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:52:43,456][1653645] Updated weights for policy 0, policy_version 804116 (0.0015) [2024-06-15 21:52:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44783.0, 300 sec: 43542.6). Total num frames: 1646919680. Throughput: 0: 11195.7. Samples: 411807744. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:52:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:52:47,861][1653645] Updated weights for policy 0, policy_version 804161 (0.0013) [2024-06-15 21:52:50,005][1653645] Updated weights for policy 0, policy_version 804246 (0.0014) [2024-06-15 21:52:50,303][1651596] Signal inference workers to stop experience collection... (41750 times) [2024-06-15 21:52:50,361][1653645] InferenceWorker_p0-w0: stopping experience collection (41750 times) [2024-06-15 21:52:50,527][1651596] Signal inference workers to resume experience collection... (41750 times) [2024-06-15 21:52:50,528][1653645] InferenceWorker_p0-w0: resuming experience collection (41750 times) [2024-06-15 21:52:50,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 45875.6, 300 sec: 43986.8). Total num frames: 1647181824. Throughput: 0: 11355.0. Samples: 411846144. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:52:50,959][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 21:52:51,669][1653645] Updated weights for policy 0, policy_version 804320 (0.0036) [2024-06-15 21:52:55,516][1653645] Updated weights for policy 0, policy_version 804386 (0.0017) [2024-06-15 21:52:55,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 45875.1, 300 sec: 43986.9). Total num frames: 1647443968. Throughput: 0: 11138.8. Samples: 411901952. 
Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:52:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:52:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000804416_1647443968.pth... [2024-06-15 21:52:56,004][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000799232_1636827136.pth [2024-06-15 21:53:00,408][1653645] Updated weights for policy 0, policy_version 804448 (0.0019) [2024-06-15 21:53:00,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 45329.1, 300 sec: 43431.5). Total num frames: 1647542272. Throughput: 0: 11457.4. Samples: 411982336. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:53:02,154][1653645] Updated weights for policy 0, policy_version 804516 (0.0015) [2024-06-15 21:53:04,382][1653645] Updated weights for policy 0, policy_version 804606 (0.0013) [2024-06-15 21:53:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1647837184. Throughput: 0: 11241.2. Samples: 412006400. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:53:08,239][1653645] Updated weights for policy 0, policy_version 804665 (0.0010) [2024-06-15 21:53:10,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.8, 300 sec: 43320.5). Total num frames: 1647968256. Throughput: 0: 11059.2. Samples: 412075520. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:10,961][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:53:12,926][1653645] Updated weights for policy 0, policy_version 804736 (0.0227) [2024-06-15 21:53:14,943][1653645] Updated weights for policy 0, policy_version 804816 (0.0039) [2024-06-15 21:53:15,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45875.1, 300 sec: 44320.1). Total num frames: 1648328704. Throughput: 0: 11070.6. 
Samples: 412130304. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:53:19,313][1653645] Updated weights for policy 0, policy_version 804866 (0.0018) [2024-06-15 21:53:20,661][1653645] Updated weights for policy 0, policy_version 804928 (0.0034) [2024-06-15 21:53:20,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1648492544. Throughput: 0: 11059.2. Samples: 412167168. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:53:25,618][1653645] Updated weights for policy 0, policy_version 805024 (0.0274) [2024-06-15 21:53:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 1648689152. Throughput: 0: 11264.0. Samples: 412239360. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:53:27,773][1653645] Updated weights for policy 0, policy_version 805111 (0.0043) [2024-06-15 21:53:30,959][1648982] Fps is (10 sec: 39316.9, 60 sec: 43689.7, 300 sec: 44208.9). Total num frames: 1648885760. Throughput: 0: 10831.3. Samples: 412295168. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:30,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:53:32,184][1653645] Updated weights for policy 0, policy_version 805153 (0.0016) [2024-06-15 21:53:35,580][1651596] Signal inference workers to stop experience collection... (41800 times) [2024-06-15 21:53:35,612][1653645] InferenceWorker_p0-w0: stopping experience collection (41800 times) [2024-06-15 21:53:35,860][1651596] Signal inference workers to resume experience collection... 
(41800 times) [2024-06-15 21:53:35,860][1653645] InferenceWorker_p0-w0: resuming experience collection (41800 times) [2024-06-15 21:53:35,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 1649049600. Throughput: 0: 10865.8. Samples: 412335104. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:53:36,370][1653645] Updated weights for policy 0, policy_version 805217 (0.0018) [2024-06-15 21:53:38,863][1653645] Updated weights for policy 0, policy_version 805319 (0.0106) [2024-06-15 21:53:40,207][1653645] Updated weights for policy 0, policy_version 805376 (0.0012) [2024-06-15 21:53:40,958][1648982] Fps is (10 sec: 52434.7, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1649410048. Throughput: 0: 11025.1. Samples: 412398080. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:53:43,997][1653645] Updated weights for policy 0, policy_version 805436 (0.0101) [2024-06-15 21:53:45,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1649541120. Throughput: 0: 10922.7. Samples: 412473856. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:53:48,420][1653645] Updated weights for policy 0, policy_version 805506 (0.0040) [2024-06-15 21:53:50,510][1653645] Updated weights for policy 0, policy_version 805584 (0.0029) [2024-06-15 21:53:50,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44237.0, 300 sec: 44098.0). Total num frames: 1649836032. Throughput: 0: 11138.9. Samples: 412507648. 
Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:53:51,633][1653645] Updated weights for policy 0, policy_version 805632 (0.0019) [2024-06-15 21:53:55,571][1653645] Updated weights for policy 0, policy_version 805694 (0.0014) [2024-06-15 21:53:55,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1650065408. Throughput: 0: 10968.2. Samples: 412569088. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:53:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:53:59,362][1653645] Updated weights for policy 0, policy_version 805756 (0.0093) [2024-06-15 21:54:00,958][1648982] Fps is (10 sec: 42596.6, 60 sec: 45328.7, 300 sec: 43764.7). Total num frames: 1650262016. Throughput: 0: 11366.3. Samples: 412641792. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:54:00,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:54:01,847][1653645] Updated weights for policy 0, policy_version 805826 (0.0013) [2024-06-15 21:54:05,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.6, 300 sec: 43986.8). Total num frames: 1650458624. Throughput: 0: 11104.7. Samples: 412666880. Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:54:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:54:05,992][1653645] Updated weights for policy 0, policy_version 805892 (0.0021) [2024-06-15 21:54:06,997][1653645] Updated weights for policy 0, policy_version 805946 (0.0020) [2024-06-15 21:54:10,962][1648982] Fps is (10 sec: 42580.9, 60 sec: 45325.6, 300 sec: 43653.1). Total num frames: 1650688000. Throughput: 0: 11251.5. Samples: 412745728. 
Policy #0 lag: (min: 5.0, avg: 77.3, max: 261.0) [2024-06-15 21:54:10,963][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:54:11,016][1653645] Updated weights for policy 0, policy_version 806002 (0.0014) [2024-06-15 21:54:12,113][1653645] Updated weights for policy 0, policy_version 806036 (0.0012) [2024-06-15 21:54:14,157][1651596] Signal inference workers to stop experience collection... (41850 times) [2024-06-15 21:54:14,179][1653645] InferenceWorker_p0-w0: stopping experience collection (41850 times) [2024-06-15 21:54:14,194][1653645] Updated weights for policy 0, policy_version 806114 (0.0013) [2024-06-15 21:54:14,402][1651596] Signal inference workers to resume experience collection... (41850 times) [2024-06-15 21:54:14,403][1653645] InferenceWorker_p0-w0: resuming experience collection (41850 times) [2024-06-15 21:54:15,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1650982912. Throughput: 0: 11241.5. Samples: 412801024. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:15,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 21:54:18,666][1653645] Updated weights for policy 0, policy_version 806178 (0.0013) [2024-06-15 21:54:20,958][1648982] Fps is (10 sec: 42617.2, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1651113984. Throughput: 0: 11127.4. Samples: 412835840. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:20,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:54:22,846][1653645] Updated weights for policy 0, policy_version 806242 (0.0014) [2024-06-15 21:54:25,403][1653645] Updated weights for policy 0, policy_version 806322 (0.0015) [2024-06-15 21:54:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1651376128. Throughput: 0: 11184.4. Samples: 412901376. 
Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:54:27,006][1653645] Updated weights for policy 0, policy_version 806392 (0.0011) [2024-06-15 21:54:30,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43691.5, 300 sec: 43986.9). Total num frames: 1651507200. Throughput: 0: 10888.5. Samples: 412963840. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:54:31,286][1653645] Updated weights for policy 0, policy_version 806422 (0.0012) [2024-06-15 21:54:34,504][1653645] Updated weights for policy 0, policy_version 806485 (0.0014) [2024-06-15 21:54:35,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 45329.0, 300 sec: 43542.6). Total num frames: 1651769344. Throughput: 0: 10911.3. Samples: 412998656. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:35,961][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:54:36,156][1653645] Updated weights for policy 0, policy_version 806532 (0.0022) [2024-06-15 21:54:37,646][1653645] Updated weights for policy 0, policy_version 806592 (0.0012) [2024-06-15 21:54:38,881][1653645] Updated weights for policy 0, policy_version 806644 (0.0012) [2024-06-15 21:54:40,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 1652031488. Throughput: 0: 10899.9. Samples: 413059584. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:40,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 21:54:43,801][1653645] Updated weights for policy 0, policy_version 806710 (0.0101) [2024-06-15 21:54:45,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44782.9, 300 sec: 44097.9). Total num frames: 1652228096. Throughput: 0: 10968.3. Samples: 413135360. 
Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:45,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 21:54:46,075][1653645] Updated weights for policy 0, policy_version 806758 (0.0022) [2024-06-15 21:54:47,925][1653645] Updated weights for policy 0, policy_version 806802 (0.0014) [2024-06-15 21:54:49,489][1653645] Updated weights for policy 0, policy_version 806864 (0.0013) [2024-06-15 21:54:50,405][1653645] Updated weights for policy 0, policy_version 806912 (0.0017) [2024-06-15 21:54:50,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45329.0, 300 sec: 44431.2). Total num frames: 1652555776. Throughput: 0: 11127.5. Samples: 413167616. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:54:55,692][1653645] Updated weights for policy 0, policy_version 806972 (0.0015) [2024-06-15 21:54:55,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1652686848. Throughput: 0: 10992.0. Samples: 413240320. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:54:55,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 21:54:55,968][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000806976_1652686848.pth... [2024-06-15 21:54:56,018][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000801792_1642070016.pth [2024-06-15 21:54:57,658][1653645] Updated weights for policy 0, policy_version 807024 (0.0018) [2024-06-15 21:54:58,972][1653645] Updated weights for policy 0, policy_version 807056 (0.0013) [2024-06-15 21:55:00,044][1653645] Updated weights for policy 0, policy_version 807104 (0.0012) [2024-06-15 21:55:00,960][1648982] Fps is (10 sec: 39312.9, 60 sec: 44781.6, 300 sec: 43986.5). Total num frames: 1652948992. Throughput: 0: 11161.1. Samples: 413303296. 
Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:00,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:55:01,452][1651596] Signal inference workers to stop experience collection... (41900 times) [2024-06-15 21:55:01,582][1653645] InferenceWorker_p0-w0: stopping experience collection (41900 times) [2024-06-15 21:55:01,642][1651596] Signal inference workers to resume experience collection... (41900 times) [2024-06-15 21:55:01,643][1653645] InferenceWorker_p0-w0: resuming experience collection (41900 times) [2024-06-15 21:55:01,871][1653645] Updated weights for policy 0, policy_version 807165 (0.0013) [2024-06-15 21:55:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1653145600. Throughput: 0: 11184.4. Samples: 413339136. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:05,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 21:55:08,276][1653645] Updated weights for policy 0, policy_version 807233 (0.0014) [2024-06-15 21:55:09,793][1653645] Updated weights for policy 0, policy_version 807296 (0.0014) [2024-06-15 21:55:10,958][1648982] Fps is (10 sec: 49161.3, 60 sec: 45878.4, 300 sec: 44320.1). Total num frames: 1653440512. Throughput: 0: 11104.6. Samples: 413401088. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:10,959][1648982] Avg episode reward: [(0, '36.910')] [2024-06-15 21:55:11,215][1653645] Updated weights for policy 0, policy_version 807358 (0.0013) [2024-06-15 21:55:14,018][1653645] Updated weights for policy 0, policy_version 807414 (0.0043) [2024-06-15 21:55:15,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1653604352. Throughput: 0: 11320.9. Samples: 413473280. 
Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:15,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 21:55:18,525][1653645] Updated weights for policy 0, policy_version 807479 (0.0014) [2024-06-15 21:55:20,820][1653645] Updated weights for policy 0, policy_version 807536 (0.0015) [2024-06-15 21:55:20,958][1648982] Fps is (10 sec: 39323.0, 60 sec: 45329.2, 300 sec: 44542.3). Total num frames: 1653833728. Throughput: 0: 11275.4. Samples: 413506048. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:55:22,473][1653645] Updated weights for policy 0, policy_version 807584 (0.0021) [2024-06-15 21:55:24,296][1653645] Updated weights for policy 0, policy_version 807634 (0.0021) [2024-06-15 21:55:25,957][1648982] Fps is (10 sec: 52430.1, 60 sec: 45875.4, 300 sec: 44431.2). Total num frames: 1654128640. Throughput: 0: 11582.6. Samples: 413580800. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:25,958][1648982] Avg episode reward: [(0, '36.720')] [2024-06-15 21:55:27,952][1653645] Updated weights for policy 0, policy_version 807681 (0.0014) [2024-06-15 21:55:30,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 45875.2, 300 sec: 44653.3). Total num frames: 1654259712. Throughput: 0: 11366.4. Samples: 413646848. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:30,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:55:31,774][1653645] Updated weights for policy 0, policy_version 807781 (0.0103) [2024-06-15 21:55:33,641][1653645] Updated weights for policy 0, policy_version 807813 (0.0013) [2024-06-15 21:55:34,695][1653645] Updated weights for policy 0, policy_version 807872 (0.0014) [2024-06-15 21:55:35,958][1648982] Fps is (10 sec: 45874.3, 60 sec: 46967.6, 300 sec: 44653.4). Total num frames: 1654587392. Throughput: 0: 11366.4. Samples: 413679104. 
Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:55:39,985][1653645] Updated weights for policy 0, policy_version 807942 (0.0012) [2024-06-15 21:55:40,970][1648982] Fps is (10 sec: 49152.1, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1654751232. Throughput: 0: 11366.4. Samples: 413751808. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:40,976][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:55:41,104][1653645] Updated weights for policy 0, policy_version 807993 (0.0013) [2024-06-15 21:55:43,132][1653645] Updated weights for policy 0, policy_version 808048 (0.0012) [2024-06-15 21:55:45,352][1653645] Updated weights for policy 0, policy_version 808104 (0.0014) [2024-06-15 21:55:45,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 46967.3, 300 sec: 45097.6). Total num frames: 1655046144. Throughput: 0: 11435.2. Samples: 413817856. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:55:46,949][1653645] Updated weights for policy 0, policy_version 808147 (0.0014) [2024-06-15 21:55:50,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1655177216. Throughput: 0: 11355.0. Samples: 413850112. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:50,959][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 21:55:51,575][1651596] Signal inference workers to stop experience collection... (41950 times) [2024-06-15 21:55:51,629][1653645] Updated weights for policy 0, policy_version 808213 (0.0023) [2024-06-15 21:55:51,644][1653645] InferenceWorker_p0-w0: stopping experience collection (41950 times) [2024-06-15 21:55:51,850][1651596] Signal inference workers to resume experience collection... 
(41950 times) [2024-06-15 21:55:51,851][1653645] InferenceWorker_p0-w0: resuming experience collection (41950 times) [2024-06-15 21:55:54,047][1653645] Updated weights for policy 0, policy_version 808276 (0.0013) [2024-06-15 21:55:54,708][1653645] Updated weights for policy 0, policy_version 808313 (0.0011) [2024-06-15 21:55:55,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1655439360. Throughput: 0: 11571.3. Samples: 413921792. Policy #0 lag: (min: 18.0, avg: 127.6, max: 271.0) [2024-06-15 21:55:55,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 21:55:57,253][1653645] Updated weights for policy 0, policy_version 808374 (0.0013) [2024-06-15 21:55:59,193][1653645] Updated weights for policy 0, policy_version 808418 (0.0013) [2024-06-15 21:56:00,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 45876.9, 300 sec: 44764.4). Total num frames: 1655701504. Throughput: 0: 11400.5. Samples: 413986304. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:00,965][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:56:02,762][1653645] Updated weights for policy 0, policy_version 808464 (0.0013) [2024-06-15 21:56:03,558][1653645] Updated weights for policy 0, policy_version 808504 (0.0014) [2024-06-15 21:56:05,251][1653645] Updated weights for policy 0, policy_version 808574 (0.0051) [2024-06-15 21:56:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 46967.6, 300 sec: 45319.8). Total num frames: 1655963648. Throughput: 0: 11639.5. Samples: 414029824. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:56:09,171][1653645] Updated weights for policy 0, policy_version 808633 (0.0021) [2024-06-15 21:56:10,721][1653645] Updated weights for policy 0, policy_version 808674 (0.0011) [2024-06-15 21:56:10,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45329.3, 300 sec: 44875.5). Total num frames: 1656160256. 
Throughput: 0: 11446.0. Samples: 414095872. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:10,963][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:56:13,854][1653645] Updated weights for policy 0, policy_version 808721 (0.0012) [2024-06-15 21:56:15,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 46421.4, 300 sec: 44986.7). Total num frames: 1656389632. Throughput: 0: 11503.0. Samples: 414164480. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 21:56:16,590][1653645] Updated weights for policy 0, policy_version 808821 (0.0014) [2024-06-15 21:56:20,535][1653645] Updated weights for policy 0, policy_version 808865 (0.0012) [2024-06-15 21:56:20,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1656586240. Throughput: 0: 11502.9. Samples: 414196736. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:56:21,802][1653645] Updated weights for policy 0, policy_version 808912 (0.0011) [2024-06-15 21:56:22,808][1653645] Updated weights for policy 0, policy_version 808957 (0.0011) [2024-06-15 21:56:25,970][1648982] Fps is (10 sec: 42546.1, 60 sec: 44773.6, 300 sec: 44651.5). Total num frames: 1656815616. Throughput: 0: 11420.2. Samples: 414265856. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:25,971][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:56:27,247][1653645] Updated weights for policy 0, policy_version 809026 (0.0012) [2024-06-15 21:56:28,509][1653645] Updated weights for policy 0, policy_version 809088 (0.0013) [2024-06-15 21:56:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1657012224. Throughput: 0: 11377.8. Samples: 414329856. 
Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:56:33,383][1653645] Updated weights for policy 0, policy_version 809148 (0.0030) [2024-06-15 21:56:35,120][1653645] Updated weights for policy 0, policy_version 809204 (0.0014) [2024-06-15 21:56:35,958][1648982] Fps is (10 sec: 45931.8, 60 sec: 44782.9, 300 sec: 44875.6). Total num frames: 1657274368. Throughput: 0: 11355.1. Samples: 414361088. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:35,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 21:56:38,065][1653645] Updated weights for policy 0, policy_version 809237 (0.0033) [2024-06-15 21:56:38,811][1653645] Updated weights for policy 0, policy_version 809279 (0.0013) [2024-06-15 21:56:39,095][1651596] Signal inference workers to stop experience collection... (42000 times) [2024-06-15 21:56:39,145][1653645] InferenceWorker_p0-w0: stopping experience collection (42000 times) [2024-06-15 21:56:39,258][1651596] Signal inference workers to resume experience collection... (42000 times) [2024-06-15 21:56:39,259][1653645] InferenceWorker_p0-w0: resuming experience collection (42000 times) [2024-06-15 21:56:40,216][1653645] Updated weights for policy 0, policy_version 809343 (0.0014) [2024-06-15 21:56:40,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 46421.3, 300 sec: 45097.7). Total num frames: 1657536512. Throughput: 0: 11275.4. Samples: 414429184. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 21:56:44,531][1653645] Updated weights for policy 0, policy_version 809398 (0.0013) [2024-06-15 21:56:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.8, 300 sec: 44875.6). Total num frames: 1657667584. Throughput: 0: 11332.3. Samples: 414496256. 
Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:45,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 21:56:46,395][1653645] Updated weights for policy 0, policy_version 809440 (0.0017) [2024-06-15 21:56:47,073][1653645] Updated weights for policy 0, policy_version 809472 (0.0028) [2024-06-15 21:56:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 1657929728. Throughput: 0: 11116.1. Samples: 414530048. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:56:51,165][1653645] Updated weights for policy 0, policy_version 809552 (0.0016) [2024-06-15 21:56:55,656][1653645] Updated weights for policy 0, policy_version 809616 (0.0013) [2024-06-15 21:56:55,958][1648982] Fps is (10 sec: 42597.1, 60 sec: 44236.6, 300 sec: 44986.5). Total num frames: 1658093568. Throughput: 0: 11025.0. Samples: 414592000. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:56:55,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:56:56,590][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000809648_1658159104.pth... [2024-06-15 21:56:56,633][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000804416_1647443968.pth [2024-06-15 21:56:56,793][1653645] Updated weights for policy 0, policy_version 809653 (0.0019) [2024-06-15 21:56:59,118][1653645] Updated weights for policy 0, policy_version 809712 (0.0013) [2024-06-15 21:57:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1658322944. Throughput: 0: 11138.8. Samples: 414665728. 
Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:57:02,027][1653645] Updated weights for policy 0, policy_version 809776 (0.0012) [2024-06-15 21:57:05,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.4, 300 sec: 44875.5). Total num frames: 1658585088. Throughput: 0: 10899.8. Samples: 414687232. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:57:07,614][1653645] Updated weights for policy 0, policy_version 809888 (0.0095) [2024-06-15 21:57:10,394][1653645] Updated weights for policy 0, policy_version 809955 (0.0011) [2024-06-15 21:57:10,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44782.9, 300 sec: 44986.5). Total num frames: 1658847232. Throughput: 0: 11073.6. Samples: 414764032. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:10,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 21:57:13,334][1653645] Updated weights for policy 0, policy_version 810016 (0.0012) [2024-06-15 21:57:15,498][1653645] Updated weights for policy 0, policy_version 810111 (0.0014) [2024-06-15 21:57:15,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 1659109376. Throughput: 0: 11013.7. Samples: 414825472. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:57:19,632][1653645] Updated weights for policy 0, policy_version 810168 (0.0013) [2024-06-15 21:57:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.7, 300 sec: 44875.5). Total num frames: 1659240448. Throughput: 0: 11218.5. Samples: 414865920. 
Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:20,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 21:57:22,191][1653645] Updated weights for policy 0, policy_version 810230 (0.0014) [2024-06-15 21:57:25,117][1653645] Updated weights for policy 0, policy_version 810272 (0.0012) [2024-06-15 21:57:25,239][1651596] Signal inference workers to stop experience collection... (42050 times) [2024-06-15 21:57:25,294][1653645] InferenceWorker_p0-w0: stopping experience collection (42050 times) [2024-06-15 21:57:25,487][1651596] Signal inference workers to resume experience collection... (42050 times) [2024-06-15 21:57:25,488][1653645] InferenceWorker_p0-w0: resuming experience collection (42050 times) [2024-06-15 21:57:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44792.1, 300 sec: 44875.5). Total num frames: 1659502592. Throughput: 0: 11286.8. Samples: 414937088. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:57:26,220][1653645] Updated weights for policy 0, policy_version 810320 (0.0013) [2024-06-15 21:57:30,760][1653645] Updated weights for policy 0, policy_version 810370 (0.0014) [2024-06-15 21:57:30,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1659633664. Throughput: 0: 11241.3. Samples: 415002112. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:57:32,173][1653645] Updated weights for policy 0, policy_version 810431 (0.0013) [2024-06-15 21:57:34,061][1653645] Updated weights for policy 0, policy_version 810495 (0.0014) [2024-06-15 21:57:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1659895808. Throughput: 0: 11082.0. Samples: 415028736. 
Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:57:38,760][1653645] Updated weights for policy 0, policy_version 810597 (0.0014) [2024-06-15 21:57:40,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1660157952. Throughput: 0: 11059.3. Samples: 415089664. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:57:44,004][1653645] Updated weights for policy 0, policy_version 810665 (0.0037) [2024-06-15 21:57:44,928][1653645] Updated weights for policy 0, policy_version 810704 (0.0013) [2024-06-15 21:57:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1660420096. Throughput: 0: 11002.3. Samples: 415160832. Policy #0 lag: (min: 32.0, avg: 170.2, max: 288.0) [2024-06-15 21:57:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:57:48,083][1653645] Updated weights for policy 0, policy_version 810754 (0.0013) [2024-06-15 21:57:49,398][1653645] Updated weights for policy 0, policy_version 810818 (0.0126) [2024-06-15 21:57:50,619][1653645] Updated weights for policy 0, policy_version 810879 (0.0147) [2024-06-15 21:57:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1660682240. Throughput: 0: 11275.4. Samples: 415194624. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:57:50,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:57:55,958][1648982] Fps is (10 sec: 36043.3, 60 sec: 44782.8, 300 sec: 44875.4). Total num frames: 1660780544. Throughput: 0: 11320.8. Samples: 415273472. 
Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:57:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 21:57:55,959][1653645] Updated weights for policy 0, policy_version 810944 (0.0014) [2024-06-15 21:58:00,509][1653645] Updated weights for policy 0, policy_version 811056 (0.0117) [2024-06-15 21:58:00,970][1648982] Fps is (10 sec: 39275.3, 60 sec: 45866.2, 300 sec: 44873.7). Total num frames: 1661075456. Throughput: 0: 11204.2. Samples: 415329792. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:00,971][1648982] Avg episode reward: [(0, '37.190')] [2024-06-15 21:58:01,656][1653645] Updated weights for policy 0, policy_version 811107 (0.0012) [2024-06-15 21:58:05,958][1648982] Fps is (10 sec: 42599.7, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 1661206528. Throughput: 0: 11059.2. Samples: 415363584. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:58:06,852][1653645] Updated weights for policy 0, policy_version 811169 (0.0012) [2024-06-15 21:58:07,882][1653645] Updated weights for policy 0, policy_version 811216 (0.0012) [2024-06-15 21:58:08,358][1651596] Signal inference workers to stop experience collection... (42100 times) [2024-06-15 21:58:08,395][1653645] InferenceWorker_p0-w0: stopping experience collection (42100 times) [2024-06-15 21:58:08,549][1651596] Signal inference workers to resume experience collection... (42100 times) [2024-06-15 21:58:08,551][1653645] InferenceWorker_p0-w0: resuming experience collection (42100 times) [2024-06-15 21:58:10,958][1648982] Fps is (10 sec: 39368.2, 60 sec: 43690.7, 300 sec: 44542.3). Total num frames: 1661468672. Throughput: 0: 11070.6. Samples: 415435264. 
Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:58:11,992][1653645] Updated weights for policy 0, policy_version 811296 (0.0012) [2024-06-15 21:58:13,696][1653645] Updated weights for policy 0, policy_version 811364 (0.0013) [2024-06-15 21:58:14,335][1653645] Updated weights for policy 0, policy_version 811392 (0.0011) [2024-06-15 21:58:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1661730816. Throughput: 0: 11138.8. Samples: 415503360. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:15,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:58:19,152][1653645] Updated weights for policy 0, policy_version 811459 (0.0251) [2024-06-15 21:58:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 45875.2, 300 sec: 45097.7). Total num frames: 1661992960. Throughput: 0: 11480.2. Samples: 415545344. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:58:22,888][1653645] Updated weights for policy 0, policy_version 811536 (0.0096) [2024-06-15 21:58:24,867][1653645] Updated weights for policy 0, policy_version 811620 (0.0013) [2024-06-15 21:58:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.1, 300 sec: 45320.0). Total num frames: 1662255104. Throughput: 0: 11411.9. Samples: 415603200. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 21:58:30,220][1653645] Updated weights for policy 0, policy_version 811696 (0.0012) [2024-06-15 21:58:30,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 45875.0, 300 sec: 45208.7). Total num frames: 1662386176. Throughput: 0: 11423.2. Samples: 415674880. 
Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:30,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:58:31,726][1653645] Updated weights for policy 0, policy_version 811746 (0.0010) [2024-06-15 21:58:34,120][1653645] Updated weights for policy 0, policy_version 811798 (0.0018) [2024-06-15 21:58:35,509][1653645] Updated weights for policy 0, policy_version 811856 (0.0049) [2024-06-15 21:58:35,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1662713856. Throughput: 0: 11411.9. Samples: 415708160. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 21:58:40,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1662779392. Throughput: 0: 11104.8. Samples: 415773184. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:40,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 21:58:41,198][1653645] Updated weights for policy 0, policy_version 811920 (0.0014) [2024-06-15 21:58:43,207][1653645] Updated weights for policy 0, policy_version 811984 (0.0014) [2024-06-15 21:58:44,085][1653645] Updated weights for policy 0, policy_version 812032 (0.0015) [2024-06-15 21:58:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 1663107072. Throughput: 0: 11437.7. Samples: 415844352. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:45,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 21:58:46,464][1653645] Updated weights for policy 0, policy_version 812096 (0.0012) [2024-06-15 21:58:48,847][1653645] Updated weights for policy 0, policy_version 812159 (0.0014) [2024-06-15 21:58:50,957][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.8, 300 sec: 44875.5). Total num frames: 1663303680. Throughput: 0: 11332.3. Samples: 415873536. 
Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 21:58:53,485][1653645] Updated weights for policy 0, policy_version 812214 (0.0014) [2024-06-15 21:58:54,061][1651596] Signal inference workers to stop experience collection... (42150 times) [2024-06-15 21:58:54,094][1653645] InferenceWorker_p0-w0: stopping experience collection (42150 times) [2024-06-15 21:58:54,271][1651596] Signal inference workers to resume experience collection... (42150 times) [2024-06-15 21:58:54,274][1653645] InferenceWorker_p0-w0: resuming experience collection (42150 times) [2024-06-15 21:58:55,132][1653645] Updated weights for policy 0, policy_version 812288 (0.0044) [2024-06-15 21:58:55,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 46421.4, 300 sec: 45097.7). Total num frames: 1663565824. Throughput: 0: 11343.6. Samples: 415945728. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:58:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 21:58:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000812288_1663565824.pth... [2024-06-15 21:58:56,065][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000806976_1652686848.pth [2024-06-15 21:58:58,381][1653645] Updated weights for policy 0, policy_version 812352 (0.0014) [2024-06-15 21:59:00,439][1653645] Updated weights for policy 0, policy_version 812414 (0.0014) [2024-06-15 21:59:00,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45884.3, 300 sec: 45319.8). Total num frames: 1663827968. Throughput: 0: 11127.5. Samples: 416004096. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:59:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:59:04,997][1653645] Updated weights for policy 0, policy_version 812477 (0.0013) [2024-06-15 21:59:05,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 45875.3, 300 sec: 44987.3). 
Total num frames: 1663959040. Throughput: 0: 11093.3. Samples: 416044544. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:59:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 21:59:06,820][1653645] Updated weights for policy 0, policy_version 812517 (0.0015) [2024-06-15 21:59:08,512][1653645] Updated weights for policy 0, policy_version 812546 (0.0014) [2024-06-15 21:59:10,003][1653645] Updated weights for policy 0, policy_version 812605 (0.0012) [2024-06-15 21:59:10,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1664221184. Throughput: 0: 11298.2. Samples: 416111616. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:59:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:59:11,767][1653645] Updated weights for policy 0, policy_version 812656 (0.0024) [2024-06-15 21:59:15,990][1648982] Fps is (10 sec: 42459.8, 60 sec: 44212.9, 300 sec: 44981.6). Total num frames: 1664385024. Throughput: 0: 11312.7. Samples: 416184320. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:59:15,991][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 21:59:16,364][1653645] Updated weights for policy 0, policy_version 812720 (0.0013) [2024-06-15 21:59:19,124][1653645] Updated weights for policy 0, policy_version 812789 (0.0013) [2024-06-15 21:59:20,829][1653645] Updated weights for policy 0, policy_version 812836 (0.0024) [2024-06-15 21:59:20,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 44783.0, 300 sec: 45097.7). Total num frames: 1664679936. Throughput: 0: 11298.1. Samples: 416216576. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:59:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 21:59:23,552][1653645] Updated weights for policy 0, policy_version 812917 (0.0094) [2024-06-15 21:59:25,958][1648982] Fps is (10 sec: 49312.8, 60 sec: 43690.8, 300 sec: 45319.8). Total num frames: 1664876544. Throughput: 0: 11218.5. 
Samples: 416278016. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:59:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 21:59:27,861][1653645] Updated weights for policy 0, policy_version 812976 (0.0013) [2024-06-15 21:59:30,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 44237.1, 300 sec: 44986.6). Total num frames: 1665040384. Throughput: 0: 11298.2. Samples: 416352768. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:59:30,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 21:59:31,539][1653645] Updated weights for policy 0, policy_version 813041 (0.0101) [2024-06-15 21:59:32,759][1653645] Updated weights for policy 0, policy_version 813090 (0.0014) [2024-06-15 21:59:34,082][1653645] Updated weights for policy 0, policy_version 813121 (0.0012) [2024-06-15 21:59:35,307][1653645] Updated weights for policy 0, policy_version 813176 (0.0020) [2024-06-15 21:59:35,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1665400832. Throughput: 0: 11207.1. Samples: 416377856. Policy #0 lag: (min: 15.0, avg: 117.1, max: 271.0) [2024-06-15 21:59:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 21:59:40,040][1653645] Updated weights for policy 0, policy_version 813247 (0.0076) [2024-06-15 21:59:40,971][1648982] Fps is (10 sec: 49084.2, 60 sec: 45864.8, 300 sec: 45095.6). Total num frames: 1665531904. Throughput: 0: 11169.7. Samples: 416448512. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 21:59:40,972][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:59:42,499][1651596] Signal inference workers to stop experience collection... (42200 times) [2024-06-15 21:59:42,534][1653645] InferenceWorker_p0-w0: stopping experience collection (42200 times) [2024-06-15 21:59:42,805][1651596] Signal inference workers to resume experience collection... 
(42200 times) [2024-06-15 21:59:42,806][1653645] InferenceWorker_p0-w0: resuming experience collection (42200 times) [2024-06-15 21:59:43,351][1653645] Updated weights for policy 0, policy_version 813298 (0.0013) [2024-06-15 21:59:44,606][1653645] Updated weights for policy 0, policy_version 813349 (0.0029) [2024-06-15 21:59:45,419][1653645] Updated weights for policy 0, policy_version 813378 (0.0011) [2024-06-15 21:59:45,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45329.0, 300 sec: 44986.6). Total num frames: 1665826816. Throughput: 0: 11355.0. Samples: 416515072. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 21:59:45,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 21:59:46,720][1653645] Updated weights for policy 0, policy_version 813437 (0.0012) [2024-06-15 21:59:50,958][1648982] Fps is (10 sec: 42656.6, 60 sec: 44236.7, 300 sec: 44986.6). Total num frames: 1665957888. Throughput: 0: 11320.9. Samples: 416553984. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 21:59:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 21:59:53,273][1653645] Updated weights for policy 0, policy_version 813507 (0.0017) [2024-06-15 21:59:54,500][1653645] Updated weights for policy 0, policy_version 813568 (0.0013) [2024-06-15 21:59:55,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45329.3, 300 sec: 45209.1). Total num frames: 1666285568. Throughput: 0: 11332.3. Samples: 416621568. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 21:59:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 21:59:56,075][1653645] Updated weights for policy 0, policy_version 813626 (0.0012) [2024-06-15 21:59:57,730][1653645] Updated weights for policy 0, policy_version 813690 (0.0012) [2024-06-15 22:00:00,958][1648982] Fps is (10 sec: 49150.9, 60 sec: 43690.4, 300 sec: 45097.6). Total num frames: 1666449408. Throughput: 0: 11215.2. Samples: 416688640. 
Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:00,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:00:03,662][1653645] Updated weights for policy 0, policy_version 813748 (0.0085) [2024-06-15 22:00:05,526][1653645] Updated weights for policy 0, policy_version 813808 (0.0015) [2024-06-15 22:00:05,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45329.1, 300 sec: 44875.5). Total num frames: 1666678784. Throughput: 0: 11286.7. Samples: 416724480. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:00:07,478][1653645] Updated weights for policy 0, policy_version 813888 (0.0014) [2024-06-15 22:00:10,317][1653645] Updated weights for policy 0, policy_version 813952 (0.0029) [2024-06-15 22:00:10,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 45875.0, 300 sec: 45319.8). Total num frames: 1666973696. Throughput: 0: 11298.1. Samples: 416786432. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:00:15,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45353.7, 300 sec: 44986.6). Total num frames: 1667104768. Throughput: 0: 11116.1. Samples: 416852992. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:00:16,856][1653645] Updated weights for policy 0, policy_version 814018 (0.0013) [2024-06-15 22:00:18,471][1653645] Updated weights for policy 0, policy_version 814096 (0.0048) [2024-06-15 22:00:20,958][1648982] Fps is (10 sec: 42599.5, 60 sec: 45329.0, 300 sec: 44986.5). Total num frames: 1667399680. Throughput: 0: 11241.2. Samples: 416883712. 
Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:00:21,118][1653645] Updated weights for policy 0, policy_version 814176 (0.0031) [2024-06-15 22:00:25,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1667497984. Throughput: 0: 11233.3. Samples: 416953856. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:00:26,643][1653645] Updated weights for policy 0, policy_version 814212 (0.0014) [2024-06-15 22:00:27,212][1651596] Signal inference workers to stop experience collection... (42250 times) [2024-06-15 22:00:27,338][1653645] InferenceWorker_p0-w0: stopping experience collection (42250 times) [2024-06-15 22:00:27,460][1651596] Signal inference workers to resume experience collection... (42250 times) [2024-06-15 22:00:27,461][1653645] InferenceWorker_p0-w0: resuming experience collection (42250 times) [2024-06-15 22:00:27,608][1653645] Updated weights for policy 0, policy_version 814260 (0.0048) [2024-06-15 22:00:29,129][1653645] Updated weights for policy 0, policy_version 814330 (0.0011) [2024-06-15 22:00:30,769][1653645] Updated weights for policy 0, policy_version 814391 (0.0013) [2024-06-15 22:00:30,958][1648982] Fps is (10 sec: 49150.2, 60 sec: 47513.2, 300 sec: 45097.6). Total num frames: 1667891200. Throughput: 0: 11218.4. Samples: 417019904. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:30,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:00:33,384][1653645] Updated weights for policy 0, policy_version 814458 (0.0013) [2024-06-15 22:00:35,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44986.6). Total num frames: 1668022272. Throughput: 0: 11025.1. Samples: 417050112. 
Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:00:39,703][1653645] Updated weights for policy 0, policy_version 814499 (0.0017) [2024-06-15 22:00:40,958][1648982] Fps is (10 sec: 29492.2, 60 sec: 44246.9, 300 sec: 44542.3). Total num frames: 1668186112. Throughput: 0: 11161.6. Samples: 417123840. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:00:41,460][1653645] Updated weights for policy 0, policy_version 814561 (0.0158) [2024-06-15 22:00:43,501][1653645] Updated weights for policy 0, policy_version 814647 (0.0022) [2024-06-15 22:00:45,316][1653645] Updated weights for policy 0, policy_version 814688 (0.0013) [2024-06-15 22:00:45,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45329.1, 300 sec: 45319.9). Total num frames: 1668546560. Throughput: 0: 10831.7. Samples: 417176064. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:00:50,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.6, 300 sec: 44431.2). Total num frames: 1668546560. Throughput: 0: 10888.5. Samples: 417214464. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:00:51,477][1653645] Updated weights for policy 0, policy_version 814736 (0.0015) [2024-06-15 22:00:52,875][1653645] Updated weights for policy 0, policy_version 814800 (0.0012) [2024-06-15 22:00:55,197][1653645] Updated weights for policy 0, policy_version 814885 (0.0012) [2024-06-15 22:00:55,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1668939776. Throughput: 0: 10900.0. Samples: 417276928. 
Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:00:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:00:55,969][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000814912_1668939776.pth... [2024-06-15 22:00:56,024][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000809648_1658159104.pth [2024-06-15 22:00:57,295][1653645] Updated weights for policy 0, policy_version 814928 (0.0011) [2024-06-15 22:00:58,239][1653645] Updated weights for policy 0, policy_version 814975 (0.0012) [2024-06-15 22:01:00,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1669070848. Throughput: 0: 10922.7. Samples: 417344512. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:01:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:01:05,848][1653645] Updated weights for policy 0, policy_version 815056 (0.0013) [2024-06-15 22:01:05,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 42598.4, 300 sec: 44320.1). Total num frames: 1669234688. Throughput: 0: 11047.8. Samples: 417380864. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:01:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:01:07,792][1653645] Updated weights for policy 0, policy_version 815136 (0.0017) [2024-06-15 22:01:07,885][1651596] Signal inference workers to stop experience collection... (42300 times) [2024-06-15 22:01:07,982][1653645] InferenceWorker_p0-w0: stopping experience collection (42300 times) [2024-06-15 22:01:08,148][1651596] Signal inference workers to resume experience collection... (42300 times) [2024-06-15 22:01:08,149][1653645] InferenceWorker_p0-w0: resuming experience collection (42300 times) [2024-06-15 22:01:10,378][1653645] Updated weights for policy 0, policy_version 815187 (0.0015) [2024-06-15 22:01:10,958][1648982] Fps is (10 sec: 45874.1, 60 sec: 42598.5, 300 sec: 44542.2). Total num frames: 1669529600. 
Throughput: 0: 10558.5. Samples: 417428992. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:01:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:01:11,321][1653645] Updated weights for policy 0, policy_version 815229 (0.0032) [2024-06-15 22:01:15,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 41506.1, 300 sec: 44097.9). Total num frames: 1669595136. Throughput: 0: 10683.8. Samples: 417500672. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:01:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:01:18,189][1653645] Updated weights for policy 0, policy_version 815299 (0.0013) [2024-06-15 22:01:20,324][1653645] Updated weights for policy 0, policy_version 815395 (0.0014) [2024-06-15 22:01:20,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 43144.5, 300 sec: 44655.2). Total num frames: 1669988352. Throughput: 0: 10683.7. Samples: 417530880. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:01:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:01:23,297][1653645] Updated weights for policy 0, policy_version 815472 (0.0012) [2024-06-15 22:01:25,958][1648982] Fps is (10 sec: 52427.5, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1670119424. Throughput: 0: 10399.2. Samples: 417591808. Policy #0 lag: (min: 1.0, avg: 96.0, max: 257.0) [2024-06-15 22:01:25,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:01:29,790][1653645] Updated weights for policy 0, policy_version 815542 (0.0011) [2024-06-15 22:01:30,958][1648982] Fps is (10 sec: 36044.2, 60 sec: 40960.1, 300 sec: 44320.1). Total num frames: 1670348800. Throughput: 0: 10751.9. Samples: 417659904. 
Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:01:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:01:31,186][1653645] Updated weights for policy 0, policy_version 815616 (0.0013) [2024-06-15 22:01:32,626][1653645] Updated weights for policy 0, policy_version 815672 (0.0012) [2024-06-15 22:01:34,642][1653645] Updated weights for policy 0, policy_version 815712 (0.0120) [2024-06-15 22:01:35,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1670643712. Throughput: 0: 10569.9. Samples: 417690112. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:01:35,959][1648982] Avg episode reward: [(0, '37.550')] [2024-06-15 22:01:40,944][1653645] Updated weights for policy 0, policy_version 815778 (0.0015) [2024-06-15 22:01:40,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 42052.3, 300 sec: 44209.0). Total num frames: 1670709248. Throughput: 0: 10843.1. Samples: 417764864. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:01:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:01:41,786][1653645] Updated weights for policy 0, policy_version 815824 (0.0013) [2024-06-15 22:01:43,761][1653645] Updated weights for policy 0, policy_version 815893 (0.0013) [2024-06-15 22:01:45,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 44431.2). Total num frames: 1671036928. Throughput: 0: 10649.6. Samples: 417823744. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:01:45,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 22:01:46,643][1653645] Updated weights for policy 0, policy_version 815956 (0.0032) [2024-06-15 22:01:50,962][1648982] Fps is (10 sec: 45857.2, 60 sec: 43687.8, 300 sec: 44319.6). Total num frames: 1671168000. Throughput: 0: 10534.9. Samples: 417854976. 
Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:01:50,964][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:01:52,748][1653645] Updated weights for policy 0, policy_version 816020 (0.0013) [2024-06-15 22:01:53,701][1651596] Signal inference workers to stop experience collection... (42350 times) [2024-06-15 22:01:53,735][1653645] InferenceWorker_p0-w0: stopping experience collection (42350 times) [2024-06-15 22:01:53,973][1651596] Signal inference workers to resume experience collection... (42350 times) [2024-06-15 22:01:53,973][1653645] InferenceWorker_p0-w0: resuming experience collection (42350 times) [2024-06-15 22:01:54,116][1653645] Updated weights for policy 0, policy_version 816081 (0.0118) [2024-06-15 22:01:55,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 44542.3). Total num frames: 1671462912. Throughput: 0: 11047.9. Samples: 417926144. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:01:55,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:01:56,287][1653645] Updated weights for policy 0, policy_version 816163 (0.0136) [2024-06-15 22:01:59,933][1653645] Updated weights for policy 0, policy_version 816240 (0.0013) [2024-06-15 22:02:00,958][1648982] Fps is (10 sec: 52449.2, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1671692288. Throughput: 0: 10740.6. Samples: 417984000. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:02:05,455][1653645] Updated weights for policy 0, policy_version 816304 (0.0104) [2024-06-15 22:02:05,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.6, 300 sec: 43986.9). Total num frames: 1671823360. Throughput: 0: 11002.3. Samples: 418025984. 
Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:05,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:02:07,641][1653645] Updated weights for policy 0, policy_version 816384 (0.0089) [2024-06-15 22:02:08,964][1653645] Updated weights for policy 0, policy_version 816442 (0.0012) [2024-06-15 22:02:10,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 43986.9). Total num frames: 1672085504. Throughput: 0: 10809.0. Samples: 418078208. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:02:11,968][1653645] Updated weights for policy 0, policy_version 816496 (0.0088) [2024-06-15 22:02:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1672216576. Throughput: 0: 10934.1. Samples: 418151936. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:15,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 22:02:18,051][1653645] Updated weights for policy 0, policy_version 816564 (0.0140) [2024-06-15 22:02:19,416][1653645] Updated weights for policy 0, policy_version 816624 (0.0133) [2024-06-15 22:02:20,691][1653645] Updated weights for policy 0, policy_version 816672 (0.0015) [2024-06-15 22:02:20,958][1648982] Fps is (10 sec: 45873.8, 60 sec: 42598.2, 300 sec: 44209.0). Total num frames: 1672544256. Throughput: 0: 10968.1. Samples: 418183680. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:20,960][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:02:23,195][1653645] Updated weights for policy 0, policy_version 816720 (0.0025) [2024-06-15 22:02:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1672740864. Throughput: 0: 10581.3. Samples: 418241024. 
Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:02:27,755][1653645] Updated weights for policy 0, policy_version 816770 (0.0012) [2024-06-15 22:02:29,438][1653645] Updated weights for policy 0, policy_version 816834 (0.0012) [2024-06-15 22:02:30,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 44236.7, 300 sec: 44431.1). Total num frames: 1673003008. Throughput: 0: 10820.2. Samples: 418310656. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:30,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:02:31,072][1653645] Updated weights for policy 0, policy_version 816899 (0.0015) [2024-06-15 22:02:32,520][1653645] Updated weights for policy 0, policy_version 816960 (0.0014) [2024-06-15 22:02:35,794][1651596] Signal inference workers to stop experience collection... (42400 times) [2024-06-15 22:02:35,845][1653645] InferenceWorker_p0-w0: stopping experience collection (42400 times) [2024-06-15 22:02:35,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 1673199616. Throughput: 0: 10787.0. Samples: 418340352. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:35,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:02:36,088][1651596] Signal inference workers to resume experience collection... (42400 times) [2024-06-15 22:02:36,089][1653645] InferenceWorker_p0-w0: resuming experience collection (42400 times) [2024-06-15 22:02:36,298][1653645] Updated weights for policy 0, policy_version 817020 (0.0014) [2024-06-15 22:02:40,960][1648982] Fps is (10 sec: 32759.9, 60 sec: 43688.7, 300 sec: 43764.3). Total num frames: 1673330688. Throughput: 0: 10956.1. Samples: 418419200. 
Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:40,961][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:02:41,465][1653645] Updated weights for policy 0, policy_version 817093 (0.0013) [2024-06-15 22:02:43,237][1653645] Updated weights for policy 0, policy_version 817168 (0.0013) [2024-06-15 22:02:45,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1673658368. Throughput: 0: 10899.9. Samples: 418474496. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:45,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:02:46,780][1653645] Updated weights for policy 0, policy_version 817218 (0.0093) [2024-06-15 22:02:47,999][1653645] Updated weights for policy 0, policy_version 817269 (0.0013) [2024-06-15 22:02:50,958][1648982] Fps is (10 sec: 45887.6, 60 sec: 43693.5, 300 sec: 44098.0). Total num frames: 1673789440. Throughput: 0: 10717.9. Samples: 418508288. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:02:53,165][1653645] Updated weights for policy 0, policy_version 817314 (0.0016) [2024-06-15 22:02:54,764][1653645] Updated weights for policy 0, policy_version 817380 (0.0015) [2024-06-15 22:02:55,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 43690.6, 300 sec: 44099.7). Total num frames: 1674084352. Throughput: 0: 11150.2. Samples: 418579968. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:02:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:02:56,305][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000817440_1674117120.pth... 
[2024-06-15 22:02:56,500][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000812288_1663565824.pth [2024-06-15 22:02:56,972][1653645] Updated weights for policy 0, policy_version 817464 (0.0016) [2024-06-15 22:02:59,153][1653645] Updated weights for policy 0, policy_version 817506 (0.0012) [2024-06-15 22:03:00,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1674313728. Throughput: 0: 10934.0. Samples: 418643968. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:03:00,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:03:03,864][1653645] Updated weights for policy 0, policy_version 817540 (0.0015) [2024-06-15 22:03:05,228][1653645] Updated weights for policy 0, policy_version 817600 (0.0013) [2024-06-15 22:03:05,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1674477568. Throughput: 0: 11173.0. Samples: 418686464. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:03:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:03:07,717][1653645] Updated weights for policy 0, policy_version 817683 (0.0141) [2024-06-15 22:03:10,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1674706944. Throughput: 0: 10922.7. Samples: 418732544. Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:03:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:03:11,749][1653645] Updated weights for policy 0, policy_version 817776 (0.0021) [2024-06-15 22:03:15,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1674838016. Throughput: 0: 11150.2. Samples: 418812416. 
Policy #0 lag: (min: 3.0, avg: 58.9, max: 259.0) [2024-06-15 22:03:15,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:03:16,590][1653645] Updated weights for policy 0, policy_version 817809 (0.0012) [2024-06-15 22:03:19,076][1651596] Signal inference workers to stop experience collection... (42450 times) [2024-06-15 22:03:19,138][1653645] InferenceWorker_p0-w0: stopping experience collection (42450 times) [2024-06-15 22:03:19,151][1653645] Updated weights for policy 0, policy_version 817909 (0.0012) [2024-06-15 22:03:19,268][1651596] Signal inference workers to resume experience collection... (42450 times) [2024-06-15 22:03:19,274][1653645] InferenceWorker_p0-w0: resuming experience collection (42450 times) [2024-06-15 22:03:20,235][1653645] Updated weights for policy 0, policy_version 817968 (0.0012) [2024-06-15 22:03:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44783.1, 300 sec: 43986.9). Total num frames: 1675231232. Throughput: 0: 11161.6. Samples: 418842624. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:03:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:03:23,127][1653645] Updated weights for policy 0, policy_version 818046 (0.0014) [2024-06-15 22:03:25,960][1648982] Fps is (10 sec: 52429.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1675362304. Throughput: 0: 10889.2. Samples: 418909184. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:03:25,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:03:29,775][1653645] Updated weights for policy 0, policy_version 818144 (0.0219) [2024-06-15 22:03:30,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.9, 300 sec: 43875.8). Total num frames: 1675657216. Throughput: 0: 11082.0. Samples: 418973184. 
Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:03:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:03:31,728][1653645] Updated weights for policy 0, policy_version 818232 (0.0014) [2024-06-15 22:03:35,148][1653645] Updated weights for policy 0, policy_version 818288 (0.0012) [2024-06-15 22:03:35,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1675886592. Throughput: 0: 11047.8. Samples: 419005440. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:03:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:03:40,958][1648982] Fps is (10 sec: 29490.6, 60 sec: 43692.4, 300 sec: 43542.5). Total num frames: 1675952128. Throughput: 0: 11081.9. Samples: 419078656. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:03:40,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:03:41,049][1653645] Updated weights for policy 0, policy_version 818352 (0.0029) [2024-06-15 22:03:42,461][1653645] Updated weights for policy 0, policy_version 818403 (0.0084) [2024-06-15 22:03:43,796][1653645] Updated weights for policy 0, policy_version 818465 (0.0017) [2024-06-15 22:03:45,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1676279808. Throughput: 0: 10922.7. Samples: 419135488. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:03:45,958][1648982] Avg episode reward: [(0, '37.260')] [2024-06-15 22:03:47,023][1653645] Updated weights for policy 0, policy_version 818530 (0.0012) [2024-06-15 22:03:50,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.4, 300 sec: 43542.6). Total num frames: 1676410880. Throughput: 0: 10695.0. Samples: 419167744. 
Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:03:50,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 22:03:52,583][1653645] Updated weights for policy 0, policy_version 818597 (0.0108) [2024-06-15 22:03:54,285][1653645] Updated weights for policy 0, policy_version 818672 (0.0012) [2024-06-15 22:03:55,773][1653645] Updated weights for policy 0, policy_version 818742 (0.0013) [2024-06-15 22:03:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1676804096. Throughput: 0: 11195.7. Samples: 419236352. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:03:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:03:59,116][1653645] Updated weights for policy 0, policy_version 818811 (0.0014) [2024-06-15 22:04:00,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1676935168. Throughput: 0: 10922.7. Samples: 419303936. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:00,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:04:03,479][1651596] Signal inference workers to stop experience collection... (42500 times) [2024-06-15 22:04:03,536][1653645] InferenceWorker_p0-w0: stopping experience collection (42500 times) [2024-06-15 22:04:03,706][1651596] Signal inference workers to resume experience collection... (42500 times) [2024-06-15 22:04:03,707][1653645] InferenceWorker_p0-w0: resuming experience collection (42500 times) [2024-06-15 22:04:04,112][1653645] Updated weights for policy 0, policy_version 818876 (0.0040) [2024-06-15 22:04:05,953][1653645] Updated weights for policy 0, policy_version 818931 (0.0041) [2024-06-15 22:04:05,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 44782.8, 300 sec: 43875.8). Total num frames: 1677164544. Throughput: 0: 11127.4. Samples: 419343360. 
Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:04:07,507][1653645] Updated weights for policy 0, policy_version 818992 (0.0018) [2024-06-15 22:04:10,982][1648982] Fps is (10 sec: 45875.7, 60 sec: 44782.9, 300 sec: 44102.8). Total num frames: 1677393920. Throughput: 0: 10945.4. Samples: 419401728. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:10,983][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:04:11,388][1653645] Updated weights for policy 0, policy_version 819061 (0.0013) [2024-06-15 22:04:15,140][1653645] Updated weights for policy 0, policy_version 819104 (0.0036) [2024-06-15 22:04:15,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 45875.4, 300 sec: 43764.7). Total num frames: 1677590528. Throughput: 0: 11138.9. Samples: 419474432. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 22:04:17,089][1653645] Updated weights for policy 0, policy_version 819155 (0.0018) [2024-06-15 22:04:18,526][1653645] Updated weights for policy 0, policy_version 819223 (0.0013) [2024-06-15 22:04:19,338][1653645] Updated weights for policy 0, policy_version 819264 (0.0014) [2024-06-15 22:04:20,958][1648982] Fps is (10 sec: 49152.4, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1677885440. Throughput: 0: 11195.8. Samples: 419509248. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:04:22,282][1653645] Updated weights for policy 0, policy_version 819328 (0.0017) [2024-06-15 22:04:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1677983744. Throughput: 0: 10968.2. Samples: 419572224. 
Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:04:28,482][1653645] Updated weights for policy 0, policy_version 819395 (0.0086) [2024-06-15 22:04:30,436][1653645] Updated weights for policy 0, policy_version 819472 (0.0013) [2024-06-15 22:04:30,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 1678311424. Throughput: 0: 11127.4. Samples: 419636224. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:04:34,496][1653645] Updated weights for policy 0, policy_version 819568 (0.0017) [2024-06-15 22:04:35,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 43690.5, 300 sec: 43988.9). Total num frames: 1678508032. Throughput: 0: 11104.7. Samples: 419667456. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:04:39,400][1653645] Updated weights for policy 0, policy_version 819619 (0.0021) [2024-06-15 22:04:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 45329.2, 300 sec: 43542.5). Total num frames: 1678671872. Throughput: 0: 11173.0. Samples: 419739136. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:04:41,135][1653645] Updated weights for policy 0, policy_version 819670 (0.0046) [2024-06-15 22:04:43,191][1653645] Updated weights for policy 0, policy_version 819748 (0.0076) [2024-06-15 22:04:45,031][1653645] Updated weights for policy 0, policy_version 819780 (0.0027) [2024-06-15 22:04:45,842][1651596] Signal inference workers to stop experience collection... (42550 times) [2024-06-15 22:04:45,875][1653645] InferenceWorker_p0-w0: stopping experience collection (42550 times) [2024-06-15 22:04:45,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44782.8, 300 sec: 44097.9). 
Total num frames: 1678966784. Throughput: 0: 11093.3. Samples: 419803136. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:45,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:04:46,146][1651596] Signal inference workers to resume experience collection... (42550 times) [2024-06-15 22:04:46,148][1653645] InferenceWorker_p0-w0: resuming experience collection (42550 times) [2024-06-15 22:04:50,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43690.7, 300 sec: 43209.3). Total num frames: 1679032320. Throughput: 0: 10877.1. Samples: 419832832. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:50,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:04:51,404][1653645] Updated weights for policy 0, policy_version 819872 (0.0013) [2024-06-15 22:04:52,926][1653645] Updated weights for policy 0, policy_version 819908 (0.0013) [2024-06-15 22:04:54,313][1653645] Updated weights for policy 0, policy_version 819974 (0.0104) [2024-06-15 22:04:55,506][1653645] Updated weights for policy 0, policy_version 820022 (0.0012) [2024-06-15 22:04:55,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1679425536. Throughput: 0: 11138.8. Samples: 419902976. Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:04:55,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:04:55,974][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000820032_1679425536.pth... [2024-06-15 22:04:56,117][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000814912_1668939776.pth [2024-06-15 22:04:56,980][1653645] Updated weights for policy 0, policy_version 820066 (0.0012) [2024-06-15 22:05:00,958][1648982] Fps is (10 sec: 52430.7, 60 sec: 43690.8, 300 sec: 43653.6). Total num frames: 1679556608. Throughput: 0: 10979.6. Samples: 419968512. 
Policy #0 lag: (min: 17.0, avg: 79.4, max: 273.0) [2024-06-15 22:05:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:05:03,067][1653645] Updated weights for policy 0, policy_version 820128 (0.0015) [2024-06-15 22:05:05,958][1648982] Fps is (10 sec: 32768.8, 60 sec: 43144.7, 300 sec: 43320.4). Total num frames: 1679753216. Throughput: 0: 11036.4. Samples: 420005888. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:05,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:05:06,143][1653645] Updated weights for policy 0, policy_version 820208 (0.0040) [2024-06-15 22:05:07,882][1653645] Updated weights for policy 0, policy_version 820279 (0.0013) [2024-06-15 22:05:09,198][1653645] Updated weights for policy 0, policy_version 820327 (0.0024) [2024-06-15 22:05:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1680080896. Throughput: 0: 10729.2. Samples: 420055040. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:05:15,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 43209.3). Total num frames: 1680146432. Throughput: 0: 11070.6. Samples: 420134400. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:05:16,227][1653645] Updated weights for policy 0, policy_version 820400 (0.0011) [2024-06-15 22:05:18,212][1653645] Updated weights for policy 0, policy_version 820451 (0.0012) [2024-06-15 22:05:20,112][1653645] Updated weights for policy 0, policy_version 820528 (0.0013) [2024-06-15 22:05:20,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 1680506880. Throughput: 0: 10968.2. Samples: 420161024. 
Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:05:21,832][1653645] Updated weights for policy 0, policy_version 820608 (0.0019) [2024-06-15 22:05:25,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 1680605184. Throughput: 0: 10615.5. Samples: 420216832. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:25,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:05:28,927][1653645] Updated weights for policy 0, policy_version 820671 (0.0014) [2024-06-15 22:05:30,825][1653645] Updated weights for policy 0, policy_version 820721 (0.0012) [2024-06-15 22:05:30,958][1648982] Fps is (10 sec: 32768.6, 60 sec: 42052.3, 300 sec: 43431.5). Total num frames: 1680834560. Throughput: 0: 10888.6. Samples: 420293120. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:05:31,537][1651596] Signal inference workers to stop experience collection... (42600 times) [2024-06-15 22:05:31,581][1653645] InferenceWorker_p0-w0: stopping experience collection (42600 times) [2024-06-15 22:05:31,757][1651596] Signal inference workers to resume experience collection... (42600 times) [2024-06-15 22:05:31,758][1653645] InferenceWorker_p0-w0: resuming experience collection (42600 times) [2024-06-15 22:05:32,362][1653645] Updated weights for policy 0, policy_version 820788 (0.0015) [2024-06-15 22:05:33,563][1653645] Updated weights for policy 0, policy_version 820854 (0.0124) [2024-06-15 22:05:35,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.9, 300 sec: 43875.8). Total num frames: 1681129472. Throughput: 0: 10820.4. Samples: 420319744. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:05:40,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 42598.6, 300 sec: 42987.2). 
Total num frames: 1681227776. Throughput: 0: 11070.7. Samples: 420401152. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:05:41,038][1653645] Updated weights for policy 0, policy_version 820928 (0.0096) [2024-06-15 22:05:42,442][1653645] Updated weights for policy 0, policy_version 820991 (0.0013) [2024-06-15 22:05:44,226][1653645] Updated weights for policy 0, policy_version 821061 (0.0012) [2024-06-15 22:05:45,346][1653645] Updated weights for policy 0, policy_version 821115 (0.0011) [2024-06-15 22:05:45,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 1681653760. Throughput: 0: 10661.0. Samples: 420448256. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:05:50,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 43690.9, 300 sec: 43098.3). Total num frames: 1681653760. Throughput: 0: 10661.0. Samples: 420485632. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:05:52,393][1653645] Updated weights for policy 0, policy_version 821180 (0.0012) [2024-06-15 22:05:53,877][1653645] Updated weights for policy 0, policy_version 821220 (0.0012) [2024-06-15 22:05:55,212][1653645] Updated weights for policy 0, policy_version 821282 (0.0013) [2024-06-15 22:05:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1682046976. Throughput: 0: 11127.5. Samples: 420555776. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:05:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:05:56,670][1653645] Updated weights for policy 0, policy_version 821360 (0.0015) [2024-06-15 22:06:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1682178048. Throughput: 0: 11002.3. 
Samples: 420629504. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:00,960][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:06:03,294][1653645] Updated weights for policy 0, policy_version 821392 (0.0011) [2024-06-15 22:06:04,767][1653645] Updated weights for policy 0, policy_version 821444 (0.0028) [2024-06-15 22:06:05,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.8, 300 sec: 43653.7). Total num frames: 1682407424. Throughput: 0: 11184.4. Samples: 420664320. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:06:06,710][1653645] Updated weights for policy 0, policy_version 821536 (0.0122) [2024-06-15 22:06:07,930][1653645] Updated weights for policy 0, policy_version 821585 (0.0013) [2024-06-15 22:06:08,880][1653645] Updated weights for policy 0, policy_version 821632 (0.0011) [2024-06-15 22:06:10,958][1648982] Fps is (10 sec: 52426.9, 60 sec: 43690.4, 300 sec: 44431.1). Total num frames: 1682702336. Throughput: 0: 11275.3. Samples: 420724224. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:10,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:06:14,543][1651596] Signal inference workers to stop experience collection... (42650 times) [2024-06-15 22:06:14,594][1653645] InferenceWorker_p0-w0: stopping experience collection (42650 times) [2024-06-15 22:06:14,714][1651596] Signal inference workers to resume experience collection... (42650 times) [2024-06-15 22:06:14,715][1653645] InferenceWorker_p0-w0: resuming experience collection (42650 times) [2024-06-15 22:06:15,372][1653645] Updated weights for policy 0, policy_version 821688 (0.0011) [2024-06-15 22:06:15,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 1682833408. Throughput: 0: 11298.1. Samples: 420801536. 
Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:06:16,947][1653645] Updated weights for policy 0, policy_version 821738 (0.0027) [2024-06-15 22:06:19,765][1653645] Updated weights for policy 0, policy_version 821842 (0.0013) [2024-06-15 22:06:20,959][1648982] Fps is (10 sec: 52426.6, 60 sec: 45328.6, 300 sec: 44431.1). Total num frames: 1683226624. Throughput: 0: 11206.9. Samples: 420824064. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:20,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:06:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.5, 300 sec: 43653.7). Total num frames: 1683226624. Throughput: 0: 11104.7. Samples: 420900864. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:06:25,988][1653645] Updated weights for policy 0, policy_version 821904 (0.0014) [2024-06-15 22:06:27,984][1653645] Updated weights for policy 0, policy_version 821968 (0.0014) [2024-06-15 22:06:29,510][1653645] Updated weights for policy 0, policy_version 822035 (0.0013) [2024-06-15 22:06:30,958][1648982] Fps is (10 sec: 42600.2, 60 sec: 46967.2, 300 sec: 44097.9). Total num frames: 1683652608. Throughput: 0: 11423.2. Samples: 420962304. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:30,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:06:31,120][1653645] Updated weights for policy 0, policy_version 822102 (0.0015) [2024-06-15 22:06:35,970][1648982] Fps is (10 sec: 52363.7, 60 sec: 43681.6, 300 sec: 44207.2). Total num frames: 1683750912. Throughput: 0: 11477.0. Samples: 421002240. 
Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:35,971][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:06:37,496][1653645] Updated weights for policy 0, policy_version 822148 (0.0014) [2024-06-15 22:06:39,476][1653645] Updated weights for policy 0, policy_version 822224 (0.0093) [2024-06-15 22:06:40,958][1648982] Fps is (10 sec: 36046.1, 60 sec: 46421.3, 300 sec: 43986.9). Total num frames: 1684013056. Throughput: 0: 11446.1. Samples: 421070848. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:40,960][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:06:41,905][1653645] Updated weights for policy 0, policy_version 822321 (0.0081) [2024-06-15 22:06:43,625][1653645] Updated weights for policy 0, policy_version 822395 (0.0140) [2024-06-15 22:06:45,958][1648982] Fps is (10 sec: 52494.5, 60 sec: 43690.7, 300 sec: 44431.8). Total num frames: 1684275200. Throughput: 0: 11241.2. Samples: 421135360. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:06:50,336][1653645] Updated weights for policy 0, policy_version 822460 (0.0015) [2024-06-15 22:06:50,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45875.1, 300 sec: 43875.8). Total num frames: 1684406272. Throughput: 0: 11332.2. Samples: 421174272. Policy #0 lag: (min: 15.0, avg: 107.9, max: 271.0) [2024-06-15 22:06:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:06:53,016][1653645] Updated weights for policy 0, policy_version 822544 (0.0115) [2024-06-15 22:06:53,468][1651596] Signal inference workers to stop experience collection... (42700 times) [2024-06-15 22:06:53,504][1653645] InferenceWorker_p0-w0: stopping experience collection (42700 times) [2024-06-15 22:06:53,794][1651596] Signal inference workers to resume experience collection... 
(42700 times) [2024-06-15 22:06:53,806][1653645] InferenceWorker_p0-w0: resuming experience collection (42700 times) [2024-06-15 22:06:55,276][1653645] Updated weights for policy 0, policy_version 822640 (0.0013) [2024-06-15 22:06:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1684799488. Throughput: 0: 11127.5. Samples: 421224960. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:06:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:06:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000822656_1684799488.pth... [2024-06-15 22:06:56,008][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000817440_1674117120.pth [2024-06-15 22:07:00,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1684799488. Throughput: 0: 11195.7. Samples: 421305344. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:07:02,103][1653645] Updated weights for policy 0, policy_version 822704 (0.0013) [2024-06-15 22:07:04,400][1653645] Updated weights for policy 0, policy_version 822768 (0.0090) [2024-06-15 22:07:05,958][1648982] Fps is (10 sec: 32768.4, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 1685127168. Throughput: 0: 11389.4. Samples: 421336576. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:05,958][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 22:07:06,790][1653645] Updated weights for policy 0, policy_version 822851 (0.0012) [2024-06-15 22:07:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.9, 300 sec: 44431.2). Total num frames: 1685323776. Throughput: 0: 10763.4. Samples: 421385216. 
Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:07:14,433][1653645] Updated weights for policy 0, policy_version 822944 (0.0013) [2024-06-15 22:07:15,958][1648982] Fps is (10 sec: 32766.4, 60 sec: 43690.4, 300 sec: 43764.7). Total num frames: 1685454848. Throughput: 0: 11059.2. Samples: 421459968. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:15,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:07:15,978][1653645] Updated weights for policy 0, policy_version 822977 (0.0014) [2024-06-15 22:07:17,899][1653645] Updated weights for policy 0, policy_version 823042 (0.0012) [2024-06-15 22:07:19,718][1653645] Updated weights for policy 0, policy_version 823123 (0.0011) [2024-06-15 22:07:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43691.2, 300 sec: 44431.2). Total num frames: 1685848064. Throughput: 0: 10698.1. Samples: 421483520. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:07:25,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.5, 300 sec: 43542.6). Total num frames: 1685848064. Throughput: 0: 10717.8. Samples: 421553152. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:07:26,160][1653645] Updated weights for policy 0, policy_version 823170 (0.0012) [2024-06-15 22:07:27,551][1653645] Updated weights for policy 0, policy_version 823231 (0.0011) [2024-06-15 22:07:29,762][1653645] Updated weights for policy 0, policy_version 823296 (0.0014) [2024-06-15 22:07:30,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.6, 300 sec: 44098.0). Total num frames: 1686208512. Throughput: 0: 10729.2. Samples: 421618176. 
Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:07:31,324][1653645] Updated weights for policy 0, policy_version 823365 (0.0017) [2024-06-15 22:07:35,958][1648982] Fps is (10 sec: 52430.9, 60 sec: 43699.8, 300 sec: 44209.5). Total num frames: 1686372352. Throughput: 0: 10570.0. Samples: 421649920. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:07:38,199][1653645] Updated weights for policy 0, policy_version 823456 (0.0084) [2024-06-15 22:07:39,827][1651596] Signal inference workers to stop experience collection... (42750 times) [2024-06-15 22:07:39,874][1653645] InferenceWorker_p0-w0: stopping experience collection (42750 times) [2024-06-15 22:07:40,108][1651596] Signal inference workers to resume experience collection... (42750 times) [2024-06-15 22:07:40,108][1653645] InferenceWorker_p0-w0: resuming experience collection (42750 times) [2024-06-15 22:07:40,594][1653645] Updated weights for policy 0, policy_version 823520 (0.0012) [2024-06-15 22:07:40,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42598.3, 300 sec: 43764.7). Total num frames: 1686568960. Throughput: 0: 11093.3. Samples: 421724160. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:07:42,365][1653645] Updated weights for policy 0, policy_version 823589 (0.0013) [2024-06-15 22:07:43,675][1653645] Updated weights for policy 0, policy_version 823665 (0.0013) [2024-06-15 22:07:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1686896640. Throughput: 0: 10592.7. Samples: 421782016. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:07:50,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 43764.7). 
Total num frames: 1686994944. Throughput: 0: 10865.7. Samples: 421825536. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:50,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:07:51,098][1653645] Updated weights for policy 0, policy_version 823736 (0.0015) [2024-06-15 22:07:52,777][1653645] Updated weights for policy 0, policy_version 823796 (0.0073) [2024-06-15 22:07:54,216][1653645] Updated weights for policy 0, policy_version 823873 (0.0012) [2024-06-15 22:07:55,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1687420928. Throughput: 0: 11070.6. Samples: 421883392. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:07:55,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:08:00,958][1648982] Fps is (10 sec: 42599.4, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1687420928. Throughput: 0: 11036.5. Samples: 421956608. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:08:02,476][1653645] Updated weights for policy 0, policy_version 823968 (0.0026) [2024-06-15 22:08:04,114][1653645] Updated weights for policy 0, policy_version 824020 (0.0013) [2024-06-15 22:08:05,476][1653645] Updated weights for policy 0, policy_version 824096 (0.0012) [2024-06-15 22:08:05,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44236.7, 300 sec: 44320.1). Total num frames: 1687781376. Throughput: 0: 11173.0. Samples: 421986304. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:08:07,216][1653645] Updated weights for policy 0, policy_version 824183 (0.0031) [2024-06-15 22:08:10,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 43690.4, 300 sec: 44431.2). Total num frames: 1687945216. Throughput: 0: 11093.3. Samples: 422052352. 
Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:10,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:08:14,300][1653645] Updated weights for policy 0, policy_version 824227 (0.0015) [2024-06-15 22:08:15,506][1653645] Updated weights for policy 0, policy_version 824288 (0.0014) [2024-06-15 22:08:15,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 45329.4, 300 sec: 43875.8). Total num frames: 1688174592. Throughput: 0: 11275.4. Samples: 422125568. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:08:16,956][1651596] Signal inference workers to stop experience collection... (42800 times) [2024-06-15 22:08:17,048][1653645] InferenceWorker_p0-w0: stopping experience collection (42800 times) [2024-06-15 22:08:17,050][1653645] Updated weights for policy 0, policy_version 824357 (0.0013) [2024-06-15 22:08:17,258][1651596] Signal inference workers to resume experience collection... (42800 times) [2024-06-15 22:08:17,259][1653645] InferenceWorker_p0-w0: resuming experience collection (42800 times) [2024-06-15 22:08:20,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1688469504. Throughput: 0: 11070.6. Samples: 422148096. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:08:25,205][1653645] Updated weights for policy 0, policy_version 824449 (0.0083) [2024-06-15 22:08:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 44783.2, 300 sec: 43653.6). Total num frames: 1688535040. Throughput: 0: 11241.3. Samples: 422230016. 
Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:25,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:08:26,849][1653645] Updated weights for policy 0, policy_version 824528 (0.0183) [2024-06-15 22:08:28,819][1653645] Updated weights for policy 0, policy_version 824617 (0.0013) [2024-06-15 22:08:30,196][1653645] Updated weights for policy 0, policy_version 824676 (0.0012) [2024-06-15 22:08:30,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 46421.3, 300 sec: 44431.2). Total num frames: 1688993792. Throughput: 0: 11138.8. Samples: 422283264. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:30,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:08:35,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 1688993792. Throughput: 0: 11161.7. Samples: 422327808. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:08:37,284][1653645] Updated weights for policy 0, policy_version 824736 (0.0120) [2024-06-15 22:08:38,434][1653645] Updated weights for policy 0, policy_version 824800 (0.0015) [2024-06-15 22:08:40,692][1653645] Updated weights for policy 0, policy_version 824897 (0.0013) [2024-06-15 22:08:40,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 47513.6, 300 sec: 44542.3). Total num frames: 1689419776. Throughput: 0: 11377.8. Samples: 422395392. Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:08:45,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1689518080. Throughput: 0: 11184.4. Samples: 422459904. 
Policy #0 lag: (min: 111.0, avg: 192.2, max: 367.0) [2024-06-15 22:08:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:08:49,081][1653645] Updated weights for policy 0, policy_version 824976 (0.0016) [2024-06-15 22:08:50,661][1653645] Updated weights for policy 0, policy_version 825056 (0.0013) [2024-06-15 22:08:50,958][1648982] Fps is (10 sec: 29491.5, 60 sec: 45329.3, 300 sec: 43764.7). Total num frames: 1689714688. Throughput: 0: 11411.9. Samples: 422499840. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:08:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:08:53,340][1653645] Updated weights for policy 0, policy_version 825152 (0.0013) [2024-06-15 22:08:55,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 44431.2). Total num frames: 1690042368. Throughput: 0: 10854.4. Samples: 422540800. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:08:55,959][1648982] Avg episode reward: [(0, '37.180')] [2024-06-15 22:08:55,977][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000825216_1690042368.pth... [2024-06-15 22:08:56,031][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000820032_1679425536.pth [2024-06-15 22:09:00,958][1648982] Fps is (10 sec: 32767.8, 60 sec: 43690.7, 300 sec: 43653.7). Total num frames: 1690042368. Throughput: 0: 11036.4. Samples: 422622208. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:09:01,302][1651596] Signal inference workers to stop experience collection... (42850 times) [2024-06-15 22:09:01,390][1653645] InferenceWorker_p0-w0: stopping experience collection (42850 times) [2024-06-15 22:09:01,535][1651596] Signal inference workers to resume experience collection... 
(42850 times) [2024-06-15 22:09:01,536][1653645] InferenceWorker_p0-w0: resuming experience collection (42850 times) [2024-06-15 22:09:01,538][1653645] Updated weights for policy 0, policy_version 825232 (0.0013) [2024-06-15 22:09:03,982][1653645] Updated weights for policy 0, policy_version 825328 (0.0014) [2024-06-15 22:09:05,822][1653645] Updated weights for policy 0, policy_version 825408 (0.0013) [2024-06-15 22:09:05,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 44236.9, 300 sec: 44209.1). Total num frames: 1690435584. Throughput: 0: 11104.7. Samples: 422647808. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:09:07,226][1653645] Updated weights for policy 0, policy_version 825467 (0.0014) [2024-06-15 22:09:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1690566656. Throughput: 0: 10581.3. Samples: 422706176. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:09:14,915][1653645] Updated weights for policy 0, policy_version 825521 (0.0013) [2024-06-15 22:09:15,958][1648982] Fps is (10 sec: 29491.0, 60 sec: 42598.4, 300 sec: 43542.6). Total num frames: 1690730496. Throughput: 0: 10956.8. Samples: 422776320. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:15,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:09:16,913][1653645] Updated weights for policy 0, policy_version 825600 (0.0014) [2024-06-15 22:09:19,136][1653645] Updated weights for policy 0, policy_version 825682 (0.0013) [2024-06-15 22:09:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1691090944. Throughput: 0: 10467.5. Samples: 422798848. 
Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:20,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 22:09:25,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 1691090944. Throughput: 0: 10535.8. Samples: 422869504. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:25,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:09:26,029][1653645] Updated weights for policy 0, policy_version 825729 (0.0014) [2024-06-15 22:09:27,282][1653645] Updated weights for policy 0, policy_version 825786 (0.0013) [2024-06-15 22:09:28,808][1653645] Updated weights for policy 0, policy_version 825840 (0.0013) [2024-06-15 22:09:30,399][1653645] Updated weights for policy 0, policy_version 825893 (0.0023) [2024-06-15 22:09:30,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 43986.9). Total num frames: 1691484160. Throughput: 0: 10376.5. Samples: 422926848. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:30,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:09:31,936][1653645] Updated weights for policy 0, policy_version 825968 (0.0113) [2024-06-15 22:09:35,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1691615232. Throughput: 0: 10171.7. Samples: 422957568. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:09:39,098][1653645] Updated weights for policy 0, policy_version 826032 (0.0014) [2024-06-15 22:09:40,313][1653645] Updated weights for policy 0, policy_version 826064 (0.0011) [2024-06-15 22:09:40,762][1651596] Signal inference workers to stop experience collection... (42900 times) [2024-06-15 22:09:40,796][1653645] InferenceWorker_p0-w0: stopping experience collection (42900 times) [2024-06-15 22:09:40,958][1648982] Fps is (10 sec: 32768.6, 60 sec: 39867.8, 300 sec: 43542.6). Total num frames: 1691811840. 
Throughput: 0: 10900.0. Samples: 423031296. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:09:41,034][1651596] Signal inference workers to resume experience collection... (42900 times) [2024-06-15 22:09:41,035][1653645] InferenceWorker_p0-w0: resuming experience collection (42900 times) [2024-06-15 22:09:42,566][1653645] Updated weights for policy 0, policy_version 826146 (0.0014) [2024-06-15 22:09:44,329][1653645] Updated weights for policy 0, policy_version 826235 (0.0086) [2024-06-15 22:09:45,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1692139520. Throughput: 0: 10365.2. Samples: 423088640. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:45,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 22:09:50,469][1653645] Updated weights for policy 0, policy_version 826272 (0.0012) [2024-06-15 22:09:50,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 43431.5). Total num frames: 1692237824. Throughput: 0: 10604.1. Samples: 423124992. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:09:52,543][1653645] Updated weights for policy 0, policy_version 826322 (0.0013) [2024-06-15 22:09:54,434][1653645] Updated weights for policy 0, policy_version 826403 (0.0026) [2024-06-15 22:09:55,958][1648982] Fps is (10 sec: 45873.9, 60 sec: 42598.3, 300 sec: 44209.0). Total num frames: 1692598272. Throughput: 0: 10717.8. Samples: 423188480. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:09:55,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:09:56,081][1653645] Updated weights for policy 0, policy_version 826470 (0.0118) [2024-06-15 22:10:00,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1692663808. Throughput: 0: 10695.1. Samples: 423257600. 
Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:10:00,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 22:10:02,129][1653645] Updated weights for policy 0, policy_version 826512 (0.0013) [2024-06-15 22:10:03,274][1653645] Updated weights for policy 0, policy_version 826558 (0.0015) [2024-06-15 22:10:05,249][1653645] Updated weights for policy 0, policy_version 826609 (0.0109) [2024-06-15 22:10:05,958][1648982] Fps is (10 sec: 36046.0, 60 sec: 42052.2, 300 sec: 43653.6). Total num frames: 1692958720. Throughput: 0: 10945.4. Samples: 423291392. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:10:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:10:06,804][1653645] Updated weights for policy 0, policy_version 826673 (0.0012) [2024-06-15 22:10:08,420][1653645] Updated weights for policy 0, policy_version 826748 (0.0013) [2024-06-15 22:10:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1693188096. Throughput: 0: 10672.4. Samples: 423349760. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:10:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:10:14,889][1653645] Updated weights for policy 0, policy_version 826806 (0.0011) [2024-06-15 22:10:15,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.6, 300 sec: 43431.5). Total num frames: 1693319168. Throughput: 0: 10968.2. Samples: 423420416. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:10:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:10:17,340][1653645] Updated weights for policy 0, policy_version 826880 (0.0013) [2024-06-15 22:10:19,043][1653645] Updated weights for policy 0, policy_version 826960 (0.0014) [2024-06-15 22:10:19,568][1651596] Signal inference workers to stop experience collection... 
(42950 times) [2024-06-15 22:10:19,621][1653645] InferenceWorker_p0-w0: stopping experience collection (42950 times) [2024-06-15 22:10:19,872][1651596] Signal inference workers to resume experience collection... (42950 times) [2024-06-15 22:10:19,874][1653645] InferenceWorker_p0-w0: resuming experience collection (42950 times) [2024-06-15 22:10:20,975][1648982] Fps is (10 sec: 52336.5, 60 sec: 43677.9, 300 sec: 44428.5). Total num frames: 1693712384. Throughput: 0: 10918.4. Samples: 423449088. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:10:20,976][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:10:25,958][1648982] Fps is (10 sec: 42598.0, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 1693745152. Throughput: 0: 10911.3. Samples: 423522304. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:10:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:10:26,045][1653645] Updated weights for policy 0, policy_version 827025 (0.0024) [2024-06-15 22:10:26,828][1653645] Updated weights for policy 0, policy_version 827067 (0.0010) [2024-06-15 22:10:28,632][1653645] Updated weights for policy 0, policy_version 827120 (0.0012) [2024-06-15 22:10:30,958][1648982] Fps is (10 sec: 39391.5, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1694105600. Throughput: 0: 10979.6. Samples: 423582720. Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:10:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:10:31,485][1653645] Updated weights for policy 0, policy_version 827234 (0.0146) [2024-06-15 22:10:35,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.6, 300 sec: 44097.9). Total num frames: 1694236672. Throughput: 0: 10922.7. Samples: 423616512. 
Policy #0 lag: (min: 1.0, avg: 48.7, max: 257.0) [2024-06-15 22:10:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:10:37,799][1653645] Updated weights for policy 0, policy_version 827292 (0.0037) [2024-06-15 22:10:38,389][1653645] Updated weights for policy 0, policy_version 827328 (0.0011) [2024-06-15 22:10:40,866][1653645] Updated weights for policy 0, policy_version 827395 (0.0013) [2024-06-15 22:10:40,958][1648982] Fps is (10 sec: 39321.0, 60 sec: 44782.9, 300 sec: 43542.6). Total num frames: 1694498816. Throughput: 0: 11332.3. Samples: 423698432. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:10:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:10:42,490][1653645] Updated weights for policy 0, policy_version 827471 (0.0128) [2024-06-15 22:10:43,458][1653645] Updated weights for policy 0, policy_version 827512 (0.0014) [2024-06-15 22:10:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1694760960. Throughput: 0: 11241.2. Samples: 423763456. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:10:45,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:10:48,981][1653645] Updated weights for policy 0, policy_version 827568 (0.0014) [2024-06-15 22:10:50,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44783.0, 300 sec: 43653.7). Total num frames: 1694924800. Throughput: 0: 11298.1. Samples: 423799808. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:10:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:10:51,931][1653645] Updated weights for policy 0, policy_version 827648 (0.0012) [2024-06-15 22:10:53,161][1653645] Updated weights for policy 0, policy_version 827710 (0.0132) [2024-06-15 22:10:55,548][1653645] Updated weights for policy 0, policy_version 827770 (0.0013) [2024-06-15 22:10:55,958][1648982] Fps is (10 sec: 52425.8, 60 sec: 44782.6, 300 sec: 44431.1). Total num frames: 1695285248. 
Throughput: 0: 11275.2. Samples: 423857152. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:10:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:10:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000827776_1695285248.pth... [2024-06-15 22:10:56,037][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000822656_1684799488.pth [2024-06-15 22:10:56,040][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000827776_1695285248.pth [2024-06-15 22:11:00,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44782.8, 300 sec: 43875.8). Total num frames: 1695350784. Throughput: 0: 11320.8. Samples: 423929856. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:11:01,186][1653645] Updated weights for policy 0, policy_version 827811 (0.0012) [2024-06-15 22:11:02,779][1653645] Updated weights for policy 0, policy_version 827858 (0.0044) [2024-06-15 22:11:04,346][1651596] Signal inference workers to stop experience collection... (43000 times) [2024-06-15 22:11:04,384][1653645] InferenceWorker_p0-w0: stopping experience collection (43000 times) [2024-06-15 22:11:04,434][1653645] Updated weights for policy 0, policy_version 827923 (0.0029) [2024-06-15 22:11:04,619][1651596] Signal inference workers to resume experience collection... (43000 times) [2024-06-15 22:11:04,620][1653645] InferenceWorker_p0-w0: resuming experience collection (43000 times) [2024-06-15 22:11:05,958][1648982] Fps is (10 sec: 39324.5, 60 sec: 45329.1, 300 sec: 43986.9). Total num frames: 1695678464. Throughput: 0: 11370.9. Samples: 423960576. 
Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:05,962][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:11:06,607][1653645] Updated weights for policy 0, policy_version 827984 (0.0013) [2024-06-15 22:11:10,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1695809536. Throughput: 0: 10979.6. Samples: 424016384. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:11:13,240][1653645] Updated weights for policy 0, policy_version 828064 (0.0013) [2024-06-15 22:11:15,199][1653645] Updated weights for policy 0, policy_version 828112 (0.0028) [2024-06-15 22:11:15,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44782.9, 300 sec: 43320.5). Total num frames: 1696006144. Throughput: 0: 11252.6. Samples: 424089088. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:15,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:11:17,373][1653645] Updated weights for policy 0, policy_version 828192 (0.0013) [2024-06-15 22:11:19,296][1653645] Updated weights for policy 0, policy_version 828240 (0.0012) [2024-06-15 22:11:20,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43703.5, 300 sec: 44431.2). Total num frames: 1696333824. Throughput: 0: 11036.5. Samples: 424113152. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:11:24,873][1653645] Updated weights for policy 0, policy_version 828309 (0.0016) [2024-06-15 22:11:25,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 45329.1, 300 sec: 43431.5). Total num frames: 1696464896. Throughput: 0: 10843.0. Samples: 424186368. 
Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:11:27,392][1653645] Updated weights for policy 0, policy_version 828371 (0.0015) [2024-06-15 22:11:29,492][1653645] Updated weights for policy 0, policy_version 828451 (0.0012) [2024-06-15 22:11:30,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.8, 300 sec: 44099.8). Total num frames: 1696759808. Throughput: 0: 10717.9. Samples: 424245760. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:11:31,287][1653645] Updated weights for policy 0, policy_version 828512 (0.0014) [2024-06-15 22:11:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1696858112. Throughput: 0: 10626.8. Samples: 424278016. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:11:37,129][1653645] Updated weights for policy 0, policy_version 828560 (0.0013) [2024-06-15 22:11:39,420][1653645] Updated weights for policy 0, policy_version 828624 (0.0015) [2024-06-15 22:11:40,958][1648982] Fps is (10 sec: 36043.6, 60 sec: 43690.5, 300 sec: 43542.5). Total num frames: 1697120256. Throughput: 0: 10900.0. Samples: 424347648. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:11:41,216][1653645] Updated weights for policy 0, policy_version 828688 (0.0011) [2024-06-15 22:11:42,873][1653645] Updated weights for policy 0, policy_version 828752 (0.0019) [2024-06-15 22:11:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1697382400. Throughput: 0: 10501.7. Samples: 424402432. 
Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:11:48,999][1653645] Updated weights for policy 0, policy_version 828802 (0.0013) [2024-06-15 22:11:50,957][1648982] Fps is (10 sec: 39323.3, 60 sec: 43144.6, 300 sec: 43098.3). Total num frames: 1697513472. Throughput: 0: 10740.7. Samples: 424443904. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:50,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:11:51,437][1653645] Updated weights for policy 0, policy_version 828867 (0.0066) [2024-06-15 22:11:52,377][1651596] Signal inference workers to stop experience collection... (43050 times) [2024-06-15 22:11:52,463][1653645] InferenceWorker_p0-w0: stopping experience collection (43050 times) [2024-06-15 22:11:52,702][1651596] Signal inference workers to resume experience collection... (43050 times) [2024-06-15 22:11:52,703][1653645] InferenceWorker_p0-w0: resuming experience collection (43050 times) [2024-06-15 22:11:53,276][1653645] Updated weights for policy 0, policy_version 828946 (0.0100) [2024-06-15 22:11:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 42052.7, 300 sec: 44097.9). Total num frames: 1697808384. Throughput: 0: 10752.0. Samples: 424500224. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:11:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:11:56,255][1653645] Updated weights for policy 0, policy_version 829026 (0.0110) [2024-06-15 22:12:00,958][1648982] Fps is (10 sec: 39320.0, 60 sec: 42598.3, 300 sec: 43320.4). Total num frames: 1697906688. Throughput: 0: 10706.4. Samples: 424570880. 
Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:12:00,959][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:12:02,649][1653645] Updated weights for policy 0, policy_version 829104 (0.0148) [2024-06-15 22:12:05,402][1653645] Updated weights for policy 0, policy_version 829168 (0.0148) [2024-06-15 22:12:05,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 41506.1, 300 sec: 43542.6). Total num frames: 1698168832. Throughput: 0: 10786.1. Samples: 424598528. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:12:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:12:06,823][1653645] Updated weights for policy 0, policy_version 829244 (0.0012) [2024-06-15 22:12:09,314][1653645] Updated weights for policy 0, policy_version 829299 (0.0013) [2024-06-15 22:12:10,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1698430976. Throughput: 0: 10456.2. Samples: 424656896. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:12:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:12:13,917][1653645] Updated weights for policy 0, policy_version 829332 (0.0014) [2024-06-15 22:12:15,970][1648982] Fps is (10 sec: 42545.0, 60 sec: 43135.5, 300 sec: 43207.5). Total num frames: 1698594816. Throughput: 0: 10726.2. Samples: 424728576. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:12:15,971][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:12:16,434][1653645] Updated weights for policy 0, policy_version 829415 (0.0018) [2024-06-15 22:12:17,874][1653645] Updated weights for policy 0, policy_version 829457 (0.0014) [2024-06-15 22:12:20,397][1653645] Updated weights for policy 0, policy_version 829520 (0.0016) [2024-06-15 22:12:20,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 44209.1). Total num frames: 1698889728. Throughput: 0: 10683.7. Samples: 424758784. 
Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:12:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:12:25,876][1653645] Updated weights for policy 0, policy_version 829600 (0.0014) [2024-06-15 22:12:25,958][1648982] Fps is (10 sec: 42651.9, 60 sec: 42598.5, 300 sec: 43431.5). Total num frames: 1699020800. Throughput: 0: 10740.7. Samples: 424830976. Policy #0 lag: (min: 18.0, avg: 91.3, max: 274.0) [2024-06-15 22:12:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:12:27,848][1653645] Updated weights for policy 0, policy_version 829648 (0.0025) [2024-06-15 22:12:28,860][1653645] Updated weights for policy 0, policy_version 829691 (0.0016) [2024-06-15 22:12:30,212][1653645] Updated weights for policy 0, policy_version 829744 (0.0051) [2024-06-15 22:12:30,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1699348480. Throughput: 0: 10786.1. Samples: 424887808. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:12:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:12:33,304][1653645] Updated weights for policy 0, policy_version 829808 (0.0013) [2024-06-15 22:12:35,957][1648982] Fps is (10 sec: 45876.1, 60 sec: 43690.8, 300 sec: 43764.8). Total num frames: 1699479552. Throughput: 0: 10524.4. Samples: 424917504. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:12:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:12:38,029][1653645] Updated weights for policy 0, policy_version 829872 (0.0013) [2024-06-15 22:12:40,488][1653645] Updated weights for policy 0, policy_version 829951 (0.0105) [2024-06-15 22:12:40,962][1648982] Fps is (10 sec: 39302.9, 60 sec: 43687.4, 300 sec: 43541.9). Total num frames: 1699741696. Throughput: 0: 10989.8. Samples: 424994816. 
Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:12:40,963][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:12:41,097][1651596] Signal inference workers to stop experience collection... (43100 times) [2024-06-15 22:12:41,218][1653645] InferenceWorker_p0-w0: stopping experience collection (43100 times) [2024-06-15 22:12:41,382][1651596] Signal inference workers to resume experience collection... (43100 times) [2024-06-15 22:12:41,382][1653645] InferenceWorker_p0-w0: resuming experience collection (43100 times) [2024-06-15 22:12:42,480][1653645] Updated weights for policy 0, policy_version 830010 (0.0012) [2024-06-15 22:12:44,848][1653645] Updated weights for policy 0, policy_version 830065 (0.0013) [2024-06-15 22:12:45,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.7, 300 sec: 44098.0). Total num frames: 1700003840. Throughput: 0: 10752.1. Samples: 425054720. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:12:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:12:50,126][1653645] Updated weights for policy 0, policy_version 830141 (0.0013) [2024-06-15 22:12:50,958][1648982] Fps is (10 sec: 39339.8, 60 sec: 43690.4, 300 sec: 43098.2). Total num frames: 1700134912. Throughput: 0: 11093.3. Samples: 425097728. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:12:50,959][1648982] Avg episode reward: [(0, '37.170')] [2024-06-15 22:12:52,519][1653645] Updated weights for policy 0, policy_version 830202 (0.0117) [2024-06-15 22:12:53,826][1653645] Updated weights for policy 0, policy_version 830244 (0.0012) [2024-06-15 22:12:55,909][1653645] Updated weights for policy 0, policy_version 830320 (0.0014) [2024-06-15 22:12:55,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 44782.9, 300 sec: 44320.1). Total num frames: 1700495360. Throughput: 0: 11138.8. Samples: 425158144. 
Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:12:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:12:56,160][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000830336_1700528128.pth... [2024-06-15 22:12:56,202][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000825216_1690042368.pth [2024-06-15 22:13:00,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 44782.9, 300 sec: 43431.5). Total num frames: 1700593664. Throughput: 0: 11267.1. Samples: 425235456. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:13:01,073][1653645] Updated weights for policy 0, policy_version 830384 (0.0013) [2024-06-15 22:13:03,657][1653645] Updated weights for policy 0, policy_version 830458 (0.0017) [2024-06-15 22:13:04,927][1653645] Updated weights for policy 0, policy_version 830498 (0.0012) [2024-06-15 22:13:05,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 45875.2, 300 sec: 43986.9). Total num frames: 1700921344. Throughput: 0: 11241.3. Samples: 425264640. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:13:07,228][1653645] Updated weights for policy 0, policy_version 830548 (0.0015) [2024-06-15 22:13:08,251][1653645] Updated weights for policy 0, policy_version 830592 (0.0124) [2024-06-15 22:13:10,958][1648982] Fps is (10 sec: 45876.3, 60 sec: 43690.7, 300 sec: 43653.6). Total num frames: 1701052416. Throughput: 0: 11173.0. Samples: 425333760. 
Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:13:12,631][1653645] Updated weights for policy 0, policy_version 830655 (0.0013) [2024-06-15 22:13:15,387][1653645] Updated weights for policy 0, policy_version 830720 (0.0095) [2024-06-15 22:13:15,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 45884.8, 300 sec: 43653.6). Total num frames: 1701347328. Throughput: 0: 11446.0. Samples: 425402880. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:13:16,400][1653645] Updated weights for policy 0, policy_version 830768 (0.0015) [2024-06-15 22:13:19,069][1653645] Updated weights for policy 0, policy_version 830832 (0.0014) [2024-06-15 22:13:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 1701576704. Throughput: 0: 11571.2. Samples: 425438208. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:13:22,979][1653645] Updated weights for policy 0, policy_version 830883 (0.0013) [2024-06-15 22:13:25,083][1653645] Updated weights for policy 0, policy_version 830928 (0.0012) [2024-06-15 22:13:25,958][1648982] Fps is (10 sec: 45872.6, 60 sec: 46420.9, 300 sec: 43431.4). Total num frames: 1701806080. Throughput: 0: 11560.9. Samples: 425515008. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:25,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:13:26,352][1651596] Signal inference workers to stop experience collection... (43150 times) [2024-06-15 22:13:26,400][1653645] InferenceWorker_p0-w0: stopping experience collection (43150 times) [2024-06-15 22:13:26,497][1651596] Signal inference workers to resume experience collection... 
(43150 times) [2024-06-15 22:13:26,498][1653645] InferenceWorker_p0-w0: resuming experience collection (43150 times) [2024-06-15 22:13:26,620][1653645] Updated weights for policy 0, policy_version 830993 (0.0098) [2024-06-15 22:13:29,056][1653645] Updated weights for policy 0, policy_version 831041 (0.0015) [2024-06-15 22:13:30,323][1653645] Updated weights for policy 0, policy_version 831104 (0.0024) [2024-06-15 22:13:30,962][1648982] Fps is (10 sec: 52404.7, 60 sec: 45871.7, 300 sec: 44430.5). Total num frames: 1702100992. Throughput: 0: 11706.5. Samples: 425581568. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:30,963][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:13:34,872][1653645] Updated weights for policy 0, policy_version 831164 (0.0012) [2024-06-15 22:13:35,958][1648982] Fps is (10 sec: 42600.9, 60 sec: 45875.0, 300 sec: 43431.5). Total num frames: 1702232064. Throughput: 0: 11685.0. Samples: 425623552. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:13:36,398][1653645] Updated weights for policy 0, policy_version 831203 (0.0012) [2024-06-15 22:13:38,219][1653645] Updated weights for policy 0, policy_version 831280 (0.0012) [2024-06-15 22:13:40,958][1648982] Fps is (10 sec: 42617.0, 60 sec: 46424.8, 300 sec: 44097.9). Total num frames: 1702526976. Throughput: 0: 11741.8. Samples: 425686528. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:40,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:13:42,047][1653645] Updated weights for policy 0, policy_version 831353 (0.0015) [2024-06-15 22:13:45,260][1653645] Updated weights for policy 0, policy_version 831396 (0.0049) [2024-06-15 22:13:45,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.2, 300 sec: 44209.0). Total num frames: 1702756352. Throughput: 0: 11605.4. Samples: 425757696. 
Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:13:46,901][1653645] Updated weights for policy 0, policy_version 831440 (0.0024) [2024-06-15 22:13:47,908][1653645] Updated weights for policy 0, policy_version 831487 (0.0013) [2024-06-15 22:13:50,477][1653645] Updated weights for policy 0, policy_version 831548 (0.0030) [2024-06-15 22:13:50,958][1648982] Fps is (10 sec: 49153.2, 60 sec: 48059.8, 300 sec: 43986.9). Total num frames: 1703018496. Throughput: 0: 11696.4. Samples: 425790976. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:13:53,929][1653645] Updated weights for policy 0, policy_version 831608 (0.0014) [2024-06-15 22:13:55,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44236.9, 300 sec: 44431.2). Total num frames: 1703149568. Throughput: 0: 11525.7. Samples: 425852416. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:13:55,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 22:13:56,836][1653645] Updated weights for policy 0, policy_version 831651 (0.0011) [2024-06-15 22:13:58,665][1653645] Updated weights for policy 0, policy_version 831696 (0.0016) [2024-06-15 22:14:00,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 46967.5, 300 sec: 43986.8). Total num frames: 1703411712. Throughput: 0: 11548.4. Samples: 425922560. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:14:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:14:01,773][1653645] Updated weights for policy 0, policy_version 831750 (0.0013) [2024-06-15 22:14:02,960][1653645] Updated weights for policy 0, policy_version 831798 (0.0012) [2024-06-15 22:14:05,591][1653645] Updated weights for policy 0, policy_version 831856 (0.0105) [2024-06-15 22:14:05,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1703673856. 
Throughput: 0: 11468.8. Samples: 425954304. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:14:05,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:14:07,968][1653645] Updated weights for policy 0, policy_version 831894 (0.0020) [2024-06-15 22:14:10,119][1653645] Updated weights for policy 0, policy_version 831952 (0.0015) [2024-06-15 22:14:10,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 46967.4, 300 sec: 44542.3). Total num frames: 1703870464. Throughput: 0: 11412.1. Samples: 426028544. Policy #0 lag: (min: 53.0, avg: 131.3, max: 309.0) [2024-06-15 22:14:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:14:11,348][1653645] Updated weights for policy 0, policy_version 831998 (0.0012) [2024-06-15 22:14:14,188][1651596] Signal inference workers to stop experience collection... (43200 times) [2024-06-15 22:14:14,226][1653645] InferenceWorker_p0-w0: stopping experience collection (43200 times) [2024-06-15 22:14:14,383][1651596] Signal inference workers to resume experience collection... (43200 times) [2024-06-15 22:14:14,383][1653645] InferenceWorker_p0-w0: resuming experience collection (43200 times) [2024-06-15 22:14:14,576][1653645] Updated weights for policy 0, policy_version 832053 (0.0125) [2024-06-15 22:14:15,938][1653645] Updated weights for policy 0, policy_version 832085 (0.0013) [2024-06-15 22:14:15,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 45875.2, 300 sec: 44098.0). Total num frames: 1704099840. Throughput: 0: 11401.7. Samples: 426094592. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:14:18,549][1653645] Updated weights for policy 0, policy_version 832131 (0.0038) [2024-06-15 22:14:20,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45875.1, 300 sec: 44875.5). Total num frames: 1704329216. Throughput: 0: 11286.7. Samples: 426131456. 
Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:20,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:14:21,809][1653645] Updated weights for policy 0, policy_version 832211 (0.0015) [2024-06-15 22:14:22,665][1653645] Updated weights for policy 0, policy_version 832256 (0.0013) [2024-06-15 22:14:25,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45329.5, 300 sec: 44209.0). Total num frames: 1704525824. Throughput: 0: 11434.7. Samples: 426201088. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:14:26,447][1653645] Updated weights for policy 0, policy_version 832309 (0.0013) [2024-06-15 22:14:27,984][1653645] Updated weights for policy 0, policy_version 832375 (0.0119) [2024-06-15 22:14:30,961][1648982] Fps is (10 sec: 45862.0, 60 sec: 44784.2, 300 sec: 44652.9). Total num frames: 1704787968. Throughput: 0: 11445.3. Samples: 426272768. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:30,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:14:30,978][1653645] Updated weights for policy 0, policy_version 832432 (0.0023) [2024-06-15 22:14:32,485][1653645] Updated weights for policy 0, policy_version 832480 (0.0012) [2024-06-15 22:14:35,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 45875.1, 300 sec: 44653.3). Total num frames: 1704984576. Throughput: 0: 11366.4. Samples: 426302464. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:35,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:14:36,812][1653645] Updated weights for policy 0, policy_version 832528 (0.0012) [2024-06-15 22:14:38,403][1653645] Updated weights for policy 0, policy_version 832578 (0.0012) [2024-06-15 22:14:39,740][1653645] Updated weights for policy 0, policy_version 832639 (0.0014) [2024-06-15 22:14:40,958][1648982] Fps is (10 sec: 45888.7, 60 sec: 45329.2, 300 sec: 44431.2). Total num frames: 1705246720. 
Throughput: 0: 11502.9. Samples: 426370048. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:14:43,361][1653645] Updated weights for policy 0, policy_version 832704 (0.0088) [2024-06-15 22:14:44,735][1653645] Updated weights for policy 0, policy_version 832767 (0.0014) [2024-06-15 22:14:45,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 45875.3, 300 sec: 44986.6). Total num frames: 1705508864. Throughput: 0: 11400.6. Samples: 426435584. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:14:49,381][1653645] Updated weights for policy 0, policy_version 832823 (0.0013) [2024-06-15 22:14:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44236.8, 300 sec: 44320.2). Total num frames: 1705672704. Throughput: 0: 11525.7. Samples: 426472960. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:14:51,206][1653645] Updated weights for policy 0, policy_version 832868 (0.0109) [2024-06-15 22:14:54,041][1653645] Updated weights for policy 0, policy_version 832930 (0.0014) [2024-06-15 22:14:55,803][1653645] Updated weights for policy 0, policy_version 833008 (0.0014) [2024-06-15 22:14:55,958][1648982] Fps is (10 sec: 49149.9, 60 sec: 47513.4, 300 sec: 45208.7). Total num frames: 1706000384. Throughput: 0: 11457.4. Samples: 426544128. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:14:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:14:56,021][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000833024_1706033152.pth... [2024-06-15 22:14:56,078][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000827776_1695285248.pth [2024-06-15 22:15:00,035][1651596] Signal inference workers to stop experience collection... 
(43250 times) [2024-06-15 22:15:00,112][1653645] InferenceWorker_p0-w0: stopping experience collection (43250 times) [2024-06-15 22:15:00,136][1653645] Updated weights for policy 0, policy_version 833049 (0.0015) [2024-06-15 22:15:00,238][1651596] Signal inference workers to resume experience collection... (43250 times) [2024-06-15 22:15:00,239][1653645] InferenceWorker_p0-w0: resuming experience collection (43250 times) [2024-06-15 22:15:00,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45329.2, 300 sec: 44653.3). Total num frames: 1706131456. Throughput: 0: 11434.7. Samples: 426609152. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:15:01,002][1653645] Updated weights for policy 0, policy_version 833088 (0.0011) [2024-06-15 22:15:03,391][1653645] Updated weights for policy 0, policy_version 833151 (0.0012) [2024-06-15 22:15:05,667][1653645] Updated weights for policy 0, policy_version 833192 (0.0012) [2024-06-15 22:15:05,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 45328.8, 300 sec: 44764.4). Total num frames: 1706393600. Throughput: 0: 11389.1. Samples: 426643968. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:05,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:15:07,257][1653645] Updated weights for policy 0, policy_version 833264 (0.0013) [2024-06-15 22:15:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1706557440. Throughput: 0: 11411.9. Samples: 426714624. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:10,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 22:15:11,218][1653645] Updated weights for policy 0, policy_version 833296 (0.0012) [2024-06-15 22:15:13,342][1653645] Updated weights for policy 0, policy_version 833347 (0.0015) [2024-06-15 22:15:15,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 45328.7, 300 sec: 44433.8). 
Total num frames: 1706819584. Throughput: 0: 11185.0. Samples: 426776064. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:15,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:15:17,007][1653645] Updated weights for policy 0, policy_version 833410 (0.0012) [2024-06-15 22:15:18,207][1653645] Updated weights for policy 0, policy_version 833460 (0.0015) [2024-06-15 22:15:20,021][1653645] Updated weights for policy 0, policy_version 833530 (0.0119) [2024-06-15 22:15:20,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1707081728. Throughput: 0: 11298.2. Samples: 426810880. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:15:23,902][1653645] Updated weights for policy 0, policy_version 833587 (0.0013) [2024-06-15 22:15:25,844][1653645] Updated weights for policy 0, policy_version 833633 (0.0038) [2024-06-15 22:15:25,958][1648982] Fps is (10 sec: 45877.2, 60 sec: 45875.3, 300 sec: 44653.3). Total num frames: 1707278336. Throughput: 0: 11173.0. Samples: 426872832. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:25,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 22:15:29,588][1653645] Updated weights for policy 0, policy_version 833696 (0.0013) [2024-06-15 22:15:30,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45331.3, 300 sec: 44986.6). Total num frames: 1707507712. Throughput: 0: 11264.0. Samples: 426942464. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:15:31,529][1653645] Updated weights for policy 0, policy_version 833776 (0.0014) [2024-06-15 22:15:35,315][1653645] Updated weights for policy 0, policy_version 833826 (0.0012) [2024-06-15 22:15:35,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45875.4, 300 sec: 44875.5). Total num frames: 1707737088. Throughput: 0: 11195.7. 
Samples: 426976768. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:35,958][1648982] Avg episode reward: [(0, '37.030')] [2024-06-15 22:15:37,264][1653645] Updated weights for policy 0, policy_version 833872 (0.0075) [2024-06-15 22:15:40,476][1653645] Updated weights for policy 0, policy_version 833936 (0.0016) [2024-06-15 22:15:40,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44782.8, 300 sec: 44653.3). Total num frames: 1707933696. Throughput: 0: 11161.6. Samples: 427046400. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:15:41,940][1653645] Updated weights for policy 0, policy_version 833986 (0.0013) [2024-06-15 22:15:42,711][1651596] Signal inference workers to stop experience collection... (43300 times) [2024-06-15 22:15:42,764][1653645] InferenceWorker_p0-w0: stopping experience collection (43300 times) [2024-06-15 22:15:43,010][1651596] Signal inference workers to resume experience collection... (43300 times) [2024-06-15 22:15:43,011][1653645] InferenceWorker_p0-w0: resuming experience collection (43300 times) [2024-06-15 22:15:43,195][1653645] Updated weights for policy 0, policy_version 834038 (0.0021) [2024-06-15 22:15:45,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.6, 300 sec: 44764.4). Total num frames: 1708130304. Throughput: 0: 11252.6. Samples: 427115520. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:15:47,445][1653645] Updated weights for policy 0, policy_version 834096 (0.0012) [2024-06-15 22:15:50,129][1653645] Updated weights for policy 0, policy_version 834160 (0.0021) [2024-06-15 22:15:50,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45329.0, 300 sec: 44431.3). Total num frames: 1708392448. Throughput: 0: 11229.9. Samples: 427149312. 
Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:15:52,922][1653645] Updated weights for policy 0, policy_version 834224 (0.0012) [2024-06-15 22:15:54,749][1653645] Updated weights for policy 0, policy_version 834304 (0.0208) [2024-06-15 22:15:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 44237.0, 300 sec: 45097.7). Total num frames: 1708654592. Throughput: 0: 10808.9. Samples: 427201024. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:15:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:16:00,385][1653645] Updated weights for policy 0, policy_version 834365 (0.0013) [2024-06-15 22:16:00,958][1648982] Fps is (10 sec: 39320.2, 60 sec: 44236.5, 300 sec: 44431.1). Total num frames: 1708785664. Throughput: 0: 11173.0. Samples: 427278848. Policy #0 lag: (min: 15.0, avg: 139.8, max: 271.0) [2024-06-15 22:16:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:16:03,134][1653645] Updated weights for policy 0, policy_version 834432 (0.0014) [2024-06-15 22:16:04,569][1653645] Updated weights for policy 0, policy_version 834482 (0.0014) [2024-06-15 22:16:05,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 45329.4, 300 sec: 45097.7). Total num frames: 1709113344. Throughput: 0: 11093.3. Samples: 427310080. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:16:06,151][1653645] Updated weights for policy 0, policy_version 834553 (0.0016) [2024-06-15 22:16:10,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.6, 300 sec: 44653.3). Total num frames: 1709178880. Throughput: 0: 11207.1. Samples: 427377152. 
Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:16:11,375][1653645] Updated weights for policy 0, policy_version 834592 (0.0015) [2024-06-15 22:16:13,483][1653645] Updated weights for policy 0, policy_version 834640 (0.0015) [2024-06-15 22:16:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44783.3, 300 sec: 44653.3). Total num frames: 1709506560. Throughput: 0: 11229.9. Samples: 427447808. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:16:15,990][1653645] Updated weights for policy 0, policy_version 834736 (0.0014) [2024-06-15 22:16:17,305][1653645] Updated weights for policy 0, policy_version 834792 (0.0013) [2024-06-15 22:16:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1709703168. Throughput: 0: 11036.4. Samples: 427473408. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:20,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:16:23,703][1653645] Updated weights for policy 0, policy_version 834867 (0.0013) [2024-06-15 22:16:25,023][1653645] Updated weights for policy 0, policy_version 834897 (0.0009) [2024-06-15 22:16:25,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 44653.3). Total num frames: 1709932544. Throughput: 0: 11252.7. Samples: 427552768. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:16:26,562][1651596] Signal inference workers to stop experience collection... (43350 times) [2024-06-15 22:16:26,583][1653645] Updated weights for policy 0, policy_version 834961 (0.0011) [2024-06-15 22:16:26,636][1653645] InferenceWorker_p0-w0: stopping experience collection (43350 times) [2024-06-15 22:16:26,791][1651596] Signal inference workers to resume experience collection... 
(43350 times) [2024-06-15 22:16:26,792][1653645] InferenceWorker_p0-w0: resuming experience collection (43350 times) [2024-06-15 22:16:28,599][1653645] Updated weights for policy 0, policy_version 835040 (0.0125) [2024-06-15 22:16:30,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1710227456. Throughput: 0: 10979.5. Samples: 427609600. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:16:35,180][1653645] Updated weights for policy 0, policy_version 835104 (0.0018) [2024-06-15 22:16:35,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1710358528. Throughput: 0: 11104.7. Samples: 427649024. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:16:36,611][1653645] Updated weights for policy 0, policy_version 835152 (0.0019) [2024-06-15 22:16:37,594][1653645] Updated weights for policy 0, policy_version 835196 (0.0014) [2024-06-15 22:16:40,363][1653645] Updated weights for policy 0, policy_version 835296 (0.0013) [2024-06-15 22:16:40,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 46421.5, 300 sec: 45208.7). Total num frames: 1710718976. Throughput: 0: 11377.8. Samples: 427713024. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:16:45,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 43690.4, 300 sec: 44875.4). Total num frames: 1710751744. Throughput: 0: 11207.2. Samples: 427783168. 
Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:16:46,632][1653645] Updated weights for policy 0, policy_version 835349 (0.0014) [2024-06-15 22:16:48,411][1653645] Updated weights for policy 0, policy_version 835424 (0.0081) [2024-06-15 22:16:50,686][1653645] Updated weights for policy 0, policy_version 835473 (0.0018) [2024-06-15 22:16:50,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 44782.9, 300 sec: 44986.6). Total num frames: 1711079424. Throughput: 0: 11218.4. Samples: 427814912. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:16:52,969][1653645] Updated weights for policy 0, policy_version 835583 (0.0013) [2024-06-15 22:16:55,988][1648982] Fps is (10 sec: 52270.4, 60 sec: 43668.4, 300 sec: 45315.1). Total num frames: 1711276032. Throughput: 0: 11006.3. Samples: 427872768. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:16:55,989][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:16:55,994][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000835584_1711276032.pth... [2024-06-15 22:16:56,049][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000830336_1700528128.pth [2024-06-15 22:17:00,307][1653645] Updated weights for policy 0, policy_version 835665 (0.0075) [2024-06-15 22:17:00,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 45329.4, 300 sec: 45208.7). Total num frames: 1711505408. Throughput: 0: 11161.6. Samples: 427950080. 
Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:17:02,465][1653645] Updated weights for policy 0, policy_version 835744 (0.0013) [2024-06-15 22:17:04,572][1653645] Updated weights for policy 0, policy_version 835830 (0.0013) [2024-06-15 22:17:05,958][1648982] Fps is (10 sec: 52589.9, 60 sec: 44782.9, 300 sec: 45319.8). Total num frames: 1711800320. Throughput: 0: 11150.3. Samples: 427975168. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:17:10,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 44236.9, 300 sec: 44877.4). Total num frames: 1711833088. Throughput: 0: 11138.8. Samples: 428054016. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:17:11,605][1651596] Signal inference workers to stop experience collection... (43400 times) [2024-06-15 22:17:11,683][1653645] Updated weights for policy 0, policy_version 835890 (0.0013) [2024-06-15 22:17:11,765][1653645] InferenceWorker_p0-w0: stopping experience collection (43400 times) [2024-06-15 22:17:11,854][1651596] Signal inference workers to resume experience collection... (43400 times) [2024-06-15 22:17:11,856][1653645] InferenceWorker_p0-w0: resuming experience collection (43400 times) [2024-06-15 22:17:12,697][1653645] Updated weights for policy 0, policy_version 835941 (0.0014) [2024-06-15 22:17:14,201][1653645] Updated weights for policy 0, policy_version 836016 (0.0012) [2024-06-15 22:17:15,720][1653645] Updated weights for policy 0, policy_version 836074 (0.0013) [2024-06-15 22:17:15,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 46421.3, 300 sec: 45430.9). Total num frames: 1712291840. Throughput: 0: 11184.3. Samples: 428112896. 
Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:15,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:17:20,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 43690.8, 300 sec: 45097.7). Total num frames: 1712324608. Throughput: 0: 11195.7. Samples: 428152832. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:17:22,025][1653645] Updated weights for policy 0, policy_version 836128 (0.0011) [2024-06-15 22:17:23,181][1653645] Updated weights for policy 0, policy_version 836183 (0.0092) [2024-06-15 22:17:25,764][1653645] Updated weights for policy 0, policy_version 836272 (0.0012) [2024-06-15 22:17:25,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 45208.7). Total num frames: 1712685056. Throughput: 0: 11286.8. Samples: 428220928. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:17:27,492][1653645] Updated weights for policy 0, policy_version 836345 (0.0015) [2024-06-15 22:17:30,959][1648982] Fps is (10 sec: 52422.1, 60 sec: 43689.8, 300 sec: 45319.6). Total num frames: 1712848896. Throughput: 0: 11320.7. Samples: 428292608. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:30,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:17:33,502][1653645] Updated weights for policy 0, policy_version 836390 (0.0013) [2024-06-15 22:17:34,720][1653645] Updated weights for policy 0, policy_version 836436 (0.0014) [2024-06-15 22:17:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 45320.5). Total num frames: 1713111040. Throughput: 0: 11412.0. Samples: 428328448. 
Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:17:37,466][1653645] Updated weights for policy 0, policy_version 836540 (0.0164) [2024-06-15 22:17:39,128][1653645] Updated weights for policy 0, policy_version 836605 (0.0013) [2024-06-15 22:17:40,958][1648982] Fps is (10 sec: 52435.1, 60 sec: 44236.8, 300 sec: 45319.8). Total num frames: 1713373184. Throughput: 0: 11226.1. Samples: 428377600. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:40,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:17:45,649][1653645] Updated weights for policy 0, policy_version 836664 (0.0025) [2024-06-15 22:17:45,958][1648982] Fps is (10 sec: 39321.2, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1713504256. Throughput: 0: 11457.4. Samples: 428465664. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:45,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:17:47,652][1653645] Updated weights for policy 0, policy_version 836739 (0.0017) [2024-06-15 22:17:49,323][1653645] Updated weights for policy 0, policy_version 836804 (0.0012) [2024-06-15 22:17:49,703][1651596] Signal inference workers to stop experience collection... (43450 times) [2024-06-15 22:17:49,758][1653645] InferenceWorker_p0-w0: stopping experience collection (43450 times) [2024-06-15 22:17:50,038][1651596] Signal inference workers to resume experience collection... (43450 times) [2024-06-15 22:17:50,039][1653645] InferenceWorker_p0-w0: resuming experience collection (43450 times) [2024-06-15 22:17:50,744][1653645] Updated weights for policy 0, policy_version 836861 (0.0100) [2024-06-15 22:17:50,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 46967.6, 300 sec: 45430.9). Total num frames: 1713897472. Throughput: 0: 11480.2. Samples: 428491776. 
Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:50,958][1648982] Avg episode reward: [(0, '37.130')] [2024-06-15 22:17:55,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43712.8, 300 sec: 45097.7). Total num frames: 1713897472. Throughput: 0: 11207.1. Samples: 428558336. Policy #0 lag: (min: 34.0, avg: 173.9, max: 296.0) [2024-06-15 22:17:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:17:58,471][1653645] Updated weights for policy 0, policy_version 836929 (0.0021) [2024-06-15 22:17:59,977][1653645] Updated weights for policy 0, policy_version 836990 (0.0011) [2024-06-15 22:18:00,958][1648982] Fps is (10 sec: 29491.3, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 1714192384. Throughput: 0: 11218.5. Samples: 428617728. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:18:02,394][1653645] Updated weights for policy 0, policy_version 837072 (0.0011) [2024-06-15 22:18:03,485][1653645] Updated weights for policy 0, policy_version 837118 (0.0129) [2024-06-15 22:18:05,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1714421760. Throughput: 0: 10808.9. Samples: 428639232. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:18:10,324][1653645] Updated weights for policy 0, policy_version 837175 (0.0011) [2024-06-15 22:18:10,958][1648982] Fps is (10 sec: 36043.2, 60 sec: 45328.8, 300 sec: 44764.4). Total num frames: 1714552832. Throughput: 0: 11116.0. Samples: 428721152. 
Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:18:11,493][1653645] Updated weights for policy 0, policy_version 837216 (0.0013) [2024-06-15 22:18:12,647][1653645] Updated weights for policy 0, policy_version 837253 (0.0012) [2024-06-15 22:18:13,928][1653645] Updated weights for policy 0, policy_version 837304 (0.0012) [2024-06-15 22:18:15,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44236.9, 300 sec: 45319.8). Total num frames: 1714946048. Throughput: 0: 10695.4. Samples: 428773888. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:18:20,294][1653645] Updated weights for policy 0, policy_version 837378 (0.0014) [2024-06-15 22:18:20,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 44782.6, 300 sec: 44764.5). Total num frames: 1715011584. Throughput: 0: 10899.8. Samples: 428818944. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:20,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:18:21,579][1653645] Updated weights for policy 0, policy_version 837440 (0.0016) [2024-06-15 22:18:24,492][1653645] Updated weights for policy 0, policy_version 837490 (0.0010) [2024-06-15 22:18:25,958][1648982] Fps is (10 sec: 32767.6, 60 sec: 43144.5, 300 sec: 44654.0). Total num frames: 1715273728. Throughput: 0: 11172.9. Samples: 428880384. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:18:26,169][1653645] Updated weights for policy 0, policy_version 837568 (0.0013) [2024-06-15 22:18:27,450][1653645] Updated weights for policy 0, policy_version 837627 (0.0019) [2024-06-15 22:18:30,960][1648982] Fps is (10 sec: 45876.6, 60 sec: 43691.5, 300 sec: 44875.5). Total num frames: 1715470336. Throughput: 0: 10899.9. Samples: 428956160. 
Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:30,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:18:32,720][1653645] Updated weights for policy 0, policy_version 837684 (0.0013) [2024-06-15 22:18:35,570][1653645] Updated weights for policy 0, policy_version 837744 (0.0013) [2024-06-15 22:18:35,695][1651596] Signal inference workers to stop experience collection... (43500 times) [2024-06-15 22:18:35,763][1653645] InferenceWorker_p0-w0: stopping experience collection (43500 times) [2024-06-15 22:18:35,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 44653.4). Total num frames: 1715699712. Throughput: 0: 11013.7. Samples: 428987392. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:18:36,091][1651596] Signal inference workers to resume experience collection... (43500 times) [2024-06-15 22:18:36,092][1653645] InferenceWorker_p0-w0: resuming experience collection (43500 times) [2024-06-15 22:18:37,816][1653645] Updated weights for policy 0, policy_version 837840 (0.0014) [2024-06-15 22:18:38,620][1653645] Updated weights for policy 0, policy_version 837888 (0.0026) [2024-06-15 22:18:40,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 43690.5, 300 sec: 44875.5). Total num frames: 1715994624. Throughput: 0: 10968.2. Samples: 429051904. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:18:43,754][1653645] Updated weights for policy 0, policy_version 837952 (0.0012) [2024-06-15 22:18:45,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1716125696. Throughput: 0: 11400.5. Samples: 429130752. 
Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:18:47,619][1653645] Updated weights for policy 0, policy_version 838032 (0.0013) [2024-06-15 22:18:48,847][1653645] Updated weights for policy 0, policy_version 838082 (0.0014) [2024-06-15 22:18:49,825][1653645] Updated weights for policy 0, policy_version 838144 (0.0011) [2024-06-15 22:18:50,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1716518912. Throughput: 0: 11457.4. Samples: 429154816. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:18:55,050][1653645] Updated weights for policy 0, policy_version 838202 (0.0024) [2024-06-15 22:18:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.2, 300 sec: 44875.5). Total num frames: 1716649984. Throughput: 0: 11343.7. Samples: 429231616. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:18:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:18:55,966][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000838208_1716649984.pth... [2024-06-15 22:18:56,041][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000833024_1706033152.pth [2024-06-15 22:18:58,248][1653645] Updated weights for policy 0, policy_version 838258 (0.0012) [2024-06-15 22:18:59,485][1653645] Updated weights for policy 0, policy_version 838320 (0.0046) [2024-06-15 22:19:00,800][1653645] Updated weights for policy 0, policy_version 838395 (0.0014) [2024-06-15 22:19:00,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 47513.5, 300 sec: 45319.8). Total num frames: 1717043200. Throughput: 0: 11685.0. Samples: 429299712. 
Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:19:05,073][1653645] Updated weights for policy 0, policy_version 838439 (0.0035) [2024-06-15 22:19:05,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 45875.0, 300 sec: 45097.6). Total num frames: 1717174272. Throughput: 0: 11616.8. Samples: 429341696. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:19:08,441][1653645] Updated weights for policy 0, policy_version 838504 (0.0015) [2024-06-15 22:19:10,061][1653645] Updated weights for policy 0, policy_version 838586 (0.0017) [2024-06-15 22:19:10,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 48606.1, 300 sec: 45319.8). Total num frames: 1717469184. Throughput: 0: 11844.3. Samples: 429413376. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:10,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:19:11,631][1653645] Updated weights for policy 0, policy_version 838656 (0.0099) [2024-06-15 22:19:15,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 44782.9, 300 sec: 45097.7). Total num frames: 1717633024. Throughput: 0: 11798.8. Samples: 429487104. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:19:19,034][1651596] Signal inference workers to stop experience collection... (43550 times) [2024-06-15 22:19:19,079][1653645] InferenceWorker_p0-w0: stopping experience collection (43550 times) [2024-06-15 22:19:19,088][1653645] Updated weights for policy 0, policy_version 838726 (0.0146) [2024-06-15 22:19:19,259][1651596] Signal inference workers to resume experience collection... 
(43550 times) [2024-06-15 22:19:19,260][1653645] InferenceWorker_p0-w0: resuming experience collection (43550 times) [2024-06-15 22:19:20,552][1653645] Updated weights for policy 0, policy_version 838791 (0.0025) [2024-06-15 22:19:20,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 47513.8, 300 sec: 45208.7). Total num frames: 1717862400. Throughput: 0: 11798.7. Samples: 429518336. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:20,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:19:21,492][1653645] Updated weights for policy 0, policy_version 838835 (0.0061) [2024-06-15 22:19:23,133][1653645] Updated weights for policy 0, policy_version 838901 (0.0016) [2024-06-15 22:19:25,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 46967.5, 300 sec: 45098.1). Total num frames: 1718091776. Throughput: 0: 11821.6. Samples: 429583872. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:19:26,820][1653645] Updated weights for policy 0, policy_version 838944 (0.0011) [2024-06-15 22:19:30,908][1653645] Updated weights for policy 0, policy_version 839009 (0.0100) [2024-06-15 22:19:30,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 46967.5, 300 sec: 45097.7). Total num frames: 1718288384. Throughput: 0: 11707.8. Samples: 429657600. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:30,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:19:31,918][1653645] Updated weights for policy 0, policy_version 839056 (0.0012) [2024-06-15 22:19:32,834][1653645] Updated weights for policy 0, policy_version 839104 (0.0036) [2024-06-15 22:19:35,200][1653645] Updated weights for policy 0, policy_version 839163 (0.0013) [2024-06-15 22:19:35,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 48605.9, 300 sec: 45319.8). Total num frames: 1718616064. Throughput: 0: 11855.6. Samples: 429688320. 
Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:19:38,361][1653645] Updated weights for policy 0, policy_version 839207 (0.0012) [2024-06-15 22:19:38,924][1653645] Updated weights for policy 0, policy_version 839232 (0.0028) [2024-06-15 22:19:40,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 45875.3, 300 sec: 44875.5). Total num frames: 1718747136. Throughput: 0: 11764.6. Samples: 429761024. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:19:43,667][1653645] Updated weights for policy 0, policy_version 839310 (0.0052) [2024-06-15 22:19:44,555][1653645] Updated weights for policy 0, policy_version 839360 (0.0012) [2024-06-15 22:19:45,966][1648982] Fps is (10 sec: 42561.9, 60 sec: 48599.0, 300 sec: 45318.5). Total num frames: 1719042048. Throughput: 0: 11716.9. Samples: 429827072. Policy #0 lag: (min: 15.0, avg: 72.8, max: 271.0) [2024-06-15 22:19:45,967][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:19:46,532][1653645] Updated weights for policy 0, policy_version 839413 (0.0012) [2024-06-15 22:19:49,750][1653645] Updated weights for policy 0, policy_version 839477 (0.0013) [2024-06-15 22:19:50,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.1, 300 sec: 44986.6). Total num frames: 1719271424. Throughput: 0: 11605.4. Samples: 429863936. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:19:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:19:53,522][1653645] Updated weights for policy 0, policy_version 839536 (0.0012) [2024-06-15 22:19:55,958][1648982] Fps is (10 sec: 45913.2, 60 sec: 47513.5, 300 sec: 45319.8). Total num frames: 1719500800. Throughput: 0: 11514.3. Samples: 429931520. 
Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:19:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:19:57,888][1653645] Updated weights for policy 0, policy_version 839648 (0.0101) [2024-06-15 22:20:00,352][1653645] Updated weights for policy 0, policy_version 839696 (0.0014) [2024-06-15 22:20:00,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 44783.0, 300 sec: 45208.8). Total num frames: 1719730176. Throughput: 0: 11389.2. Samples: 429999616. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:20:04,354][1651596] Signal inference workers to stop experience collection... (43600 times) [2024-06-15 22:20:04,433][1653645] InferenceWorker_p0-w0: stopping experience collection (43600 times) [2024-06-15 22:20:04,455][1653645] Updated weights for policy 0, policy_version 839767 (0.0012) [2024-06-15 22:20:04,568][1651596] Signal inference workers to resume experience collection... (43600 times) [2024-06-15 22:20:04,570][1653645] InferenceWorker_p0-w0: resuming experience collection (43600 times) [2024-06-15 22:20:05,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1719926784. Throughput: 0: 11468.8. Samples: 430034432. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:05,960][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:20:07,120][1653645] Updated weights for policy 0, policy_version 839831 (0.0089) [2024-06-15 22:20:08,768][1653645] Updated weights for policy 0, policy_version 839888 (0.0140) [2024-06-15 22:20:10,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45329.2, 300 sec: 45319.9). Total num frames: 1720188928. Throughput: 0: 11355.0. Samples: 430094848. 
Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:20:12,904][1653645] Updated weights for policy 0, policy_version 839973 (0.0036) [2024-06-15 22:20:15,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1720320000. Throughput: 0: 11320.9. Samples: 430167040. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:20:16,228][1653645] Updated weights for policy 0, policy_version 840020 (0.0013) [2024-06-15 22:20:16,928][1653645] Updated weights for policy 0, policy_version 840063 (0.0013) [2024-06-15 22:20:19,254][1653645] Updated weights for policy 0, policy_version 840128 (0.0015) [2024-06-15 22:20:20,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 46421.4, 300 sec: 45319.8). Total num frames: 1720647680. Throughput: 0: 11480.2. Samples: 430204928. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:20:21,508][1653645] Updated weights for policy 0, policy_version 840192 (0.0012) [2024-06-15 22:20:24,297][1653645] Updated weights for policy 0, policy_version 840242 (0.0012) [2024-06-15 22:20:25,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 45875.3, 300 sec: 45208.7). Total num frames: 1720844288. Throughput: 0: 11389.2. Samples: 430273536. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:25,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 22:20:26,598][1653645] Updated weights for policy 0, policy_version 840273 (0.0026) [2024-06-15 22:20:29,065][1653645] Updated weights for policy 0, policy_version 840322 (0.0013) [2024-06-15 22:20:30,167][1653645] Updated weights for policy 0, policy_version 840377 (0.0012) [2024-06-15 22:20:30,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 47513.6, 300 sec: 45430.9). Total num frames: 1721139200. 
Throughput: 0: 11618.9. Samples: 430349824. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:30,960][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 22:20:31,650][1653645] Updated weights for policy 0, policy_version 840443 (0.0011) [2024-06-15 22:20:34,651][1653645] Updated weights for policy 0, policy_version 840496 (0.0015) [2024-06-15 22:20:35,957][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.3, 300 sec: 45542.0). Total num frames: 1721368576. Throughput: 0: 11616.7. Samples: 430386688. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:20:37,320][1653645] Updated weights for policy 0, policy_version 840528 (0.0014) [2024-06-15 22:20:40,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 46421.2, 300 sec: 45430.8). Total num frames: 1721532416. Throughput: 0: 11605.3. Samples: 430453760. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:40,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:20:41,062][1653645] Updated weights for policy 0, policy_version 840608 (0.0014) [2024-06-15 22:20:42,271][1653645] Updated weights for policy 0, policy_version 840647 (0.0014) [2024-06-15 22:20:43,636][1653645] Updated weights for policy 0, policy_version 840702 (0.0012) [2024-06-15 22:20:45,958][1648982] Fps is (10 sec: 45874.7, 60 sec: 46428.0, 300 sec: 45542.0). Total num frames: 1721827328. Throughput: 0: 11537.1. Samples: 430518784. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:20:46,386][1653645] Updated weights for policy 0, policy_version 840764 (0.0013) [2024-06-15 22:20:49,698][1651596] Signal inference workers to stop experience collection... 
(43650 times) [2024-06-15 22:20:49,750][1653645] InferenceWorker_p0-w0: stopping experience collection (43650 times) [2024-06-15 22:20:49,859][1651596] Signal inference workers to resume experience collection... (43650 times) [2024-06-15 22:20:49,860][1653645] InferenceWorker_p0-w0: resuming experience collection (43650 times) [2024-06-15 22:20:50,130][1653645] Updated weights for policy 0, policy_version 840832 (0.0013) [2024-06-15 22:20:50,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 1722023936. Throughput: 0: 11571.2. Samples: 430555136. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:20:54,195][1653645] Updated weights for policy 0, policy_version 840897 (0.0017) [2024-06-15 22:20:55,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 46421.5, 300 sec: 45764.2). Total num frames: 1722286080. Throughput: 0: 11776.0. Samples: 430624768. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:20:55,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:20:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000840960_1722286080.pth... [2024-06-15 22:20:56,027][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000835584_1711276032.pth [2024-06-15 22:20:56,891][1653645] Updated weights for policy 0, policy_version 840976 (0.0014) [2024-06-15 22:20:58,104][1653645] Updated weights for policy 0, policy_version 841024 (0.0012) [2024-06-15 22:21:00,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 45328.9, 300 sec: 45208.7). Total num frames: 1722449920. Throughput: 0: 11616.7. Samples: 430689792. 
Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:21:00,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:21:01,506][1653645] Updated weights for policy 0, policy_version 841082 (0.0032) [2024-06-15 22:21:05,260][1653645] Updated weights for policy 0, policy_version 841146 (0.0013) [2024-06-15 22:21:05,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 46421.5, 300 sec: 45875.2). Total num frames: 1722712064. Throughput: 0: 11537.1. Samples: 430724096. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:21:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:21:06,948][1653645] Updated weights for policy 0, policy_version 841211 (0.0020) [2024-06-15 22:21:09,480][1653645] Updated weights for policy 0, policy_version 841253 (0.0012) [2024-06-15 22:21:10,958][1648982] Fps is (10 sec: 49152.8, 60 sec: 45875.2, 300 sec: 45542.0). Total num frames: 1722941440. Throughput: 0: 11446.0. Samples: 430788608. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:21:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:21:12,294][1653645] Updated weights for policy 0, policy_version 841299 (0.0014) [2024-06-15 22:21:15,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 45875.3, 300 sec: 45319.8). Total num frames: 1723072512. Throughput: 0: 11320.9. Samples: 430859264. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:21:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:21:16,846][1653645] Updated weights for policy 0, policy_version 841392 (0.0024) [2024-06-15 22:21:18,624][1653645] Updated weights for policy 0, policy_version 841461 (0.0116) [2024-06-15 22:21:20,723][1653645] Updated weights for policy 0, policy_version 841504 (0.0014) [2024-06-15 22:21:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45875.2, 300 sec: 45653.0). Total num frames: 1723400192. Throughput: 0: 11059.2. Samples: 430884352. 
Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:21:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:21:23,872][1653645] Updated weights for policy 0, policy_version 841552 (0.0013) [2024-06-15 22:21:25,958][1648982] Fps is (10 sec: 52427.0, 60 sec: 45874.9, 300 sec: 45319.8). Total num frames: 1723596800. Throughput: 0: 11264.0. Samples: 430960640. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:21:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:21:27,645][1653645] Updated weights for policy 0, policy_version 841619 (0.0015) [2024-06-15 22:21:29,962][1653645] Updated weights for policy 0, policy_version 841697 (0.0011) [2024-06-15 22:21:30,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 45329.1, 300 sec: 45764.1). Total num frames: 1723858944. Throughput: 0: 11138.8. Samples: 431020032. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:21:30,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:21:32,241][1653645] Updated weights for policy 0, policy_version 841747 (0.0013) [2024-06-15 22:21:35,356][1653645] Updated weights for policy 0, policy_version 841793 (0.0012) [2024-06-15 22:21:35,958][1648982] Fps is (10 sec: 42600.0, 60 sec: 44236.7, 300 sec: 45097.7). Total num frames: 1724022784. Throughput: 0: 11218.5. Samples: 431059968. Policy #0 lag: (min: 63.0, avg: 200.3, max: 319.0) [2024-06-15 22:21:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:21:36,279][1651596] Signal inference workers to stop experience collection... (43700 times) [2024-06-15 22:21:36,381][1653645] InferenceWorker_p0-w0: stopping experience collection (43700 times) [2024-06-15 22:21:36,462][1651596] Signal inference workers to resume experience collection... 
(43700 times) [2024-06-15 22:21:36,462][1653645] InferenceWorker_p0-w0: resuming experience collection (43700 times) [2024-06-15 22:21:36,529][1653645] Updated weights for policy 0, policy_version 841844 (0.0014) [2024-06-15 22:21:40,053][1653645] Updated weights for policy 0, policy_version 841889 (0.0013) [2024-06-15 22:21:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 45329.3, 300 sec: 45764.2). Total num frames: 1724252160. Throughput: 0: 11320.9. Samples: 431134208. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:21:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:21:41,586][1653645] Updated weights for policy 0, policy_version 841955 (0.0101) [2024-06-15 22:21:42,190][1653645] Updated weights for policy 0, policy_version 841984 (0.0012) [2024-06-15 22:21:43,922][1653645] Updated weights for policy 0, policy_version 842048 (0.0027) [2024-06-15 22:21:45,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44782.9, 300 sec: 45542.0). Total num frames: 1724514304. Throughput: 0: 11389.2. Samples: 431202304. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:21:45,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:21:47,647][1653645] Updated weights for policy 0, policy_version 842108 (0.0012) [2024-06-15 22:21:50,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.8, 300 sec: 45435.6). Total num frames: 1724678144. Throughput: 0: 11320.9. Samples: 431233536. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:21:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:21:52,321][1653645] Updated weights for policy 0, policy_version 842192 (0.0091) [2024-06-15 22:21:53,304][1653645] Updated weights for policy 0, policy_version 842235 (0.0013) [2024-06-15 22:21:54,737][1653645] Updated weights for policy 0, policy_version 842301 (0.0033) [2024-06-15 22:21:55,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45875.1, 300 sec: 45875.2). Total num frames: 1725038592. 
Throughput: 0: 11389.1. Samples: 431301120. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:21:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:21:58,991][1653645] Updated weights for policy 0, policy_version 842359 (0.0060) [2024-06-15 22:22:00,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45329.2, 300 sec: 45319.8). Total num frames: 1725169664. Throughput: 0: 11377.8. Samples: 431371264. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:22:03,759][1653645] Updated weights for policy 0, policy_version 842434 (0.0035) [2024-06-15 22:22:05,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45329.0, 300 sec: 46097.4). Total num frames: 1725431808. Throughput: 0: 11559.8. Samples: 431404544. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:22:06,435][1653645] Updated weights for policy 0, policy_version 842530 (0.0084) [2024-06-15 22:22:10,621][1653645] Updated weights for policy 0, policy_version 842608 (0.0053) [2024-06-15 22:22:10,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 45875.0, 300 sec: 45430.9). Total num frames: 1725693952. Throughput: 0: 11332.3. Samples: 431470592. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:22:14,785][1653645] Updated weights for policy 0, policy_version 842659 (0.0012) [2024-06-15 22:22:15,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 45875.2, 300 sec: 45764.1). Total num frames: 1725825024. Throughput: 0: 11571.2. Samples: 431540736. 
Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:22:16,145][1653645] Updated weights for policy 0, policy_version 842705 (0.0031) [2024-06-15 22:22:17,773][1651596] Signal inference workers to stop experience collection... (43750 times) [2024-06-15 22:22:17,863][1653645] InferenceWorker_p0-w0: stopping experience collection (43750 times) [2024-06-15 22:22:17,865][1653645] Updated weights for policy 0, policy_version 842793 (0.0012) [2024-06-15 22:22:18,000][1651596] Signal inference workers to resume experience collection... (43750 times) [2024-06-15 22:22:18,000][1653645] InferenceWorker_p0-w0: resuming experience collection (43750 times) [2024-06-15 22:22:20,485][1653645] Updated weights for policy 0, policy_version 842832 (0.0013) [2024-06-15 22:22:20,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 45875.0, 300 sec: 45653.0). Total num frames: 1726152704. Throughput: 0: 11298.1. Samples: 431568384. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:20,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:22:25,958][1648982] Fps is (10 sec: 45874.5, 60 sec: 44783.1, 300 sec: 45542.1). Total num frames: 1726283776. Throughput: 0: 11423.2. Samples: 431648256. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:22:26,242][1653645] Updated weights for policy 0, policy_version 842940 (0.0112) [2024-06-15 22:22:28,134][1653645] Updated weights for policy 0, policy_version 842992 (0.0043) [2024-06-15 22:22:29,798][1653645] Updated weights for policy 0, policy_version 843063 (0.0013) [2024-06-15 22:22:30,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 45875.1, 300 sec: 45764.1). Total num frames: 1726611456. Throughput: 0: 11309.5. Samples: 431711232. 
Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:22:32,060][1653645] Updated weights for policy 0, policy_version 843104 (0.0016) [2024-06-15 22:22:35,958][1648982] Fps is (10 sec: 45875.9, 60 sec: 45329.1, 300 sec: 45319.8). Total num frames: 1726742528. Throughput: 0: 11355.0. Samples: 431744512. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:22:37,705][1653645] Updated weights for policy 0, policy_version 843152 (0.0013) [2024-06-15 22:22:39,616][1653645] Updated weights for policy 0, policy_version 843235 (0.0013) [2024-06-15 22:22:40,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 46421.3, 300 sec: 45875.2). Total num frames: 1727037440. Throughput: 0: 11389.2. Samples: 431813632. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:22:41,084][1653645] Updated weights for policy 0, policy_version 843296 (0.0015) [2024-06-15 22:22:43,745][1653645] Updated weights for policy 0, policy_version 843347 (0.0025) [2024-06-15 22:22:45,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 45875.0, 300 sec: 45319.8). Total num frames: 1727266816. Throughput: 0: 11275.3. Samples: 431878656. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:22:49,883][1653645] Updated weights for policy 0, policy_version 843426 (0.0019) [2024-06-15 22:22:50,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 45875.3, 300 sec: 45875.2). Total num frames: 1727430656. Throughput: 0: 11480.2. Samples: 431921152. 
Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:22:52,060][1653645] Updated weights for policy 0, policy_version 843520 (0.0027) [2024-06-15 22:22:53,389][1653645] Updated weights for policy 0, policy_version 843576 (0.0013) [2024-06-15 22:22:55,920][1653645] Updated weights for policy 0, policy_version 843632 (0.0013) [2024-06-15 22:22:55,958][1648982] Fps is (10 sec: 49152.6, 60 sec: 45329.1, 300 sec: 45986.3). Total num frames: 1727758336. Throughput: 0: 11264.0. Samples: 431977472. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:22:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:22:56,178][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000843648_1727791104.pth... [2024-06-15 22:22:56,250][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000838208_1716649984.pth [2024-06-15 22:23:00,958][1648982] Fps is (10 sec: 36043.2, 60 sec: 43690.4, 300 sec: 45319.7). Total num frames: 1727791104. Throughput: 0: 11354.9. Samples: 432051712. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:23:00,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:23:03,630][1651596] Signal inference workers to stop experience collection... (43800 times) [2024-06-15 22:23:03,672][1653645] InferenceWorker_p0-w0: stopping experience collection (43800 times) [2024-06-15 22:23:03,852][1651596] Signal inference workers to resume experience collection... (43800 times) [2024-06-15 22:23:03,852][1653645] InferenceWorker_p0-w0: resuming experience collection (43800 times) [2024-06-15 22:23:03,855][1653645] Updated weights for policy 0, policy_version 843744 (0.0080) [2024-06-15 22:23:05,208][1653645] Updated weights for policy 0, policy_version 843794 (0.0010) [2024-06-15 22:23:05,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 45329.2, 300 sec: 46097.4). 
Total num frames: 1728151552. Throughput: 0: 11309.6. Samples: 432077312. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:23:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:23:06,133][1653645] Updated weights for policy 0, policy_version 843833 (0.0009) [2024-06-15 22:23:07,596][1653645] Updated weights for policy 0, policy_version 843881 (0.0067) [2024-06-15 22:23:10,958][1648982] Fps is (10 sec: 52430.9, 60 sec: 43690.9, 300 sec: 45319.8). Total num frames: 1728315392. Throughput: 0: 11047.9. Samples: 432145408. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:23:10,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:23:14,327][1653645] Updated weights for policy 0, policy_version 843936 (0.0012) [2024-06-15 22:23:15,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44236.8, 300 sec: 45653.1). Total num frames: 1728479232. Throughput: 0: 11047.9. Samples: 432208384. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:23:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:23:16,022][1653645] Updated weights for policy 0, policy_version 844000 (0.0012) [2024-06-15 22:23:18,063][1653645] Updated weights for policy 0, policy_version 844080 (0.0130) [2024-06-15 22:23:20,163][1653645] Updated weights for policy 0, policy_version 844129 (0.0013) [2024-06-15 22:23:20,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44783.1, 300 sec: 45986.3). Total num frames: 1728839680. Throughput: 0: 10808.9. Samples: 432230912. Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:23:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:23:25,967][1648982] Fps is (10 sec: 36012.8, 60 sec: 42592.2, 300 sec: 45318.5). Total num frames: 1728839680. Throughput: 0: 10840.9. Samples: 432301568. 
Policy #0 lag: (min: 45.0, avg: 154.4, max: 301.0) [2024-06-15 22:23:25,969][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:23:26,889][1653645] Updated weights for policy 0, policy_version 844198 (0.0013) [2024-06-15 22:23:28,366][1653645] Updated weights for policy 0, policy_version 844272 (0.0012) [2024-06-15 22:23:29,503][1653645] Updated weights for policy 0, policy_version 844322 (0.0012) [2024-06-15 22:23:30,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 45875.2). Total num frames: 1729232896. Throughput: 0: 10854.4. Samples: 432367104. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:23:30,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:23:32,061][1653645] Updated weights for policy 0, policy_version 844387 (0.0026) [2024-06-15 22:23:35,958][1648982] Fps is (10 sec: 52473.6, 60 sec: 43690.5, 300 sec: 45319.8). Total num frames: 1729363968. Throughput: 0: 10706.4. Samples: 432402944. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:23:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:23:37,890][1653645] Updated weights for policy 0, policy_version 844448 (0.0017) [2024-06-15 22:23:39,452][1653645] Updated weights for policy 0, policy_version 844516 (0.0013) [2024-06-15 22:23:40,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.6, 300 sec: 45875.2). Total num frames: 1729658880. Throughput: 0: 10945.4. Samples: 432470016. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:23:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:23:41,805][1653645] Updated weights for policy 0, policy_version 844607 (0.0075) [2024-06-15 22:23:44,869][1651596] Signal inference workers to stop experience collection... 
(43850 times) [2024-06-15 22:23:44,926][1653645] InferenceWorker_p0-w0: stopping experience collection (43850 times) [2024-06-15 22:23:44,929][1653645] Updated weights for policy 0, policy_version 844643 (0.0013) [2024-06-15 22:23:45,236][1651596] Signal inference workers to resume experience collection... (43850 times) [2024-06-15 22:23:45,237][1653645] InferenceWorker_p0-w0: resuming experience collection (43850 times) [2024-06-15 22:23:45,958][1648982] Fps is (10 sec: 52430.3, 60 sec: 43690.8, 300 sec: 45319.8). Total num frames: 1729888256. Throughput: 0: 10467.6. Samples: 432522752. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:23:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:23:50,576][1653645] Updated weights for policy 0, policy_version 844720 (0.0017) [2024-06-15 22:23:50,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.4, 300 sec: 45319.8). Total num frames: 1730019328. Throughput: 0: 10774.7. Samples: 432562176. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:23:50,960][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:23:51,939][1653645] Updated weights for policy 0, policy_version 844768 (0.0012) [2024-06-15 22:23:53,498][1653645] Updated weights for policy 0, policy_version 844826 (0.0095) [2024-06-15 22:23:55,752][1653645] Updated weights for policy 0, policy_version 844880 (0.0012) [2024-06-15 22:23:55,958][1648982] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 44986.6). Total num frames: 1730314240. Throughput: 0: 10683.7. Samples: 432626176. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:23:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:23:56,733][1653645] Updated weights for policy 0, policy_version 844922 (0.0016) [2024-06-15 22:24:00,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43691.0, 300 sec: 44875.5). Total num frames: 1730412544. Throughput: 0: 11025.1. Samples: 432704512. 
Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:24:02,012][1653645] Updated weights for policy 0, policy_version 844981 (0.0011) [2024-06-15 22:24:03,371][1653645] Updated weights for policy 0, policy_version 845027 (0.0011) [2024-06-15 22:24:04,524][1653645] Updated weights for policy 0, policy_version 845088 (0.0012) [2024-06-15 22:24:05,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 44236.7, 300 sec: 45208.7). Total num frames: 1730805760. Throughput: 0: 11207.1. Samples: 432735232. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:05,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:24:06,707][1653645] Updated weights for policy 0, policy_version 845136 (0.0013) [2024-06-15 22:24:10,959][1648982] Fps is (10 sec: 52423.4, 60 sec: 43689.9, 300 sec: 45097.5). Total num frames: 1730936832. Throughput: 0: 11061.1. Samples: 432799232. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:24:12,781][1653645] Updated weights for policy 0, policy_version 845203 (0.0014) [2024-06-15 22:24:14,313][1653645] Updated weights for policy 0, policy_version 845249 (0.0012) [2024-06-15 22:24:15,652][1653645] Updated weights for policy 0, policy_version 845312 (0.0117) [2024-06-15 22:24:15,963][1648982] Fps is (10 sec: 42598.8, 60 sec: 45875.1, 300 sec: 45319.8). Total num frames: 1731231744. Throughput: 0: 11173.0. Samples: 432869888. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:15,964][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:24:18,699][1653645] Updated weights for policy 0, policy_version 845392 (0.0024) [2024-06-15 22:24:20,958][1648982] Fps is (10 sec: 52434.0, 60 sec: 43690.7, 300 sec: 45319.8). Total num frames: 1731461120. Throughput: 0: 11002.4. Samples: 432898048. 
Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:24:24,454][1653645] Updated weights for policy 0, policy_version 845457 (0.0014) [2024-06-15 22:24:25,314][1653645] Updated weights for policy 0, policy_version 845504 (0.0012) [2024-06-15 22:24:25,958][1648982] Fps is (10 sec: 36043.7, 60 sec: 45881.7, 300 sec: 45097.6). Total num frames: 1731592192. Throughput: 0: 11195.7. Samples: 432973824. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:25,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:24:27,242][1653645] Updated weights for policy 0, policy_version 845557 (0.0012) [2024-06-15 22:24:29,108][1653645] Updated weights for policy 0, policy_version 845632 (0.0014) [2024-06-15 22:24:29,767][1651596] Signal inference workers to stop experience collection... (43900 times) [2024-06-15 22:24:29,833][1653645] InferenceWorker_p0-w0: stopping experience collection (43900 times) [2024-06-15 22:24:30,003][1651596] Signal inference workers to resume experience collection... (43900 times) [2024-06-15 22:24:30,004][1653645] InferenceWorker_p0-w0: resuming experience collection (43900 times) [2024-06-15 22:24:30,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1731952640. Throughput: 0: 11343.6. Samples: 433033216. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:24:31,209][1653645] Updated weights for policy 0, policy_version 845696 (0.0253) [2024-06-15 22:24:35,958][1648982] Fps is (10 sec: 39322.8, 60 sec: 43690.9, 300 sec: 44875.5). Total num frames: 1731985408. Throughput: 0: 11184.4. Samples: 433065472. 
Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:24:38,725][1653645] Updated weights for policy 0, policy_version 845776 (0.0016) [2024-06-15 22:24:40,623][1653645] Updated weights for policy 0, policy_version 845859 (0.0014) [2024-06-15 22:24:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 44783.0, 300 sec: 45099.0). Total num frames: 1732345856. Throughput: 0: 11412.0. Samples: 433139712. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:24:42,311][1653645] Updated weights for policy 0, policy_version 845922 (0.0012) [2024-06-15 22:24:45,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1732509696. Throughput: 0: 10945.4. Samples: 433197056. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:24:48,595][1653645] Updated weights for policy 0, policy_version 845968 (0.0102) [2024-06-15 22:24:49,649][1653645] Updated weights for policy 0, policy_version 846016 (0.0011) [2024-06-15 22:24:50,958][1648982] Fps is (10 sec: 32767.7, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 1732673536. Throughput: 0: 11104.7. Samples: 433234944. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:50,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:24:52,025][1653645] Updated weights for policy 0, policy_version 846081 (0.0011) [2024-06-15 22:24:53,602][1653645] Updated weights for policy 0, policy_version 846141 (0.0012) [2024-06-15 22:24:55,364][1653645] Updated weights for policy 0, policy_version 846208 (0.0012) [2024-06-15 22:24:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 45328.9, 300 sec: 45097.6). Total num frames: 1733033984. Throughput: 0: 11070.7. Samples: 433297408. 
Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:24:55,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 22:24:55,991][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000846208_1733033984.pth... [2024-06-15 22:24:56,083][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000840960_1722286080.pth [2024-06-15 22:25:00,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 44236.6, 300 sec: 44542.2). Total num frames: 1733066752. Throughput: 0: 11116.0. Samples: 433370112. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:25:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:25:02,442][1653645] Updated weights for policy 0, policy_version 846277 (0.0012) [2024-06-15 22:25:03,703][1653645] Updated weights for policy 0, policy_version 846337 (0.0012) [2024-06-15 22:25:05,716][1653645] Updated weights for policy 0, policy_version 846405 (0.0014) [2024-06-15 22:25:05,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1733459968. Throughput: 0: 11093.3. Samples: 433397248. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:25:05,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:25:10,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 43691.4, 300 sec: 44875.5). Total num frames: 1733558272. Throughput: 0: 10661.0. Samples: 433453568. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:25:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:25:12,585][1653645] Updated weights for policy 0, policy_version 846467 (0.0014) [2024-06-15 22:25:13,824][1653645] Updated weights for policy 0, policy_version 846523 (0.0011) [2024-06-15 22:25:15,034][1651596] Signal inference workers to stop experience collection... 
(43950 times) [2024-06-15 22:25:15,056][1653645] InferenceWorker_p0-w0: stopping experience collection (43950 times) [2024-06-15 22:25:15,288][1651596] Signal inference workers to resume experience collection... (43950 times) [2024-06-15 22:25:15,289][1653645] InferenceWorker_p0-w0: resuming experience collection (43950 times) [2024-06-15 22:25:15,291][1653645] Updated weights for policy 0, policy_version 846576 (0.0012) [2024-06-15 22:25:15,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43144.5, 300 sec: 44653.3). Total num frames: 1733820416. Throughput: 0: 10911.3. Samples: 433524224. Policy #0 lag: (min: 31.0, avg: 96.5, max: 287.0) [2024-06-15 22:25:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:25:16,739][1653645] Updated weights for policy 0, policy_version 846624 (0.0018) [2024-06-15 22:25:18,666][1653645] Updated weights for policy 0, policy_version 846688 (0.0120) [2024-06-15 22:25:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.6, 300 sec: 44875.5). Total num frames: 1734082560. Throughput: 0: 10854.4. Samples: 433553920. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:25:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:25:25,796][1653645] Updated weights for policy 0, policy_version 846768 (0.0014) [2024-06-15 22:25:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43144.7, 300 sec: 44209.0). Total num frames: 1734180864. Throughput: 0: 10831.6. Samples: 433627136. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:25:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:25:28,739][1653645] Updated weights for policy 0, policy_version 846850 (0.0013) [2024-06-15 22:25:30,481][1653645] Updated weights for policy 0, policy_version 846928 (0.0015) [2024-06-15 22:25:30,959][1648982] Fps is (10 sec: 45870.0, 60 sec: 43143.7, 300 sec: 44653.1). Total num frames: 1734541312. Throughput: 0: 10683.5. Samples: 433677824. 
Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:25:30,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:25:35,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1734606848. Throughput: 0: 10695.1. Samples: 433716224. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:25:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:25:37,649][1653645] Updated weights for policy 0, policy_version 847013 (0.0015) [2024-06-15 22:25:39,810][1653645] Updated weights for policy 0, policy_version 847096 (0.0093) [2024-06-15 22:25:40,958][1648982] Fps is (10 sec: 32771.9, 60 sec: 42052.3, 300 sec: 44209.0). Total num frames: 1734868992. Throughput: 0: 10786.2. Samples: 433782784. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:25:40,960][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:25:42,140][1653645] Updated weights for policy 0, policy_version 847171 (0.0014) [2024-06-15 22:25:43,169][1653645] Updated weights for policy 0, policy_version 847229 (0.0068) [2024-06-15 22:25:45,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1735131136. Throughput: 0: 10774.8. Samples: 433854976. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:25:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:25:49,884][1653645] Updated weights for policy 0, policy_version 847296 (0.0014) [2024-06-15 22:25:50,960][1648982] Fps is (10 sec: 45874.9, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1735327744. Throughput: 0: 11070.6. Samples: 433895424. 
Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:25:50,961][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:25:51,478][1653645] Updated weights for policy 0, policy_version 847355 (0.0015) [2024-06-15 22:25:53,296][1653645] Updated weights for policy 0, policy_version 847410 (0.0020) [2024-06-15 22:25:55,053][1653645] Updated weights for policy 0, policy_version 847480 (0.0014) [2024-06-15 22:25:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.9, 300 sec: 44764.5). Total num frames: 1735655424. Throughput: 0: 11002.3. Samples: 433948672. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:25:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:26:00,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43690.9, 300 sec: 43986.9). Total num frames: 1735688192. Throughput: 0: 11207.1. Samples: 434028544. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:00,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:26:00,973][1651596] Signal inference workers to stop experience collection... (44000 times) [2024-06-15 22:26:01,047][1653645] InferenceWorker_p0-w0: stopping experience collection (44000 times) [2024-06-15 22:26:01,049][1653645] Updated weights for policy 0, policy_version 847509 (0.0014) [2024-06-15 22:26:01,266][1651596] Signal inference workers to resume experience collection... (44000 times) [2024-06-15 22:26:01,266][1653645] InferenceWorker_p0-w0: resuming experience collection (44000 times) [2024-06-15 22:26:02,528][1653645] Updated weights for policy 0, policy_version 847572 (0.0073) [2024-06-15 22:26:03,655][1653645] Updated weights for policy 0, policy_version 847632 (0.0014) [2024-06-15 22:26:04,826][1653645] Updated weights for policy 0, policy_version 847680 (0.0013) [2024-06-15 22:26:05,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44236.9, 300 sec: 44653.4). Total num frames: 1736114176. Throughput: 0: 11150.2. Samples: 434055680. 
Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:26:06,458][1653645] Updated weights for policy 0, policy_version 847737 (0.0012) [2024-06-15 22:26:10,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1736179712. Throughput: 0: 11025.1. Samples: 434123264. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:26:12,678][1653645] Updated weights for policy 0, policy_version 847776 (0.0021) [2024-06-15 22:26:14,320][1653645] Updated weights for policy 0, policy_version 847852 (0.0088) [2024-06-15 22:26:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1736507392. Throughput: 0: 11423.6. Samples: 434191872. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:26:16,047][1653645] Updated weights for policy 0, policy_version 847920 (0.0015) [2024-06-15 22:26:17,749][1653645] Updated weights for policy 0, policy_version 847984 (0.0133) [2024-06-15 22:26:20,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1736704000. Throughput: 0: 11195.7. Samples: 434220032. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:20,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:26:24,235][1653645] Updated weights for policy 0, policy_version 848048 (0.0013) [2024-06-15 22:26:25,278][1653645] Updated weights for policy 0, policy_version 848097 (0.0086) [2024-06-15 22:26:25,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 46421.4, 300 sec: 44431.2). Total num frames: 1736966144. Throughput: 0: 11423.3. Samples: 434296832. 
Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:26:26,768][1653645] Updated weights for policy 0, policy_version 848160 (0.0105) [2024-06-15 22:26:29,617][1653645] Updated weights for policy 0, policy_version 848240 (0.0016) [2024-06-15 22:26:30,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 44783.8, 300 sec: 44764.4). Total num frames: 1737228288. Throughput: 0: 11138.8. Samples: 434356224. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:30,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:26:35,610][1653645] Updated weights for policy 0, policy_version 848291 (0.0011) [2024-06-15 22:26:35,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 45329.1, 300 sec: 44320.1). Total num frames: 1737326592. Throughput: 0: 11104.7. Samples: 434395136. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:26:37,259][1653645] Updated weights for policy 0, policy_version 848368 (0.0013) [2024-06-15 22:26:38,636][1653645] Updated weights for policy 0, policy_version 848419 (0.0013) [2024-06-15 22:26:40,716][1651596] Signal inference workers to stop experience collection... (44050 times) [2024-06-15 22:26:40,743][1653645] InferenceWorker_p0-w0: stopping experience collection (44050 times) [2024-06-15 22:26:40,958][1648982] Fps is (10 sec: 39320.5, 60 sec: 45875.0, 300 sec: 44431.1). Total num frames: 1737621504. Throughput: 0: 11332.2. Samples: 434458624. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:40,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:26:40,976][1651596] Signal inference workers to resume experience collection... 
(44050 times) [2024-06-15 22:26:40,977][1653645] InferenceWorker_p0-w0: resuming experience collection (44050 times) [2024-06-15 22:26:41,458][1653645] Updated weights for policy 0, policy_version 848481 (0.0014) [2024-06-15 22:26:45,973][1648982] Fps is (10 sec: 42532.3, 60 sec: 43679.3, 300 sec: 44317.8). Total num frames: 1737752576. Throughput: 0: 11146.4. Samples: 434530304. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:45,974][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:26:47,572][1653645] Updated weights for policy 0, policy_version 848575 (0.0021) [2024-06-15 22:26:49,236][1653645] Updated weights for policy 0, policy_version 848635 (0.0012) [2024-06-15 22:26:50,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 46421.2, 300 sec: 44320.1). Total num frames: 1738113024. Throughput: 0: 11298.1. Samples: 434564096. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:50,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:26:52,432][1653645] Updated weights for policy 0, policy_version 848706 (0.0016) [2024-06-15 22:26:53,370][1653645] Updated weights for policy 0, policy_version 848759 (0.0013) [2024-06-15 22:26:55,958][1648982] Fps is (10 sec: 52510.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1738276864. Throughput: 0: 11184.3. Samples: 434626560. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:26:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:26:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000848768_1738276864.pth... 
[2024-06-15 22:26:56,034][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000843648_1727791104.pth [2024-06-15 22:26:59,400][1653645] Updated weights for policy 0, policy_version 848825 (0.0027) [2024-06-15 22:27:00,930][1653645] Updated weights for policy 0, policy_version 848880 (0.0013) [2024-06-15 22:27:00,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 46967.4, 300 sec: 44320.1). Total num frames: 1738506240. Throughput: 0: 11207.1. Samples: 434696192. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:27:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:27:02,573][1653645] Updated weights for policy 0, policy_version 848944 (0.0023) [2024-06-15 22:27:04,456][1653645] Updated weights for policy 0, policy_version 848992 (0.0013) [2024-06-15 22:27:05,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44782.8, 300 sec: 44431.2). Total num frames: 1738801152. Throughput: 0: 11218.5. Samples: 434724864. Policy #0 lag: (min: 39.0, avg: 164.7, max: 303.0) [2024-06-15 22:27:05,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 22:27:09,181][1653645] Updated weights for policy 0, policy_version 849040 (0.0012) [2024-06-15 22:27:10,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1738932224. Throughput: 0: 11320.9. Samples: 434806272. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:10,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:27:11,264][1653645] Updated weights for policy 0, policy_version 849090 (0.0030) [2024-06-15 22:27:13,109][1653645] Updated weights for policy 0, policy_version 849156 (0.0014) [2024-06-15 22:27:14,218][1653645] Updated weights for policy 0, policy_version 849210 (0.0014) [2024-06-15 22:27:15,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1739259904. Throughput: 0: 11400.5. Samples: 434869248. 
Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:15,960][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:27:16,642][1653645] Updated weights for policy 0, policy_version 849276 (0.0094) [2024-06-15 22:27:20,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1739390976. Throughput: 0: 11343.6. Samples: 434905600. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:27:21,386][1653645] Updated weights for policy 0, policy_version 849340 (0.0014) [2024-06-15 22:27:25,552][1653645] Updated weights for policy 0, policy_version 849424 (0.0086) [2024-06-15 22:27:25,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1739653120. Throughput: 0: 11355.1. Samples: 434969600. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:27:26,815][1653645] Updated weights for policy 0, policy_version 849472 (0.0025) [2024-06-15 22:27:27,404][1651596] Signal inference workers to stop experience collection... (44100 times) [2024-06-15 22:27:27,452][1653645] InferenceWorker_p0-w0: stopping experience collection (44100 times) [2024-06-15 22:27:27,627][1651596] Signal inference workers to resume experience collection... (44100 times) [2024-06-15 22:27:27,628][1653645] InferenceWorker_p0-w0: resuming experience collection (44100 times) [2024-06-15 22:27:28,695][1653645] Updated weights for policy 0, policy_version 849529 (0.0136) [2024-06-15 22:27:30,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1739849728. Throughput: 0: 11165.4. Samples: 435032576. 
Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:27:33,052][1653645] Updated weights for policy 0, policy_version 849589 (0.0013) [2024-06-15 22:27:35,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 1739980800. Throughput: 0: 11184.4. Samples: 435067392. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:27:37,968][1653645] Updated weights for policy 0, policy_version 849696 (0.0135) [2024-06-15 22:27:38,765][1653645] Updated weights for policy 0, policy_version 849728 (0.0012) [2024-06-15 22:27:40,715][1653645] Updated weights for policy 0, policy_version 849786 (0.0013) [2024-06-15 22:27:40,958][1648982] Fps is (10 sec: 52427.4, 60 sec: 45875.2, 300 sec: 44431.2). Total num frames: 1740374016. Throughput: 0: 11150.2. Samples: 435128320. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:40,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:27:44,389][1653645] Updated weights for policy 0, policy_version 849824 (0.0028) [2024-06-15 22:27:45,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45887.0, 300 sec: 44320.1). Total num frames: 1740505088. Throughput: 0: 11298.1. Samples: 435204608. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:45,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 22:27:47,726][1653645] Updated weights for policy 0, policy_version 849878 (0.0011) [2024-06-15 22:27:49,177][1653645] Updated weights for policy 0, policy_version 849942 (0.0014) [2024-06-15 22:27:50,958][1648982] Fps is (10 sec: 39322.6, 60 sec: 44236.9, 300 sec: 44098.0). Total num frames: 1740767232. Throughput: 0: 11389.2. Samples: 435237376. 
Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:50,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:27:52,171][1653645] Updated weights for policy 0, policy_version 850032 (0.0015) [2024-06-15 22:27:55,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1740898304. Throughput: 0: 10877.1. Samples: 435295744. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:27:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:27:57,021][1653645] Updated weights for policy 0, policy_version 850102 (0.0015) [2024-06-15 22:28:00,966][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1741127680. Throughput: 0: 10990.9. Samples: 435363840. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:00,967][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:28:01,068][1653645] Updated weights for policy 0, policy_version 850176 (0.0011) [2024-06-15 22:28:03,655][1653645] Updated weights for policy 0, policy_version 850243 (0.0012) [2024-06-15 22:28:04,862][1653645] Updated weights for policy 0, policy_version 850304 (0.0019) [2024-06-15 22:28:05,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1741422592. Throughput: 0: 10774.8. Samples: 435390464. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:05,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:28:10,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1741553664. Throughput: 0: 10888.5. Samples: 435459584. 
Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:28:11,502][1653645] Updated weights for policy 0, policy_version 850372 (0.0012) [2024-06-15 22:28:13,037][1653645] Updated weights for policy 0, policy_version 850436 (0.0024) [2024-06-15 22:28:13,364][1651596] Signal inference workers to stop experience collection... (44150 times) [2024-06-15 22:28:13,415][1653645] InferenceWorker_p0-w0: stopping experience collection (44150 times) [2024-06-15 22:28:13,664][1651596] Signal inference workers to resume experience collection... (44150 times) [2024-06-15 22:28:13,665][1653645] InferenceWorker_p0-w0: resuming experience collection (44150 times) [2024-06-15 22:28:14,385][1653645] Updated weights for policy 0, policy_version 850492 (0.0012) [2024-06-15 22:28:15,958][1648982] Fps is (10 sec: 42597.3, 60 sec: 43144.4, 300 sec: 44097.9). Total num frames: 1741848576. Throughput: 0: 10945.4. Samples: 435525120. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:15,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:28:16,662][1653645] Updated weights for policy 0, policy_version 850549 (0.0011) [2024-06-15 22:28:20,754][1653645] Updated weights for policy 0, policy_version 850592 (0.0014) [2024-06-15 22:28:20,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 44654.7). Total num frames: 1742012416. Throughput: 0: 10877.2. Samples: 435556864. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:28:21,384][1653645] Updated weights for policy 0, policy_version 850624 (0.0012) [2024-06-15 22:28:24,819][1653645] Updated weights for policy 0, policy_version 850691 (0.0012) [2024-06-15 22:28:25,958][1648982] Fps is (10 sec: 45874.4, 60 sec: 44236.6, 300 sec: 44320.1). Total num frames: 1742307328. Throughput: 0: 11138.8. Samples: 435629568. 
Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:25,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:28:26,964][1653645] Updated weights for policy 0, policy_version 850768 (0.0013) [2024-06-15 22:28:28,067][1653645] Updated weights for policy 0, policy_version 850816 (0.0014) [2024-06-15 22:28:30,960][1648982] Fps is (10 sec: 45865.2, 60 sec: 43689.1, 300 sec: 44430.9). Total num frames: 1742471168. Throughput: 0: 10899.4. Samples: 435695104. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:30,961][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:28:35,189][1653645] Updated weights for policy 0, policy_version 850881 (0.0013) [2024-06-15 22:28:35,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 44782.8, 300 sec: 44097.9). Total num frames: 1742667776. Throughput: 0: 10956.7. Samples: 435730432. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:35,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:28:37,263][1653645] Updated weights for policy 0, policy_version 850963 (0.0013) [2024-06-15 22:28:38,967][1653645] Updated weights for policy 0, policy_version 851026 (0.0016) [2024-06-15 22:28:40,962][1648982] Fps is (10 sec: 52415.5, 60 sec: 43687.4, 300 sec: 44430.5). Total num frames: 1742995456. Throughput: 0: 10921.5. Samples: 435787264. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:40,963][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:28:44,510][1653645] Updated weights for policy 0, policy_version 851073 (0.0035) [2024-06-15 22:28:45,933][1653645] Updated weights for policy 0, policy_version 851132 (0.0012) [2024-06-15 22:28:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 44320.1). Total num frames: 1743093760. Throughput: 0: 10968.1. Samples: 435857408. 
Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:45,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:28:49,272][1653645] Updated weights for policy 0, policy_version 851216 (0.0012) [2024-06-15 22:28:50,958][1648982] Fps is (10 sec: 39340.3, 60 sec: 43690.7, 300 sec: 44320.1). Total num frames: 1743388672. Throughput: 0: 11081.9. Samples: 435889152. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:28:51,509][1653645] Updated weights for policy 0, policy_version 851296 (0.0109) [2024-06-15 22:28:55,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1743519744. Throughput: 0: 10786.1. Samples: 435944960. Policy #0 lag: (min: 7.0, avg: 83.1, max: 263.0) [2024-06-15 22:28:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:28:55,983][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000851328_1743519744.pth... [2024-06-15 22:28:56,029][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000846208_1733033984.pth [2024-06-15 22:28:58,452][1653645] Updated weights for policy 0, policy_version 851364 (0.0013) [2024-06-15 22:29:00,598][1653645] Updated weights for policy 0, policy_version 851424 (0.0132) [2024-06-15 22:29:00,946][1651596] Signal inference workers to stop experience collection... (44200 times) [2024-06-15 22:29:00,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1743749120. Throughput: 0: 10945.5. Samples: 436017664. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:29:01,014][1653645] InferenceWorker_p0-w0: stopping experience collection (44200 times) [2024-06-15 22:29:01,165][1651596] Signal inference workers to resume experience collection... 
(44200 times) [2024-06-15 22:29:01,166][1653645] InferenceWorker_p0-w0: resuming experience collection (44200 times) [2024-06-15 22:29:02,624][1653645] Updated weights for policy 0, policy_version 851520 (0.0014) [2024-06-15 22:29:05,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 43690.7, 300 sec: 44431.3). Total num frames: 1744044032. Throughput: 0: 10774.8. Samples: 436041728. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:29:09,897][1653645] Updated weights for policy 0, policy_version 851585 (0.0061) [2024-06-15 22:29:10,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 42598.2, 300 sec: 43653.6). Total num frames: 1744109568. Throughput: 0: 10774.8. Samples: 436114432. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:10,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:29:11,242][1653645] Updated weights for policy 0, policy_version 851647 (0.0013) [2024-06-15 22:29:13,009][1653645] Updated weights for policy 0, policy_version 851712 (0.0014) [2024-06-15 22:29:15,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.8, 300 sec: 44097.9). Total num frames: 1744470016. Throughput: 0: 10604.6. Samples: 436172288. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:29:16,027][1653645] Updated weights for policy 0, policy_version 851808 (0.0084) [2024-06-15 22:29:20,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 43986.9). Total num frames: 1744568320. Throughput: 0: 10604.1. Samples: 436207616. 
Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:29:22,158][1653645] Updated weights for policy 0, policy_version 851857 (0.0012) [2024-06-15 22:29:23,359][1653645] Updated weights for policy 0, policy_version 851912 (0.0012) [2024-06-15 22:29:24,778][1653645] Updated weights for policy 0, policy_version 851970 (0.0119) [2024-06-15 22:29:25,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 44237.0, 300 sec: 44097.9). Total num frames: 1744961536. Throughput: 0: 10946.6. Samples: 436279808. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:29:27,903][1653645] Updated weights for policy 0, policy_version 852052 (0.0137) [2024-06-15 22:29:30,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 43692.2, 300 sec: 44431.2). Total num frames: 1745092608. Throughput: 0: 10774.8. Samples: 436342272. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:30,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:29:33,501][1653645] Updated weights for policy 0, policy_version 852116 (0.0015) [2024-06-15 22:29:35,541][1653645] Updated weights for policy 0, policy_version 852176 (0.0014) [2024-06-15 22:29:35,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 1745289216. Throughput: 0: 10990.9. Samples: 436383744. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:29:37,094][1653645] Updated weights for policy 0, policy_version 852242 (0.0066) [2024-06-15 22:29:37,964][1653645] Updated weights for policy 0, policy_version 852287 (0.0012) [2024-06-15 22:29:40,958][1648982] Fps is (10 sec: 52426.4, 60 sec: 43693.7, 300 sec: 44431.1). Total num frames: 1745616896. Throughput: 0: 11081.8. Samples: 436443648. 
Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:29:44,614][1653645] Updated weights for policy 0, policy_version 852368 (0.0167) [2024-06-15 22:29:45,331][1653645] Updated weights for policy 0, policy_version 852416 (0.0015) [2024-06-15 22:29:45,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 44237.0, 300 sec: 44320.1). Total num frames: 1745747968. Throughput: 0: 11116.1. Samples: 436517888. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:29:46,684][1651596] Signal inference workers to stop experience collection... (44250 times) [2024-06-15 22:29:46,704][1653645] InferenceWorker_p0-w0: stopping experience collection (44250 times) [2024-06-15 22:29:46,987][1651596] Signal inference workers to resume experience collection... (44250 times) [2024-06-15 22:29:46,988][1653645] InferenceWorker_p0-w0: resuming experience collection (44250 times) [2024-06-15 22:29:48,811][1653645] Updated weights for policy 0, policy_version 852512 (0.0012) [2024-06-15 22:29:50,958][1648982] Fps is (10 sec: 39324.1, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1746010112. Throughput: 0: 11218.5. Samples: 436546560. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:29:51,429][1653645] Updated weights for policy 0, policy_version 852561 (0.0015) [2024-06-15 22:29:55,705][1653645] Updated weights for policy 0, policy_version 852612 (0.0135) [2024-06-15 22:29:55,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1746173952. Throughput: 0: 11093.4. Samples: 436613632. 
Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:29:55,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 22:29:56,687][1653645] Updated weights for policy 0, policy_version 852672 (0.0014) [2024-06-15 22:30:00,404][1653645] Updated weights for policy 0, policy_version 852752 (0.0093) [2024-06-15 22:30:00,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 45329.1, 300 sec: 44098.0). Total num frames: 1746468864. Throughput: 0: 11320.9. Samples: 436681728. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:30:03,684][1653645] Updated weights for policy 0, policy_version 852832 (0.0013) [2024-06-15 22:30:05,958][1648982] Fps is (10 sec: 49149.8, 60 sec: 43690.3, 300 sec: 44431.1). Total num frames: 1746665472. Throughput: 0: 11172.9. Samples: 436710400. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:05,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:30:07,330][1653645] Updated weights for policy 0, policy_version 852887 (0.0013) [2024-06-15 22:30:10,466][1653645] Updated weights for policy 0, policy_version 852944 (0.0040) [2024-06-15 22:30:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 45875.3, 300 sec: 44209.0). Total num frames: 1746862080. Throughput: 0: 11286.7. Samples: 436787712. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:30:12,135][1653645] Updated weights for policy 0, policy_version 853014 (0.0013) [2024-06-15 22:30:12,829][1653645] Updated weights for policy 0, policy_version 853052 (0.0013) [2024-06-15 22:30:15,544][1653645] Updated weights for policy 0, policy_version 853120 (0.0012) [2024-06-15 22:30:15,960][1648982] Fps is (10 sec: 52419.6, 60 sec: 45327.4, 300 sec: 44430.9). Total num frames: 1747189760. Throughput: 0: 11274.9. Samples: 436849664. 
Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:15,961][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:30:19,698][1653645] Updated weights for policy 0, policy_version 853184 (0.0012) [2024-06-15 22:30:20,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 1747320832. Throughput: 0: 11207.1. Samples: 436888064. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:30:23,980][1653645] Updated weights for policy 0, policy_version 853250 (0.0014) [2024-06-15 22:30:25,257][1653645] Updated weights for policy 0, policy_version 853303 (0.0012) [2024-06-15 22:30:25,958][1648982] Fps is (10 sec: 39329.6, 60 sec: 43690.6, 300 sec: 44209.2). Total num frames: 1747582976. Throughput: 0: 11321.0. Samples: 436953088. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:25,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:30:26,731][1653645] Updated weights for policy 0, policy_version 853344 (0.0015) [2024-06-15 22:30:30,475][1651596] Signal inference workers to stop experience collection... (44300 times) [2024-06-15 22:30:30,514][1653645] InferenceWorker_p0-w0: stopping experience collection (44300 times) [2024-06-15 22:30:30,709][1651596] Signal inference workers to resume experience collection... (44300 times) [2024-06-15 22:30:30,711][1653645] InferenceWorker_p0-w0: resuming experience collection (44300 times) [2024-06-15 22:30:30,955][1653645] Updated weights for policy 0, policy_version 853414 (0.0013) [2024-06-15 22:30:30,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 1747779584. Throughput: 0: 11104.7. Samples: 437017600. 
Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:30,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:30:34,443][1653645] Updated weights for policy 0, policy_version 853458 (0.0012) [2024-06-15 22:30:35,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 45329.0, 300 sec: 44542.2). Total num frames: 1748008960. Throughput: 0: 11298.1. Samples: 437054976. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:35,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:30:36,356][1653645] Updated weights for policy 0, policy_version 853537 (0.0032) [2024-06-15 22:30:38,212][1653645] Updated weights for policy 0, policy_version 853587 (0.0014) [2024-06-15 22:30:40,958][1648982] Fps is (10 sec: 45875.3, 60 sec: 43691.1, 300 sec: 44431.2). Total num frames: 1748238336. Throughput: 0: 11241.2. Samples: 437119488. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:30:41,684][1653645] Updated weights for policy 0, policy_version 853664 (0.0011) [2024-06-15 22:30:45,960][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 44320.1). Total num frames: 1748402176. Throughput: 0: 11366.4. Samples: 437193216. Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:45,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:30:45,973][1653645] Updated weights for policy 0, policy_version 853719 (0.0013) [2024-06-15 22:30:47,216][1653645] Updated weights for policy 0, policy_version 853776 (0.0014) [2024-06-15 22:30:49,147][1653645] Updated weights for policy 0, policy_version 853826 (0.0014) [2024-06-15 22:30:50,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 45875.1, 300 sec: 44431.2). Total num frames: 1748762624. Throughput: 0: 11343.7. Samples: 437220864. 
Policy #0 lag: (min: 12.0, avg: 98.5, max: 268.0) [2024-06-15 22:30:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:30:54,135][1653645] Updated weights for policy 0, policy_version 853944 (0.0022) [2024-06-15 22:30:55,958][1648982] Fps is (10 sec: 49151.6, 60 sec: 45329.0, 300 sec: 44764.4). Total num frames: 1748893696. Throughput: 0: 11081.9. Samples: 437286400. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:30:55,958][1648982] Avg episode reward: [(0, '37.220')] [2024-06-15 22:30:56,001][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000853952_1748893696.pth... [2024-06-15 22:30:56,055][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000848768_1738276864.pth [2024-06-15 22:30:56,096][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000853952_1748893696.pth [2024-06-15 22:30:58,741][1653645] Updated weights for policy 0, policy_version 854002 (0.0013) [2024-06-15 22:31:00,663][1653645] Updated weights for policy 0, policy_version 854077 (0.0140) [2024-06-15 22:31:00,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 44782.9, 300 sec: 44209.0). Total num frames: 1749155840. Throughput: 0: 11048.3. Samples: 437346816. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:31:02,689][1653645] Updated weights for policy 0, policy_version 854143 (0.0080) [2024-06-15 22:31:05,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 44783.3, 300 sec: 44653.3). Total num frames: 1749352448. Throughput: 0: 10911.3. Samples: 437379072. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:05,971][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:31:06,467][1653645] Updated weights for policy 0, policy_version 854202 (0.0012) [2024-06-15 22:31:10,958][1648982] Fps is (10 sec: 36045.6, 60 sec: 44237.0, 300 sec: 44098.0). 
Total num frames: 1749516288. Throughput: 0: 11116.2. Samples: 437453312. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:31:11,558][1653645] Updated weights for policy 0, policy_version 854277 (0.0016) [2024-06-15 22:31:14,376][1653645] Updated weights for policy 0, policy_version 854339 (0.0070) [2024-06-15 22:31:15,837][1653645] Updated weights for policy 0, policy_version 854399 (0.0013) [2024-06-15 22:31:15,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 43692.0, 300 sec: 44431.1). Total num frames: 1749811200. Throughput: 0: 10843.0. Samples: 437505536. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:15,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:31:17,742][1651596] Signal inference workers to stop experience collection... (44350 times) [2024-06-15 22:31:17,809][1653645] InferenceWorker_p0-w0: stopping experience collection (44350 times) [2024-06-15 22:31:17,952][1651596] Signal inference workers to resume experience collection... (44350 times) [2024-06-15 22:31:17,990][1653645] InferenceWorker_p0-w0: resuming experience collection (44350 times) [2024-06-15 22:31:18,999][1653645] Updated weights for policy 0, policy_version 854456 (0.0011) [2024-06-15 22:31:20,958][1648982] Fps is (10 sec: 42597.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1749942272. Throughput: 0: 10831.7. Samples: 437542400. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:20,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 22:31:23,883][1653645] Updated weights for policy 0, policy_version 854529 (0.0013) [2024-06-15 22:31:25,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 43986.8). Total num frames: 1750204416. Throughput: 0: 10774.7. Samples: 437604352. 
Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:25,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:31:26,490][1653645] Updated weights for policy 0, policy_version 854595 (0.0016) [2024-06-15 22:31:29,953][1653645] Updated weights for policy 0, policy_version 854676 (0.0013) [2024-06-15 22:31:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44783.0, 300 sec: 44542.3). Total num frames: 1750466560. Throughput: 0: 10581.3. Samples: 437669376. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:31:34,354][1653645] Updated weights for policy 0, policy_version 854736 (0.0022) [2024-06-15 22:31:35,247][1653645] Updated weights for policy 0, policy_version 854780 (0.0017) [2024-06-15 22:31:35,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43690.8, 300 sec: 44098.0). Total num frames: 1750630400. Throughput: 0: 10831.7. Samples: 437708288. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:35,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:31:36,963][1653645] Updated weights for policy 0, policy_version 854844 (0.0027) [2024-06-15 22:31:39,648][1653645] Updated weights for policy 0, policy_version 854905 (0.0123) [2024-06-15 22:31:40,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 44433.5). Total num frames: 1750859776. Throughput: 0: 10683.8. Samples: 437767168. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:31:42,462][1653645] Updated weights for policy 0, policy_version 854968 (0.0017) [2024-06-15 22:31:45,958][1648982] Fps is (10 sec: 36044.4, 60 sec: 43144.5, 300 sec: 43653.7). Total num frames: 1750990848. Throughput: 0: 10979.6. Samples: 437840896. 
Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:31:47,232][1653645] Updated weights for policy 0, policy_version 855029 (0.0012) [2024-06-15 22:31:50,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 43986.9). Total num frames: 1751252992. Throughput: 0: 10820.3. Samples: 437865984. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:31:51,784][1653645] Updated weights for policy 0, policy_version 855120 (0.0012) [2024-06-15 22:31:53,075][1653645] Updated weights for policy 0, policy_version 855168 (0.0011) [2024-06-15 22:31:54,689][1653645] Updated weights for policy 0, policy_version 855225 (0.0014) [2024-06-15 22:31:55,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 44097.9). Total num frames: 1751515136. Throughput: 0: 10547.2. Samples: 437927936. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:31:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:31:59,499][1653645] Updated weights for policy 0, policy_version 855280 (0.0015) [2024-06-15 22:32:00,958][1648982] Fps is (10 sec: 45873.9, 60 sec: 42598.3, 300 sec: 43764.7). Total num frames: 1751711744. Throughput: 0: 10945.5. Samples: 437998080. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:32:01,272][1653645] Updated weights for policy 0, policy_version 855359 (0.0013) [2024-06-15 22:32:04,806][1651596] Signal inference workers to stop experience collection... (44400 times) [2024-06-15 22:32:04,853][1653645] InferenceWorker_p0-w0: stopping experience collection (44400 times) [2024-06-15 22:32:05,093][1651596] Signal inference workers to resume experience collection... 
(44400 times) [2024-06-15 22:32:05,095][1653645] InferenceWorker_p0-w0: resuming experience collection (44400 times) [2024-06-15 22:32:05,195][1653645] Updated weights for policy 0, policy_version 855424 (0.0110) [2024-06-15 22:32:05,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 44098.0). Total num frames: 1751941120. Throughput: 0: 10956.8. Samples: 438035456. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:32:10,817][1653645] Updated weights for policy 0, policy_version 855493 (0.0157) [2024-06-15 22:32:10,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 42052.0, 300 sec: 43320.4). Total num frames: 1752039424. Throughput: 0: 10888.5. Samples: 438094336. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:10,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:32:13,009][1653645] Updated weights for policy 0, policy_version 855589 (0.0013) [2024-06-15 22:32:15,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 42052.5, 300 sec: 43875.8). Total num frames: 1752334336. Throughput: 0: 11047.8. Samples: 438166528. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:32:15,997][1653645] Updated weights for policy 0, policy_version 855648 (0.0091) [2024-06-15 22:32:18,255][1653645] Updated weights for policy 0, policy_version 855738 (0.0023) [2024-06-15 22:32:20,965][1648982] Fps is (10 sec: 52390.3, 60 sec: 43685.1, 300 sec: 43763.6). Total num frames: 1752563712. Throughput: 0: 10590.9. Samples: 438184960. 
Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:20,966][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:32:23,474][1653645] Updated weights for policy 0, policy_version 855779 (0.0027) [2024-06-15 22:32:24,876][1653645] Updated weights for policy 0, policy_version 855842 (0.0122) [2024-06-15 22:32:25,958][1648982] Fps is (10 sec: 49149.9, 60 sec: 43690.5, 300 sec: 43986.8). Total num frames: 1752825856. Throughput: 0: 10877.1. Samples: 438256640. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:32:26,865][1653645] Updated weights for policy 0, policy_version 855875 (0.0011) [2024-06-15 22:32:29,478][1653645] Updated weights for policy 0, policy_version 855953 (0.0013) [2024-06-15 22:32:30,958][1648982] Fps is (10 sec: 52468.7, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1753088000. Throughput: 0: 10740.6. Samples: 438324224. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:32:34,895][1653645] Updated weights for policy 0, policy_version 856016 (0.0015) [2024-06-15 22:32:35,958][1648982] Fps is (10 sec: 39322.5, 60 sec: 43144.4, 300 sec: 43542.6). Total num frames: 1753219072. Throughput: 0: 10990.9. Samples: 438360576. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:35,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:32:36,401][1653645] Updated weights for policy 0, policy_version 856080 (0.0015) [2024-06-15 22:32:37,560][1653645] Updated weights for policy 0, policy_version 856123 (0.0011) [2024-06-15 22:32:40,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1753481216. Throughput: 0: 10945.4. Samples: 438420480. 
Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:32:41,636][1653645] Updated weights for policy 0, policy_version 856196 (0.0013) [2024-06-15 22:32:42,837][1653645] Updated weights for policy 0, policy_version 856252 (0.0012) [2024-06-15 22:32:45,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1753612288. Throughput: 0: 10979.6. Samples: 438492160. Policy #0 lag: (min: 6.0, avg: 122.5, max: 262.0) [2024-06-15 22:32:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:32:48,055][1653645] Updated weights for policy 0, policy_version 856306 (0.0014) [2024-06-15 22:32:49,177][1651596] Signal inference workers to stop experience collection... (44450 times) [2024-06-15 22:32:49,222][1653645] InferenceWorker_p0-w0: stopping experience collection (44450 times) [2024-06-15 22:32:49,513][1651596] Signal inference workers to resume experience collection... (44450 times) [2024-06-15 22:32:49,523][1653645] InferenceWorker_p0-w0: resuming experience collection (44450 times) [2024-06-15 22:32:49,701][1653645] Updated weights for policy 0, policy_version 856379 (0.0013) [2024-06-15 22:32:50,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 1753907200. Throughput: 0: 10752.0. Samples: 438519296. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:32:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:32:51,705][1653645] Updated weights for policy 0, policy_version 856448 (0.0013) [2024-06-15 22:32:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 1754136576. Throughput: 0: 10922.7. Samples: 438585856. 
Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:32:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:32:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000856512_1754136576.pth... [2024-06-15 22:32:56,033][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000851328_1743519744.pth [2024-06-15 22:32:59,084][1653645] Updated weights for policy 0, policy_version 856528 (0.0015) [2024-06-15 22:33:00,414][1653645] Updated weights for policy 0, policy_version 856578 (0.0012) [2024-06-15 22:33:00,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 43764.7). Total num frames: 1754333184. Throughput: 0: 10786.1. Samples: 438651904. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:33:01,531][1653645] Updated weights for policy 0, policy_version 856638 (0.0035) [2024-06-15 22:33:03,063][1653645] Updated weights for policy 0, policy_version 856704 (0.0014) [2024-06-15 22:33:05,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1754529792. Throughput: 0: 11049.7. Samples: 438682112. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:33:06,905][1653645] Updated weights for policy 0, policy_version 856762 (0.0013) [2024-06-15 22:33:10,958][1648982] Fps is (10 sec: 39320.6, 60 sec: 44782.9, 300 sec: 43653.6). Total num frames: 1754726400. Throughput: 0: 11150.3. Samples: 438758400. 
Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:33:10,963][1653645] Updated weights for policy 0, policy_version 856816 (0.0016) [2024-06-15 22:33:13,080][1653645] Updated weights for policy 0, policy_version 856896 (0.0048) [2024-06-15 22:33:14,748][1653645] Updated weights for policy 0, policy_version 856955 (0.0014) [2024-06-15 22:33:15,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 45328.9, 300 sec: 44209.0). Total num frames: 1755054080. Throughput: 0: 11059.1. Samples: 438821888. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:15,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:33:18,702][1653645] Updated weights for policy 0, policy_version 857023 (0.0014) [2024-06-15 22:33:20,958][1648982] Fps is (10 sec: 45876.0, 60 sec: 43696.2, 300 sec: 43653.7). Total num frames: 1755185152. Throughput: 0: 10968.2. Samples: 438854144. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:20,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 22:33:22,629][1653645] Updated weights for policy 0, policy_version 857084 (0.0106) [2024-06-15 22:33:25,463][1653645] Updated weights for policy 0, policy_version 857121 (0.0011) [2024-06-15 22:33:25,958][1648982] Fps is (10 sec: 36045.5, 60 sec: 43144.8, 300 sec: 43876.1). Total num frames: 1755414528. Throughput: 0: 11150.2. Samples: 438922240. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:33:27,808][1653645] Updated weights for policy 0, policy_version 857213 (0.0014) [2024-06-15 22:33:30,672][1653645] Updated weights for policy 0, policy_version 857268 (0.0047) [2024-06-15 22:33:30,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43690.6, 300 sec: 44209.1). Total num frames: 1755709440. Throughput: 0: 10763.4. Samples: 438976512. 
Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:33:34,097][1653645] Updated weights for policy 0, policy_version 857313 (0.0033) [2024-06-15 22:33:35,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 43690.8, 300 sec: 43543.3). Total num frames: 1755840512. Throughput: 0: 11093.4. Samples: 439018496. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:33:37,029][1653645] Updated weights for policy 0, policy_version 857349 (0.0029) [2024-06-15 22:33:37,809][1651596] Signal inference workers to stop experience collection... (44500 times) [2024-06-15 22:33:37,855][1653645] InferenceWorker_p0-w0: stopping experience collection (44500 times) [2024-06-15 22:33:38,130][1651596] Signal inference workers to resume experience collection... (44500 times) [2024-06-15 22:33:38,138][1653645] InferenceWorker_p0-w0: resuming experience collection (44500 times) [2024-06-15 22:33:39,194][1653645] Updated weights for policy 0, policy_version 857440 (0.0013) [2024-06-15 22:33:40,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1756102656. Throughput: 0: 10888.6. Samples: 439075840. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:40,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:33:42,158][1653645] Updated weights for policy 0, policy_version 857520 (0.0097) [2024-06-15 22:33:45,840][1653645] Updated weights for policy 0, policy_version 857573 (0.0020) [2024-06-15 22:33:45,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45329.2, 300 sec: 43875.8). Total num frames: 1756332032. Throughput: 0: 11116.1. Samples: 439152128. 
Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:33:48,901][1653645] Updated weights for policy 0, policy_version 857616 (0.0013) [2024-06-15 22:33:50,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 44236.8, 300 sec: 44209.0). Total num frames: 1756561408. Throughput: 0: 11286.8. Samples: 439190016. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:50,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:33:51,447][1653645] Updated weights for policy 0, policy_version 857728 (0.0116) [2024-06-15 22:33:54,012][1653645] Updated weights for policy 0, policy_version 857784 (0.0021) [2024-06-15 22:33:55,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 43690.8, 300 sec: 44097.9). Total num frames: 1756758016. Throughput: 0: 10820.3. Samples: 439245312. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:33:55,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:33:57,100][1653645] Updated weights for policy 0, policy_version 857813 (0.0124) [2024-06-15 22:34:00,818][1653645] Updated weights for policy 0, policy_version 857875 (0.0014) [2024-06-15 22:34:00,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 43144.3, 300 sec: 43653.6). Total num frames: 1756921856. Throughput: 0: 11173.0. Samples: 439324672. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:34:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:34:02,466][1653645] Updated weights for policy 0, policy_version 857952 (0.0104) [2024-06-15 22:34:04,931][1653645] Updated weights for policy 0, policy_version 858040 (0.0145) [2024-06-15 22:34:05,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 45875.3, 300 sec: 44653.4). Total num frames: 1757282304. Throughput: 0: 11082.0. Samples: 439352832. 
Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:34:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:34:09,112][1653645] Updated weights for policy 0, policy_version 858080 (0.0102) [2024-06-15 22:34:10,958][1648982] Fps is (10 sec: 49152.9, 60 sec: 44783.1, 300 sec: 43875.8). Total num frames: 1757413376. Throughput: 0: 11218.5. Samples: 439427072. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:34:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:34:12,268][1653645] Updated weights for policy 0, policy_version 858144 (0.0014) [2024-06-15 22:34:15,613][1653645] Updated weights for policy 0, policy_version 858241 (0.0132) [2024-06-15 22:34:15,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44236.8, 300 sec: 44542.2). Total num frames: 1757708288. Throughput: 0: 11320.8. Samples: 439485952. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:34:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:34:16,948][1653645] Updated weights for policy 0, policy_version 858299 (0.0157) [2024-06-15 22:34:20,958][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.9, 300 sec: 43653.7). Total num frames: 1757839360. Throughput: 0: 11150.2. Samples: 439520256. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:34:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:34:21,877][1653645] Updated weights for policy 0, policy_version 858368 (0.0108) [2024-06-15 22:34:23,379][1651596] Signal inference workers to stop experience collection... (44550 times) [2024-06-15 22:34:23,457][1653645] InferenceWorker_p0-w0: stopping experience collection (44550 times) [2024-06-15 22:34:23,633][1651596] Signal inference workers to resume experience collection... 
(44550 times) [2024-06-15 22:34:23,634][1653645] InferenceWorker_p0-w0: resuming experience collection (44550 times) [2024-06-15 22:34:24,219][1653645] Updated weights for policy 0, policy_version 858409 (0.0013) [2024-06-15 22:34:25,546][1653645] Updated weights for policy 0, policy_version 858465 (0.0012) [2024-06-15 22:34:25,958][1648982] Fps is (10 sec: 45876.1, 60 sec: 45875.2, 300 sec: 44320.1). Total num frames: 1758167040. Throughput: 0: 11434.7. Samples: 439590400. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:34:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:34:28,253][1653645] Updated weights for policy 0, policy_version 858528 (0.0014) [2024-06-15 22:34:30,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1758330880. Throughput: 0: 11286.7. Samples: 439660032. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:34:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:34:32,140][1653645] Updated weights for policy 0, policy_version 858567 (0.0015) [2024-06-15 22:34:33,403][1653645] Updated weights for policy 0, policy_version 858624 (0.0013) [2024-06-15 22:34:35,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 43987.0). Total num frames: 1758593024. Throughput: 0: 11173.0. Samples: 439692800. Policy #0 lag: (min: 15.0, avg: 88.4, max: 271.0) [2024-06-15 22:34:35,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 22:34:36,224][1653645] Updated weights for policy 0, policy_version 858704 (0.0014) [2024-06-15 22:34:40,340][1653645] Updated weights for policy 0, policy_version 858770 (0.0014) [2024-06-15 22:34:40,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44783.0, 300 sec: 44209.0). Total num frames: 1758789632. Throughput: 0: 11389.2. Samples: 439757824. 
Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:34:40,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:34:41,298][1653645] Updated weights for policy 0, policy_version 858813 (0.0011) [2024-06-15 22:34:44,891][1653645] Updated weights for policy 0, policy_version 858864 (0.0012) [2024-06-15 22:34:45,958][1648982] Fps is (10 sec: 39320.3, 60 sec: 44236.5, 300 sec: 43986.8). Total num frames: 1758986240. Throughput: 0: 11161.6. Samples: 439826944. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:34:45,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:34:47,416][1653645] Updated weights for policy 0, policy_version 858944 (0.0141) [2024-06-15 22:34:49,052][1653645] Updated weights for policy 0, policy_version 859000 (0.0126) [2024-06-15 22:34:50,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 44783.0, 300 sec: 44320.1). Total num frames: 1759248384. Throughput: 0: 11207.1. Samples: 439857152. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:34:50,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:34:51,528][1653645] Updated weights for policy 0, policy_version 859029 (0.0010) [2024-06-15 22:34:55,729][1653645] Updated weights for policy 0, policy_version 859074 (0.0166) [2024-06-15 22:34:55,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 44236.7, 300 sec: 43875.8). Total num frames: 1759412224. Throughput: 0: 11241.2. Samples: 439932928. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:34:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:34:56,324][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000859104_1759444992.pth... 
[2024-06-15 22:34:56,435][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000853952_1748893696.pth [2024-06-15 22:34:57,621][1653645] Updated weights for policy 0, policy_version 859155 (0.0012) [2024-06-15 22:34:59,252][1653645] Updated weights for policy 0, policy_version 859202 (0.0013) [2024-06-15 22:35:00,591][1653645] Updated weights for policy 0, policy_version 859261 (0.0014) [2024-06-15 22:35:00,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 47513.8, 300 sec: 44431.3). Total num frames: 1759772672. Throughput: 0: 11229.9. Samples: 439991296. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:35:04,265][1653645] Updated weights for policy 0, policy_version 859321 (0.0013) [2024-06-15 22:35:05,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43690.4, 300 sec: 44209.0). Total num frames: 1759903744. Throughput: 0: 11252.5. Samples: 440026624. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:05,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:35:08,995][1651596] Signal inference workers to stop experience collection... (44600 times) [2024-06-15 22:35:09,035][1653645] InferenceWorker_p0-w0: stopping experience collection (44600 times) [2024-06-15 22:35:09,268][1651596] Signal inference workers to resume experience collection... (44600 times) [2024-06-15 22:35:09,269][1653645] InferenceWorker_p0-w0: resuming experience collection (44600 times) [2024-06-15 22:35:09,676][1653645] Updated weights for policy 0, policy_version 859392 (0.0013) [2024-06-15 22:35:10,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 45329.1, 300 sec: 43876.1). Total num frames: 1760133120. Throughput: 0: 11127.5. Samples: 440091136. 
Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:35:11,321][1653645] Updated weights for policy 0, policy_version 859452 (0.0014) [2024-06-15 22:35:12,851][1653645] Updated weights for policy 0, policy_version 859510 (0.0012) [2024-06-15 22:35:15,958][1648982] Fps is (10 sec: 45876.4, 60 sec: 44236.9, 300 sec: 44209.0). Total num frames: 1760362496. Throughput: 0: 10968.2. Samples: 440153600. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:35:16,030][1653645] Updated weights for policy 0, policy_version 859568 (0.0100) [2024-06-15 22:35:20,916][1653645] Updated weights for policy 0, policy_version 859616 (0.0012) [2024-06-15 22:35:20,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 44236.7, 300 sec: 43764.7). Total num frames: 1760493568. Throughput: 0: 11059.2. Samples: 440190464. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:35:23,221][1653645] Updated weights for policy 0, policy_version 859700 (0.0014) [2024-06-15 22:35:25,004][1653645] Updated weights for policy 0, policy_version 859776 (0.0013) [2024-06-15 22:35:25,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44236.7, 300 sec: 44209.0). Total num frames: 1760821248. Throughput: 0: 10911.3. Samples: 440248832. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:35:28,692][1653645] Updated weights for policy 0, policy_version 859840 (0.0013) [2024-06-15 22:35:30,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 1760952320. Throughput: 0: 10956.8. Samples: 440320000. 
Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:30,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:35:33,361][1653645] Updated weights for policy 0, policy_version 859898 (0.0013) [2024-06-15 22:35:35,261][1653645] Updated weights for policy 0, policy_version 859954 (0.0012) [2024-06-15 22:35:35,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44236.8, 300 sec: 44098.0). Total num frames: 1761247232. Throughput: 0: 11104.7. Samples: 440356864. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:35:39,360][1653645] Updated weights for policy 0, policy_version 860037 (0.0013) [2024-06-15 22:35:40,608][1653645] Updated weights for policy 0, policy_version 860089 (0.0039) [2024-06-15 22:35:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 44782.8, 300 sec: 44320.1). Total num frames: 1761476608. Throughput: 0: 10717.9. Samples: 440415232. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:40,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:35:44,750][1653645] Updated weights for policy 0, policy_version 860128 (0.0065) [2024-06-15 22:35:45,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43691.0, 300 sec: 43542.6). Total num frames: 1761607680. Throughput: 0: 11150.2. Samples: 440493056. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:35:46,468][1653645] Updated weights for policy 0, policy_version 860192 (0.0027) [2024-06-15 22:35:47,918][1651596] Signal inference workers to stop experience collection... (44650 times) [2024-06-15 22:35:48,008][1653645] InferenceWorker_p0-w0: stopping experience collection (44650 times) [2024-06-15 22:35:48,095][1651596] Signal inference workers to resume experience collection... 
(44650 times) [2024-06-15 22:35:48,097][1653645] InferenceWorker_p0-w0: resuming experience collection (44650 times) [2024-06-15 22:35:48,298][1653645] Updated weights for policy 0, policy_version 860284 (0.0101) [2024-06-15 22:35:50,960][1648982] Fps is (10 sec: 42598.8, 60 sec: 44236.7, 300 sec: 44098.0). Total num frames: 1761902592. Throughput: 0: 10922.7. Samples: 440518144. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:50,961][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:35:51,551][1653645] Updated weights for policy 0, policy_version 860343 (0.0137) [2024-06-15 22:35:55,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 1762000896. Throughput: 0: 11195.7. Samples: 440594944. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:35:55,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:35:56,780][1653645] Updated weights for policy 0, policy_version 860400 (0.0012) [2024-06-15 22:35:58,413][1653645] Updated weights for policy 0, policy_version 860464 (0.0039) [2024-06-15 22:35:59,785][1653645] Updated weights for policy 0, policy_version 860531 (0.0037) [2024-06-15 22:36:00,958][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.7, 300 sec: 44209.0). Total num frames: 1762394112. Throughput: 0: 11059.2. Samples: 440651264. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:36:00,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:36:03,447][1653645] Updated weights for policy 0, policy_version 860592 (0.0012) [2024-06-15 22:36:05,962][1648982] Fps is (10 sec: 52404.9, 60 sec: 43687.5, 300 sec: 44097.2). Total num frames: 1762525184. Throughput: 0: 10978.4. Samples: 440684544. 
Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:36:05,963][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:36:07,639][1653645] Updated weights for policy 0, policy_version 860640 (0.0014) [2024-06-15 22:36:09,283][1653645] Updated weights for policy 0, policy_version 860704 (0.0064) [2024-06-15 22:36:10,863][1653645] Updated weights for policy 0, policy_version 860771 (0.0014) [2024-06-15 22:36:10,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 45329.0, 300 sec: 44209.1). Total num frames: 1762852864. Throughput: 0: 11332.3. Samples: 440758784. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:36:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:36:15,199][1653645] Updated weights for policy 0, policy_version 860848 (0.0193) [2024-06-15 22:36:15,958][1648982] Fps is (10 sec: 52453.4, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1763049472. Throughput: 0: 11116.1. Samples: 440820224. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:36:15,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:36:19,579][1653645] Updated weights for policy 0, policy_version 860897 (0.0012) [2024-06-15 22:36:20,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45875.2, 300 sec: 44209.1). Total num frames: 1763246080. Throughput: 0: 11298.1. Samples: 440865280. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:36:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:36:21,159][1653645] Updated weights for policy 0, policy_version 860989 (0.0014) [2024-06-15 22:36:22,773][1653645] Updated weights for policy 0, policy_version 861047 (0.0015) [2024-06-15 22:36:25,958][1648982] Fps is (10 sec: 42596.2, 60 sec: 44236.5, 300 sec: 44097.9). Total num frames: 1763475456. Throughput: 0: 11332.2. Samples: 440925184. 
Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:36:25,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:36:26,059][1653645] Updated weights for policy 0, policy_version 861078 (0.0013) [2024-06-15 22:36:27,105][1653645] Updated weights for policy 0, policy_version 861120 (0.0020) [2024-06-15 22:36:30,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 1763606528. Throughput: 0: 11320.9. Samples: 441002496. Policy #0 lag: (min: 15.0, avg: 134.2, max: 271.0) [2024-06-15 22:36:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:36:31,948][1653645] Updated weights for policy 0, policy_version 861188 (0.0013) [2024-06-15 22:36:32,251][1651596] Signal inference workers to stop experience collection... (44700 times) [2024-06-15 22:36:32,300][1653645] InferenceWorker_p0-w0: stopping experience collection (44700 times) [2024-06-15 22:36:32,479][1651596] Signal inference workers to resume experience collection... (44700 times) [2024-06-15 22:36:32,481][1653645] InferenceWorker_p0-w0: resuming experience collection (44700 times) [2024-06-15 22:36:33,915][1653645] Updated weights for policy 0, policy_version 861267 (0.0018) [2024-06-15 22:36:35,958][1648982] Fps is (10 sec: 49154.4, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1763966976. Throughput: 0: 11343.7. Samples: 441028608. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:36:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:36:37,609][1653645] Updated weights for policy 0, policy_version 861344 (0.0074) [2024-06-15 22:36:40,958][1648982] Fps is (10 sec: 49151.5, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1764098048. Throughput: 0: 11195.7. Samples: 441098752. 
Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:36:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:36:42,404][1653645] Updated weights for policy 0, policy_version 861392 (0.0038) [2024-06-15 22:36:44,305][1653645] Updated weights for policy 0, policy_version 861479 (0.0012) [2024-06-15 22:36:45,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 46967.4, 300 sec: 44653.3). Total num frames: 1764425728. Throughput: 0: 11400.5. Samples: 441164288. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:36:45,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:36:46,507][1653645] Updated weights for policy 0, policy_version 861558 (0.0016) [2024-06-15 22:36:49,359][1653645] Updated weights for policy 0, policy_version 861616 (0.0033) [2024-06-15 22:36:50,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45329.1, 300 sec: 44431.2). Total num frames: 1764622336. Throughput: 0: 11379.0. Samples: 441196544. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:36:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:36:55,892][1653645] Updated weights for policy 0, policy_version 861680 (0.0013) [2024-06-15 22:36:55,958][1648982] Fps is (10 sec: 29490.6, 60 sec: 45329.0, 300 sec: 44098.0). Total num frames: 1764720640. Throughput: 0: 11275.3. Samples: 441266176. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:36:55,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:36:56,357][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000861696_1764753408.pth... [2024-06-15 22:36:56,499][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000856512_1754136576.pth [2024-06-15 22:36:58,255][1653645] Updated weights for policy 0, policy_version 861763 (0.0013) [2024-06-15 22:37:00,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 43690.5, 300 sec: 44320.1). Total num frames: 1765015552. Throughput: 0: 11036.4. 
Samples: 441316864. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:00,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:37:01,822][1653645] Updated weights for policy 0, policy_version 861856 (0.0013) [2024-06-15 22:37:05,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43694.1, 300 sec: 44431.2). Total num frames: 1765146624. Throughput: 0: 10831.6. Samples: 441352704. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:37:08,063][1653645] Updated weights for policy 0, policy_version 861923 (0.0013) [2024-06-15 22:37:10,193][1653645] Updated weights for policy 0, policy_version 862005 (0.0114) [2024-06-15 22:37:10,958][1648982] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 44431.2). Total num frames: 1765441536. Throughput: 0: 11013.8. Samples: 441420800. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:10,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:37:11,688][1653645] Updated weights for policy 0, policy_version 862074 (0.0012) [2024-06-15 22:37:13,815][1651596] Signal inference workers to stop experience collection... (44750 times) [2024-06-15 22:37:13,844][1653645] InferenceWorker_p0-w0: stopping experience collection (44750 times) [2024-06-15 22:37:14,044][1651596] Signal inference workers to resume experience collection... (44750 times) [2024-06-15 22:37:14,045][1653645] InferenceWorker_p0-w0: resuming experience collection (44750 times) [2024-06-15 22:37:14,233][1653645] Updated weights for policy 0, policy_version 862141 (0.0013) [2024-06-15 22:37:15,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44432.3). Total num frames: 1765670912. Throughput: 0: 10717.9. Samples: 441484800. 
Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:37:20,476][1653645] Updated weights for policy 0, policy_version 862208 (0.0013) [2024-06-15 22:37:20,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43144.4, 300 sec: 44098.0). Total num frames: 1765834752. Throughput: 0: 11025.0. Samples: 441524736. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:20,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:37:21,491][1653645] Updated weights for policy 0, policy_version 862245 (0.0013) [2024-06-15 22:37:25,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43144.7, 300 sec: 43986.8). Total num frames: 1766064128. Throughput: 0: 10649.6. Samples: 441577984. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:25,959][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 22:37:25,975][1653645] Updated weights for policy 0, policy_version 862352 (0.0015) [2024-06-15 22:37:30,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1766195200. Throughput: 0: 10763.4. Samples: 441648640. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:30,958][1648982] Avg episode reward: [(0, '37.120')] [2024-06-15 22:37:31,888][1653645] Updated weights for policy 0, policy_version 862432 (0.0147) [2024-06-15 22:37:33,507][1653645] Updated weights for policy 0, policy_version 862496 (0.0013) [2024-06-15 22:37:35,634][1653645] Updated weights for policy 0, policy_version 862576 (0.0161) [2024-06-15 22:37:35,958][1648982] Fps is (10 sec: 49153.1, 60 sec: 43144.5, 300 sec: 44320.1). Total num frames: 1766555648. Throughput: 0: 10786.1. Samples: 441681920. 
Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:37:38,390][1653645] Updated weights for policy 0, policy_version 862596 (0.0013) [2024-06-15 22:37:40,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1766719488. Throughput: 0: 10513.1. Samples: 441739264. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:40,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:37:43,665][1653645] Updated weights for policy 0, policy_version 862672 (0.0023) [2024-06-15 22:37:45,958][1648982] Fps is (10 sec: 36044.5, 60 sec: 41506.1, 300 sec: 44098.0). Total num frames: 1766916096. Throughput: 0: 10843.1. Samples: 441804800. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:37:46,031][1653645] Updated weights for policy 0, policy_version 862753 (0.0041) [2024-06-15 22:37:47,450][1653645] Updated weights for policy 0, policy_version 862817 (0.0013) [2024-06-15 22:37:50,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 41506.2, 300 sec: 43986.9). Total num frames: 1767112704. Throughput: 0: 10604.1. Samples: 441829888. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:37:51,890][1653645] Updated weights for policy 0, policy_version 862888 (0.0022) [2024-06-15 22:37:52,336][1653645] Updated weights for policy 0, policy_version 862911 (0.0011) [2024-06-15 22:37:55,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 42598.6, 300 sec: 43875.8). Total num frames: 1767276544. Throughput: 0: 10820.3. Samples: 441907712. 
Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:37:55,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:37:56,820][1653645] Updated weights for policy 0, policy_version 862976 (0.0021) [2024-06-15 22:37:58,041][1651596] Signal inference workers to stop experience collection... (44800 times) [2024-06-15 22:37:58,089][1653645] InferenceWorker_p0-w0: stopping experience collection (44800 times) [2024-06-15 22:37:58,212][1651596] Signal inference workers to resume experience collection... (44800 times) [2024-06-15 22:37:58,218][1653645] InferenceWorker_p0-w0: resuming experience collection (44800 times) [2024-06-15 22:37:58,824][1653645] Updated weights for policy 0, policy_version 863072 (0.0012) [2024-06-15 22:38:00,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.8, 300 sec: 44431.2). Total num frames: 1767636992. Throughput: 0: 10740.6. Samples: 441968128. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:38:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:38:02,836][1653645] Updated weights for policy 0, policy_version 863105 (0.0011) [2024-06-15 22:38:04,222][1653645] Updated weights for policy 0, policy_version 863167 (0.0011) [2024-06-15 22:38:05,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 43690.7, 300 sec: 44209.1). Total num frames: 1767768064. Throughput: 0: 10581.4. Samples: 442000896. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:38:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:38:08,797][1653645] Updated weights for policy 0, policy_version 863219 (0.0014) [2024-06-15 22:38:10,747][1653645] Updated weights for policy 0, policy_version 863296 (0.0013) [2024-06-15 22:38:10,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1768030208. Throughput: 0: 10706.5. Samples: 442059776. 
Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:38:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:38:12,171][1653645] Updated weights for policy 0, policy_version 863360 (0.0131) [2024-06-15 22:38:15,958][1648982] Fps is (10 sec: 42597.4, 60 sec: 42052.1, 300 sec: 44097.9). Total num frames: 1768194048. Throughput: 0: 10581.3. Samples: 442124800. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:38:15,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:38:16,748][1653645] Updated weights for policy 0, policy_version 863418 (0.0058) [2024-06-15 22:38:20,958][1648982] Fps is (10 sec: 36045.2, 60 sec: 42598.5, 300 sec: 43986.9). Total num frames: 1768390656. Throughput: 0: 10626.9. Samples: 442160128. Policy #0 lag: (min: 31.0, avg: 88.5, max: 255.0) [2024-06-15 22:38:20,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 22:38:21,439][1653645] Updated weights for policy 0, policy_version 863490 (0.0013) [2024-06-15 22:38:22,803][1653645] Updated weights for policy 0, policy_version 863546 (0.0012) [2024-06-15 22:38:24,359][1653645] Updated weights for policy 0, policy_version 863613 (0.0012) [2024-06-15 22:38:25,960][1648982] Fps is (10 sec: 49152.7, 60 sec: 43690.8, 300 sec: 43986.9). Total num frames: 1768685568. Throughput: 0: 10535.8. Samples: 442213376. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:38:25,961][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:38:28,739][1653645] Updated weights for policy 0, policy_version 863678 (0.0012) [2024-06-15 22:38:30,958][1648982] Fps is (10 sec: 42598.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1768816640. Throughput: 0: 10638.2. Samples: 442283520. 
Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:38:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:38:34,279][1653645] Updated weights for policy 0, policy_version 863760 (0.0014) [2024-06-15 22:38:35,713][1653645] Updated weights for policy 0, policy_version 863824 (0.0011) [2024-06-15 22:38:35,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 44098.0). Total num frames: 1769111552. Throughput: 0: 10865.8. Samples: 442318848. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:38:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:38:36,746][1653645] Updated weights for policy 0, policy_version 863871 (0.0020) [2024-06-15 22:38:40,539][1653645] Updated weights for policy 0, policy_version 863920 (0.0020) [2024-06-15 22:38:40,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43690.9, 300 sec: 44098.0). Total num frames: 1769340928. Throughput: 0: 10558.6. Samples: 442382848. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:38:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:38:44,994][1651596] Signal inference workers to stop experience collection... (44850 times) [2024-06-15 22:38:45,100][1653645] InferenceWorker_p0-w0: stopping experience collection (44850 times) [2024-06-15 22:38:45,266][1651596] Signal inference workers to resume experience collection... (44850 times) [2024-06-15 22:38:45,266][1653645] InferenceWorker_p0-w0: resuming experience collection (44850 times) [2024-06-15 22:38:45,268][1653645] Updated weights for policy 0, policy_version 863968 (0.0012) [2024-06-15 22:38:45,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 1769472000. Throughput: 0: 10729.2. Samples: 442450944. 
Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:38:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:38:46,573][1653645] Updated weights for policy 0, policy_version 864032 (0.0013) [2024-06-15 22:38:47,918][1653645] Updated weights for policy 0, policy_version 864082 (0.0011) [2024-06-15 22:38:50,958][1648982] Fps is (10 sec: 39321.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1769734144. Throughput: 0: 10661.0. Samples: 442480640. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:38:50,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:38:51,385][1653645] Updated weights for policy 0, policy_version 864129 (0.0106) [2024-06-15 22:38:55,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 43875.8). Total num frames: 1769865216. Throughput: 0: 10865.8. Samples: 442548736. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:38:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:38:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000864192_1769865216.pth... [2024-06-15 22:38:56,022][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000859104_1759444992.pth [2024-06-15 22:38:57,058][1653645] Updated weights for policy 0, policy_version 864213 (0.0035) [2024-06-15 22:38:58,411][1653645] Updated weights for policy 0, policy_version 864272 (0.0011) [2024-06-15 22:39:00,120][1653645] Updated weights for policy 0, policy_version 864336 (0.0012) [2024-06-15 22:39:00,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 43764.7). Total num frames: 1770192896. Throughput: 0: 10683.8. Samples: 442605568. 
Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:00,960][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:39:04,232][1653645] Updated weights for policy 0, policy_version 864400 (0.0016) [2024-06-15 22:39:05,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1770389504. Throughput: 0: 10672.3. Samples: 442640384. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:39:10,041][1653645] Updated weights for policy 0, policy_version 864482 (0.0013) [2024-06-15 22:39:10,960][1648982] Fps is (10 sec: 32767.6, 60 sec: 41506.1, 300 sec: 43431.5). Total num frames: 1770520576. Throughput: 0: 11025.1. Samples: 442709504. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:10,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:39:11,650][1653645] Updated weights for policy 0, policy_version 864548 (0.0013) [2024-06-15 22:39:13,320][1653645] Updated weights for policy 0, policy_version 864624 (0.0013) [2024-06-15 22:39:15,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43144.7, 300 sec: 43875.8). Total num frames: 1770782720. Throughput: 0: 10934.0. Samples: 442775552. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:39:16,777][1653645] Updated weights for policy 0, policy_version 864688 (0.0015) [2024-06-15 22:39:20,958][1648982] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 43320.4). Total num frames: 1770946560. Throughput: 0: 10888.6. Samples: 442808832. 
Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:39:21,589][1653645] Updated weights for policy 0, policy_version 864753 (0.0013) [2024-06-15 22:39:23,450][1653645] Updated weights for policy 0, policy_version 864832 (0.0015) [2024-06-15 22:39:24,035][1651596] Signal inference workers to stop experience collection... (44900 times) [2024-06-15 22:39:24,092][1653645] InferenceWorker_p0-w0: stopping experience collection (44900 times) [2024-06-15 22:39:24,276][1651596] Signal inference workers to resume experience collection... (44900 times) [2024-06-15 22:39:24,282][1653645] InferenceWorker_p0-w0: resuming experience collection (44900 times) [2024-06-15 22:39:25,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1771307008. Throughput: 0: 10740.6. Samples: 442866176. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:39:28,327][1653645] Updated weights for policy 0, policy_version 864912 (0.0013) [2024-06-15 22:39:29,170][1653645] Updated weights for policy 0, policy_version 864958 (0.0016) [2024-06-15 22:39:30,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1771438080. Throughput: 0: 10922.7. Samples: 442942464. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:30,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:39:33,452][1653645] Updated weights for policy 0, policy_version 865008 (0.0012) [2024-06-15 22:39:35,176][1653645] Updated weights for policy 0, policy_version 865075 (0.0016) [2024-06-15 22:39:35,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1771732992. Throughput: 0: 11036.4. Samples: 442977280. 
Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:35,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 22:39:36,575][1653645] Updated weights for policy 0, policy_version 865149 (0.0014) [2024-06-15 22:39:40,861][1653645] Updated weights for policy 0, policy_version 865203 (0.0031) [2024-06-15 22:39:40,958][1648982] Fps is (10 sec: 49151.3, 60 sec: 43144.4, 300 sec: 43875.8). Total num frames: 1771929600. Throughput: 0: 11047.8. Samples: 443045888. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:40,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:39:43,809][1653645] Updated weights for policy 0, policy_version 865238 (0.0029) [2024-06-15 22:39:45,408][1653645] Updated weights for policy 0, policy_version 865303 (0.0012) [2024-06-15 22:39:45,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 44782.9, 300 sec: 43764.7). Total num frames: 1772158976. Throughput: 0: 11241.2. Samples: 443111424. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:45,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:39:47,086][1653645] Updated weights for policy 0, policy_version 865376 (0.0015) [2024-06-15 22:39:50,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1772355584. Throughput: 0: 11173.0. Samples: 443143168. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:39:51,800][1653645] Updated weights for policy 0, policy_version 865440 (0.0015) [2024-06-15 22:39:55,958][1648982] Fps is (10 sec: 42598.7, 60 sec: 45329.1, 300 sec: 43431.5). Total num frames: 1772584960. Throughput: 0: 11320.9. Samples: 443218944. 
Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:39:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:39:56,731][1653645] Updated weights for policy 0, policy_version 865552 (0.0078) [2024-06-15 22:39:59,108][1653645] Updated weights for policy 0, policy_version 865626 (0.0013) [2024-06-15 22:39:59,858][1653645] Updated weights for policy 0, policy_version 865664 (0.0011) [2024-06-15 22:40:00,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 44783.0, 300 sec: 43987.0). Total num frames: 1772879872. Throughput: 0: 11082.0. Samples: 443274240. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:40:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:40:04,061][1653645] Updated weights for policy 0, policy_version 865716 (0.0014) [2024-06-15 22:40:05,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 43690.8, 300 sec: 43653.6). Total num frames: 1773010944. Throughput: 0: 11298.1. Samples: 443317248. Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:40:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:40:06,628][1653645] Updated weights for policy 0, policy_version 865746 (0.0016) [2024-06-15 22:40:08,024][1651596] Signal inference workers to stop experience collection... (44950 times) [2024-06-15 22:40:08,054][1653645] InferenceWorker_p0-w0: stopping experience collection (44950 times) [2024-06-15 22:40:08,240][1651596] Signal inference workers to resume experience collection... (44950 times) [2024-06-15 22:40:08,246][1653645] InferenceWorker_p0-w0: resuming experience collection (44950 times) [2024-06-15 22:40:09,051][1653645] Updated weights for policy 0, policy_version 865856 (0.0014) [2024-06-15 22:40:10,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 45875.2, 300 sec: 43764.7). Total num frames: 1773273088. Throughput: 0: 11309.5. Samples: 443375104. 
Policy #0 lag: (min: 69.0, avg: 157.8, max: 309.0) [2024-06-15 22:40:10,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:40:15,495][1653645] Updated weights for policy 0, policy_version 865938 (0.0014) [2024-06-15 22:40:15,958][1648982] Fps is (10 sec: 45875.8, 60 sec: 44783.0, 300 sec: 43986.9). Total num frames: 1773469696. Throughput: 0: 11195.7. Samples: 443446272. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:40:16,507][1653645] Updated weights for policy 0, policy_version 865983 (0.0012) [2024-06-15 22:40:20,105][1653645] Updated weights for policy 0, policy_version 866048 (0.0150) [2024-06-15 22:40:20,958][1648982] Fps is (10 sec: 45874.9, 60 sec: 46421.2, 300 sec: 43764.7). Total num frames: 1773731840. Throughput: 0: 11195.7. Samples: 443481088. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:40:21,323][1653645] Updated weights for policy 0, policy_version 866110 (0.0125) [2024-06-15 22:40:23,905][1653645] Updated weights for policy 0, policy_version 866172 (0.0012) [2024-06-15 22:40:25,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1773928448. Throughput: 0: 10956.8. Samples: 443538944. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:40:27,469][1653645] Updated weights for policy 0, policy_version 866224 (0.0014) [2024-06-15 22:40:30,958][1648982] Fps is (10 sec: 32768.5, 60 sec: 43690.6, 300 sec: 43431.5). Total num frames: 1774059520. Throughput: 0: 11093.4. Samples: 443610624. 
Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:30,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:40:31,432][1653645] Updated weights for policy 0, policy_version 866274 (0.0011) [2024-06-15 22:40:33,530][1653645] Updated weights for policy 0, policy_version 866364 (0.0012) [2024-06-15 22:40:35,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 44236.8, 300 sec: 43764.7). Total num frames: 1774387200. Throughput: 0: 10968.2. Samples: 443636736. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:35,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:40:36,363][1653645] Updated weights for policy 0, policy_version 866416 (0.0019) [2024-06-15 22:40:38,263][1653645] Updated weights for policy 0, policy_version 866455 (0.0074) [2024-06-15 22:40:40,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 44236.8, 300 sec: 43986.9). Total num frames: 1774583808. Throughput: 0: 10877.2. Samples: 443708416. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:40:42,666][1653645] Updated weights for policy 0, policy_version 866500 (0.0012) [2024-06-15 22:40:44,028][1653645] Updated weights for policy 0, policy_version 866560 (0.0012) [2024-06-15 22:40:45,958][1648982] Fps is (10 sec: 45875.7, 60 sec: 44783.0, 300 sec: 43875.8). Total num frames: 1774845952. Throughput: 0: 11047.8. Samples: 443771392. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:40:47,309][1653645] Updated weights for policy 0, policy_version 866640 (0.0014) [2024-06-15 22:40:50,196][1653645] Updated weights for policy 0, policy_version 866704 (0.0011) [2024-06-15 22:40:50,957][1648982] Fps is (10 sec: 45875.7, 60 sec: 44783.0, 300 sec: 44209.1). Total num frames: 1775042560. Throughput: 0: 10843.1. Samples: 443805184. 
Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:50,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:40:54,909][1653645] Updated weights for policy 0, policy_version 866772 (0.0013) [2024-06-15 22:40:55,508][1651596] Signal inference workers to stop experience collection... (45000 times) [2024-06-15 22:40:55,557][1653645] InferenceWorker_p0-w0: stopping experience collection (45000 times) [2024-06-15 22:40:55,774][1651596] Signal inference workers to resume experience collection... (45000 times) [2024-06-15 22:40:55,775][1653645] InferenceWorker_p0-w0: resuming experience collection (45000 times) [2024-06-15 22:40:55,958][1648982] Fps is (10 sec: 39320.0, 60 sec: 44236.6, 300 sec: 43542.5). Total num frames: 1775239168. Throughput: 0: 11207.0. Samples: 443879424. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:40:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:40:56,353][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000866848_1775304704.pth... [2024-06-15 22:40:56,353][1653645] Updated weights for policy 0, policy_version 866848 (0.0012) [2024-06-15 22:40:56,543][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000861696_1764753408.pth [2024-06-15 22:40:59,611][1653645] Updated weights for policy 0, policy_version 866912 (0.0013) [2024-06-15 22:41:00,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 43987.6). Total num frames: 1775501312. Throughput: 0: 10865.8. Samples: 443935232. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:41:02,955][1653645] Updated weights for policy 0, policy_version 866992 (0.0236) [2024-06-15 22:41:05,958][1648982] Fps is (10 sec: 39323.1, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 1775632384. Throughput: 0: 10877.2. Samples: 443970560. 
Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:41:06,984][1653645] Updated weights for policy 0, policy_version 867041 (0.0012) [2024-06-15 22:41:09,312][1653645] Updated weights for policy 0, policy_version 867135 (0.0013) [2024-06-15 22:41:10,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 43690.6, 300 sec: 43542.5). Total num frames: 1775894528. Throughput: 0: 10968.2. Samples: 444032512. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:41:12,611][1653645] Updated weights for policy 0, policy_version 867194 (0.0112) [2024-06-15 22:41:15,484][1653645] Updated weights for policy 0, policy_version 867248 (0.0013) [2024-06-15 22:41:15,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44782.8, 300 sec: 43764.7). Total num frames: 1776156672. Throughput: 0: 10797.5. Samples: 444096512. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:15,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:41:20,277][1653645] Updated weights for policy 0, policy_version 867328 (0.0014) [2024-06-15 22:41:20,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 43542.6). Total num frames: 1776320512. Throughput: 0: 11036.5. Samples: 444133376. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:41:23,613][1653645] Updated weights for policy 0, policy_version 867395 (0.0013) [2024-06-15 22:41:25,958][1648982] Fps is (10 sec: 39321.7, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1776549888. Throughput: 0: 10729.2. Samples: 444191232. 
Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:25,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 22:41:27,115][1653645] Updated weights for policy 0, policy_version 867477 (0.0012) [2024-06-15 22:41:28,027][1653645] Updated weights for policy 0, policy_version 867520 (0.0016) [2024-06-15 22:41:30,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 1776680960. Throughput: 0: 11047.8. Samples: 444268544. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:41:32,924][1653645] Updated weights for policy 0, policy_version 867600 (0.0097) [2024-06-15 22:41:35,501][1653645] Updated weights for policy 0, policy_version 867650 (0.0070) [2024-06-15 22:41:35,957][1648982] Fps is (10 sec: 45875.9, 60 sec: 43690.8, 300 sec: 43764.8). Total num frames: 1777008640. Throughput: 0: 10774.8. Samples: 444290048. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:41:36,661][1653645] Updated weights for policy 0, policy_version 867712 (0.0123) [2024-06-15 22:41:39,924][1653645] Updated weights for policy 0, policy_version 867769 (0.0016) [2024-06-15 22:41:40,958][1648982] Fps is (10 sec: 52426.9, 60 sec: 43690.5, 300 sec: 43320.4). Total num frames: 1777205248. Throughput: 0: 10604.1. Samples: 444356608. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:40,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:41:43,203][1651596] Signal inference workers to stop experience collection... (45050 times) [2024-06-15 22:41:43,264][1653645] InferenceWorker_p0-w0: stopping experience collection (45050 times) [2024-06-15 22:41:43,440][1651596] Signal inference workers to resume experience collection... 
(45050 times) [2024-06-15 22:41:43,441][1653645] InferenceWorker_p0-w0: resuming experience collection (45050 times) [2024-06-15 22:41:43,998][1653645] Updated weights for policy 0, policy_version 867828 (0.0013) [2024-06-15 22:41:45,267][1653645] Updated weights for policy 0, policy_version 867898 (0.0013) [2024-06-15 22:41:45,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1777467392. Throughput: 0: 10843.0. Samples: 444423168. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:41:47,886][1653645] Updated weights for policy 0, policy_version 867952 (0.0014) [2024-06-15 22:41:50,958][1648982] Fps is (10 sec: 42599.3, 60 sec: 43144.4, 300 sec: 43764.7). Total num frames: 1777631232. Throughput: 0: 10786.1. Samples: 444455936. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:50,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:41:51,349][1653645] Updated weights for policy 0, policy_version 868003 (0.0103) [2024-06-15 22:41:55,664][1653645] Updated weights for policy 0, policy_version 868080 (0.0012) [2024-06-15 22:41:55,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43144.8, 300 sec: 43431.5). Total num frames: 1777827840. Throughput: 0: 11036.5. Samples: 444529152. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:41:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:41:57,421][1653645] Updated weights for policy 0, policy_version 868154 (0.0014) [2024-06-15 22:42:00,044][1653645] Updated weights for policy 0, policy_version 868220 (0.0012) [2024-06-15 22:42:00,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1778122752. Throughput: 0: 10945.4. Samples: 444589056. 
Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:42:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:42:03,719][1653645] Updated weights for policy 0, policy_version 868272 (0.0011) [2024-06-15 22:42:05,958][1648982] Fps is (10 sec: 42598.6, 60 sec: 43690.7, 300 sec: 43431.5). Total num frames: 1778253824. Throughput: 0: 10911.3. Samples: 444624384. Policy #0 lag: (min: 15.0, avg: 113.0, max: 271.0) [2024-06-15 22:42:05,975][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:42:07,109][1653645] Updated weights for policy 0, policy_version 868320 (0.0012) [2024-06-15 22:42:08,827][1653645] Updated weights for policy 0, policy_version 868385 (0.0015) [2024-06-15 22:42:10,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1778515968. Throughput: 0: 11059.2. Samples: 444688896. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:42:11,069][1653645] Updated weights for policy 0, policy_version 868432 (0.0022) [2024-06-15 22:42:14,764][1653645] Updated weights for policy 0, policy_version 868496 (0.0014) [2024-06-15 22:42:15,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1778778112. Throughput: 0: 10843.0. Samples: 444756480. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:42:18,969][1653645] Updated weights for policy 0, policy_version 868592 (0.0075) [2024-06-15 22:42:20,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 45329.0, 300 sec: 43986.9). Total num frames: 1779040256. Throughput: 0: 11150.2. Samples: 444791808. 
Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:42:23,467][1653645] Updated weights for policy 0, policy_version 868673 (0.0015) [2024-06-15 22:42:25,041][1653645] Updated weights for policy 0, policy_version 868736 (0.0012) [2024-06-15 22:42:25,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1779171328. Throughput: 0: 11025.1. Samples: 444852736. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:42:28,052][1653645] Updated weights for policy 0, policy_version 868792 (0.0127) [2024-06-15 22:42:29,475][1651596] Signal inference workers to stop experience collection... (45100 times) [2024-06-15 22:42:29,530][1653645] InferenceWorker_p0-w0: stopping experience collection (45100 times) [2024-06-15 22:42:29,817][1651596] Signal inference workers to resume experience collection... (45100 times) [2024-06-15 22:42:29,819][1653645] InferenceWorker_p0-w0: resuming experience collection (45100 times) [2024-06-15 22:42:30,523][1653645] Updated weights for policy 0, policy_version 868848 (0.0037) [2024-06-15 22:42:30,957][1648982] Fps is (10 sec: 39322.3, 60 sec: 45875.2, 300 sec: 43653.7). Total num frames: 1779433472. Throughput: 0: 10968.2. Samples: 444916736. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:42:32,321][1653645] Updated weights for policy 0, policy_version 868883 (0.0017) [2024-06-15 22:42:35,985][1648982] Fps is (10 sec: 42481.5, 60 sec: 43124.7, 300 sec: 43649.6). Total num frames: 1779597312. Throughput: 0: 11075.2. Samples: 444954624. 
Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:35,986][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:42:36,128][1653645] Updated weights for policy 0, policy_version 868945 (0.0014) [2024-06-15 22:42:37,164][1653645] Updated weights for policy 0, policy_version 868992 (0.0013) [2024-06-15 22:42:39,765][1653645] Updated weights for policy 0, policy_version 869056 (0.0013) [2024-06-15 22:42:40,958][1648982] Fps is (10 sec: 39317.9, 60 sec: 43690.3, 300 sec: 43764.6). Total num frames: 1779826688. Throughput: 0: 10888.3. Samples: 445019136. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:40,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:42:42,355][1653645] Updated weights for policy 0, policy_version 869104 (0.0032) [2024-06-15 22:42:43,716][1653645] Updated weights for policy 0, policy_version 869136 (0.0014) [2024-06-15 22:42:45,978][1648982] Fps is (10 sec: 49186.5, 60 sec: 43675.7, 300 sec: 43983.8). Total num frames: 1780088832. Throughput: 0: 11042.8. Samples: 445086208. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:45,979][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:42:47,445][1653645] Updated weights for policy 0, policy_version 869200 (0.0023) [2024-06-15 22:42:48,510][1653645] Updated weights for policy 0, policy_version 869248 (0.0013) [2024-06-15 22:42:50,970][1648982] Fps is (10 sec: 49093.9, 60 sec: 44773.6, 300 sec: 44207.1). Total num frames: 1780318208. Throughput: 0: 11147.1. Samples: 445126144. 
Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:50,971][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:42:51,121][1653645] Updated weights for policy 0, policy_version 869312 (0.0013) [2024-06-15 22:42:53,187][1653645] Updated weights for policy 0, policy_version 869369 (0.0013) [2024-06-15 22:42:55,746][1653645] Updated weights for policy 0, policy_version 869433 (0.0016) [2024-06-15 22:42:55,958][1648982] Fps is (10 sec: 52531.6, 60 sec: 46420.6, 300 sec: 43986.7). Total num frames: 1780613120. Throughput: 0: 11184.1. Samples: 445192192. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:42:55,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:42:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000869440_1780613120.pth... [2024-06-15 22:42:56,073][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000864192_1769865216.pth [2024-06-15 22:43:00,027][1653645] Updated weights for policy 0, policy_version 869502 (0.0250) [2024-06-15 22:43:00,958][1648982] Fps is (10 sec: 42652.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1780744192. Throughput: 0: 11252.6. Samples: 445262848. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:43:02,456][1653645] Updated weights for policy 0, policy_version 869564 (0.0013) [2024-06-15 22:43:04,916][1653645] Updated weights for policy 0, policy_version 869624 (0.0015) [2024-06-15 22:43:05,958][1648982] Fps is (10 sec: 42602.4, 60 sec: 46421.3, 300 sec: 44098.0). Total num frames: 1781039104. Throughput: 0: 11184.4. Samples: 445295104. 
Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:43:06,528][1653645] Updated weights for policy 0, policy_version 869680 (0.0017) [2024-06-15 22:43:10,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1781137408. Throughput: 0: 11355.0. Samples: 445363712. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:10,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:43:12,056][1653645] Updated weights for policy 0, policy_version 869731 (0.0013) [2024-06-15 22:43:12,831][1653645] Updated weights for policy 0, policy_version 869763 (0.0115) [2024-06-15 22:43:14,091][1653645] Updated weights for policy 0, policy_version 869821 (0.0201) [2024-06-15 22:43:15,909][1651596] Signal inference workers to stop experience collection... (45150 times) [2024-06-15 22:43:15,966][1648982] Fps is (10 sec: 39287.9, 60 sec: 44230.5, 300 sec: 44207.7). Total num frames: 1781432320. Throughput: 0: 11364.2. Samples: 445428224. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:15,967][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:43:15,982][1653645] InferenceWorker_p0-w0: stopping experience collection (45150 times) [2024-06-15 22:43:16,147][1651596] Signal inference workers to resume experience collection... (45150 times) [2024-06-15 22:43:16,148][1653645] InferenceWorker_p0-w0: resuming experience collection (45150 times) [2024-06-15 22:43:18,432][1653645] Updated weights for policy 0, policy_version 869920 (0.0086) [2024-06-15 22:43:20,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1781661696. Throughput: 0: 11100.1. Samples: 445453824. 
Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:20,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:43:23,884][1653645] Updated weights for policy 0, policy_version 869972 (0.0011) [2024-06-15 22:43:25,961][1648982] Fps is (10 sec: 42621.9, 60 sec: 44780.6, 300 sec: 44208.6). Total num frames: 1781858304. Throughput: 0: 11343.1. Samples: 445529600. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:25,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:43:26,006][1653645] Updated weights for policy 0, policy_version 870064 (0.0017) [2024-06-15 22:43:28,385][1653645] Updated weights for policy 0, policy_version 870128 (0.0014) [2024-06-15 22:43:30,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45329.0, 300 sec: 44209.0). Total num frames: 1782153216. Throughput: 0: 11143.9. Samples: 445587456. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:30,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 22:43:31,087][1653645] Updated weights for policy 0, policy_version 870196 (0.0162) [2024-06-15 22:43:35,958][1648982] Fps is (10 sec: 36055.8, 60 sec: 43710.7, 300 sec: 43653.6). Total num frames: 1782218752. Throughput: 0: 11039.6. Samples: 445622784. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:35,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:43:36,373][1653645] Updated weights for policy 0, policy_version 870240 (0.0012) [2024-06-15 22:43:38,057][1653645] Updated weights for policy 0, policy_version 870320 (0.0014) [2024-06-15 22:43:40,348][1653645] Updated weights for policy 0, policy_version 870369 (0.0015) [2024-06-15 22:43:40,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 45329.7, 300 sec: 44320.1). Total num frames: 1782546432. Throughput: 0: 11048.1. Samples: 445689344. 
Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:43:41,494][1653645] Updated weights for policy 0, policy_version 870416 (0.0013) [2024-06-15 22:43:42,720][1653645] Updated weights for policy 0, policy_version 870463 (0.0014) [2024-06-15 22:43:45,975][1648982] Fps is (10 sec: 49065.8, 60 sec: 43692.8, 300 sec: 43984.3). Total num frames: 1782710272. Throughput: 0: 11009.4. Samples: 445758464. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:45,976][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:43:48,755][1653645] Updated weights for policy 0, policy_version 870517 (0.0013) [2024-06-15 22:43:49,924][1653645] Updated weights for policy 0, policy_version 870585 (0.0013) [2024-06-15 22:43:50,957][1648982] Fps is (10 sec: 42598.6, 60 sec: 44246.2, 300 sec: 44431.2). Total num frames: 1782972416. Throughput: 0: 11036.5. Samples: 445791744. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:43:52,651][1653645] Updated weights for policy 0, policy_version 870624 (0.0012) [2024-06-15 22:43:54,692][1653645] Updated weights for policy 0, policy_version 870704 (0.0021) [2024-06-15 22:43:55,960][1648982] Fps is (10 sec: 52506.9, 60 sec: 43689.4, 300 sec: 44208.6). Total num frames: 1783234560. Throughput: 0: 10785.5. Samples: 445849088. Policy #0 lag: (min: 31.0, avg: 117.8, max: 287.0) [2024-06-15 22:43:55,961][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:44:00,630][1653645] Updated weights for policy 0, policy_version 870758 (0.0012) [2024-06-15 22:44:00,958][1648982] Fps is (10 sec: 39320.1, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1783365632. Throughput: 0: 10958.8. Samples: 445921280. 
Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:00,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:44:01,484][1651596] Signal inference workers to stop experience collection... (45200 times) [2024-06-15 22:44:01,528][1653645] InferenceWorker_p0-w0: stopping experience collection (45200 times) [2024-06-15 22:44:01,724][1651596] Signal inference workers to resume experience collection... (45200 times) [2024-06-15 22:44:01,725][1653645] InferenceWorker_p0-w0: resuming experience collection (45200 times) [2024-06-15 22:44:02,053][1653645] Updated weights for policy 0, policy_version 870832 (0.0012) [2024-06-15 22:44:04,105][1653645] Updated weights for policy 0, policy_version 870883 (0.0013) [2024-06-15 22:44:05,564][1653645] Updated weights for policy 0, policy_version 870944 (0.0016) [2024-06-15 22:44:05,958][1648982] Fps is (10 sec: 49165.3, 60 sec: 44782.9, 300 sec: 44764.4). Total num frames: 1783726080. Throughput: 0: 11161.6. Samples: 445956096. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:44:10,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1783758848. Throughput: 0: 10877.8. Samples: 446019072. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:44:12,719][1653645] Updated weights for policy 0, policy_version 871029 (0.0015) [2024-06-15 22:44:14,043][1653645] Updated weights for policy 0, policy_version 871075 (0.0040) [2024-06-15 22:44:15,927][1653645] Updated weights for policy 0, policy_version 871136 (0.0012) [2024-06-15 22:44:15,957][1648982] Fps is (10 sec: 36045.0, 60 sec: 44243.2, 300 sec: 44542.3). Total num frames: 1784086528. Throughput: 0: 11104.7. Samples: 446087168. 
Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:44:17,649][1653645] Updated weights for policy 0, policy_version 871202 (0.0014) [2024-06-15 22:44:20,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1784283136. Throughput: 0: 10877.2. Samples: 446112256. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:20,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:44:24,283][1653645] Updated weights for policy 0, policy_version 871249 (0.0016) [2024-06-15 22:44:25,958][1648982] Fps is (10 sec: 36043.8, 60 sec: 43146.6, 300 sec: 44097.9). Total num frames: 1784446976. Throughput: 0: 11263.9. Samples: 446196224. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:25,959][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 22:44:27,222][1653645] Updated weights for policy 0, policy_version 871360 (0.0175) [2024-06-15 22:44:28,609][1653645] Updated weights for policy 0, policy_version 871424 (0.0017) [2024-06-15 22:44:29,696][1653645] Updated weights for policy 0, policy_version 871477 (0.0014) [2024-06-15 22:44:30,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 44236.6, 300 sec: 44320.1). Total num frames: 1784807424. Throughput: 0: 10847.2. Samples: 446246400. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:30,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:44:35,958][1648982] Fps is (10 sec: 36045.0, 60 sec: 43144.4, 300 sec: 43653.6). Total num frames: 1784807424. Throughput: 0: 10922.6. Samples: 446283264. 
Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:44:36,381][1653645] Updated weights for policy 0, policy_version 871506 (0.0012) [2024-06-15 22:44:38,092][1653645] Updated weights for policy 0, policy_version 871568 (0.0011) [2024-06-15 22:44:40,155][1653645] Updated weights for policy 0, policy_version 871648 (0.0020) [2024-06-15 22:44:40,958][1648982] Fps is (10 sec: 36045.1, 60 sec: 43690.6, 300 sec: 44098.0). Total num frames: 1785167872. Throughput: 0: 11162.2. Samples: 446351360. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:44:41,559][1651596] Signal inference workers to stop experience collection... (45250 times) [2024-06-15 22:44:41,608][1653645] InferenceWorker_p0-w0: stopping experience collection (45250 times) [2024-06-15 22:44:41,913][1651596] Signal inference workers to resume experience collection... (45250 times) [2024-06-15 22:44:41,915][1653645] InferenceWorker_p0-w0: resuming experience collection (45250 times) [2024-06-15 22:44:42,430][1653645] Updated weights for policy 0, policy_version 871731 (0.0026) [2024-06-15 22:44:45,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 43703.5, 300 sec: 43986.9). Total num frames: 1785331712. Throughput: 0: 10797.6. Samples: 446407168. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:44:48,221][1653645] Updated weights for policy 0, policy_version 871748 (0.0010) [2024-06-15 22:44:49,805][1653645] Updated weights for policy 0, policy_version 871812 (0.0080) [2024-06-15 22:44:50,958][1648982] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 43986.9). Total num frames: 1785561088. Throughput: 0: 11082.0. Samples: 446454784. 
Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:44:51,380][1653645] Updated weights for policy 0, policy_version 871888 (0.0069) [2024-06-15 22:44:52,320][1653645] Updated weights for policy 0, policy_version 871929 (0.0013) [2024-06-15 22:44:53,850][1653645] Updated weights for policy 0, policy_version 871997 (0.0021) [2024-06-15 22:44:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43692.6, 300 sec: 43986.9). Total num frames: 1785856000. Throughput: 0: 10900.0. Samples: 446509568. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:44:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:44:55,962][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000872000_1785856000.pth... [2024-06-15 22:44:55,996][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000866848_1775304704.pth [2024-06-15 22:45:00,958][1648982] Fps is (10 sec: 39320.7, 60 sec: 43144.6, 300 sec: 43875.8). Total num frames: 1785954304. Throughput: 0: 11059.1. Samples: 446584832. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:45:01,237][1653645] Updated weights for policy 0, policy_version 872061 (0.0027) [2024-06-15 22:45:03,039][1653645] Updated weights for policy 0, policy_version 872129 (0.0011) [2024-06-15 22:45:05,097][1653645] Updated weights for policy 0, policy_version 872224 (0.0119) [2024-06-15 22:45:05,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 44236.8, 300 sec: 44431.2). Total num frames: 1786380288. Throughput: 0: 11025.1. Samples: 446608384. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:45:10,959][1648982] Fps is (10 sec: 42595.4, 60 sec: 43690.2, 300 sec: 43764.6). Total num frames: 1786380288. Throughput: 0: 10717.7. 
Samples: 446678528. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:10,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:45:12,317][1653645] Updated weights for policy 0, policy_version 872272 (0.0013) [2024-06-15 22:45:14,229][1653645] Updated weights for policy 0, policy_version 872340 (0.0025) [2024-06-15 22:45:15,869][1653645] Updated weights for policy 0, policy_version 872417 (0.0099) [2024-06-15 22:45:15,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 43690.6, 300 sec: 43986.9). Total num frames: 1786707968. Throughput: 0: 11025.1. Samples: 446742528. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:15,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:45:16,824][1653645] Updated weights for policy 0, policy_version 872468 (0.0013) [2024-06-15 22:45:20,958][1648982] Fps is (10 sec: 52433.9, 60 sec: 43690.7, 300 sec: 43986.9). Total num frames: 1786904576. Throughput: 0: 11047.9. Samples: 446780416. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:45:24,170][1653645] Updated weights for policy 0, policy_version 872532 (0.0015) [2024-06-15 22:45:25,623][1653645] Updated weights for policy 0, policy_version 872592 (0.0015) [2024-06-15 22:45:25,791][1651596] Signal inference workers to stop experience collection... (45300 times) [2024-06-15 22:45:25,829][1653645] InferenceWorker_p0-w0: stopping experience collection (45300 times) [2024-06-15 22:45:25,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43690.9, 300 sec: 44098.0). Total num frames: 1787068416. Throughput: 0: 11138.9. Samples: 446852608. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:45:26,071][1651596] Signal inference workers to resume experience collection... 
(45300 times) [2024-06-15 22:45:26,072][1653645] InferenceWorker_p0-w0: resuming experience collection (45300 times) [2024-06-15 22:45:27,496][1653645] Updated weights for policy 0, policy_version 872675 (0.0014) [2024-06-15 22:45:29,045][1653645] Updated weights for policy 0, policy_version 872737 (0.0129) [2024-06-15 22:45:30,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 44209.1). Total num frames: 1787428864. Throughput: 0: 11082.0. Samples: 446905856. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:30,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:45:35,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 43690.8, 300 sec: 43542.6). Total num frames: 1787428864. Throughput: 0: 10865.8. Samples: 446943744. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:45:37,031][1653645] Updated weights for policy 0, policy_version 872805 (0.0027) [2024-06-15 22:45:38,685][1653645] Updated weights for policy 0, policy_version 872885 (0.0013) [2024-06-15 22:45:40,180][1653645] Updated weights for policy 0, policy_version 872944 (0.0014) [2024-06-15 22:45:40,958][1648982] Fps is (10 sec: 39321.6, 60 sec: 44236.9, 300 sec: 43986.9). Total num frames: 1787822080. Throughput: 0: 11013.7. Samples: 447005184. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:45:41,512][1653645] Updated weights for policy 0, policy_version 872994 (0.0013) [2024-06-15 22:45:45,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 43764.7). Total num frames: 1787953152. Throughput: 0: 10820.3. Samples: 447071744. 
Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:45,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:45:48,807][1653645] Updated weights for policy 0, policy_version 873032 (0.0040) [2024-06-15 22:45:50,958][1648982] Fps is (10 sec: 29491.2, 60 sec: 42598.4, 300 sec: 43653.7). Total num frames: 1788116992. Throughput: 0: 11184.4. Samples: 447111680. Policy #0 lag: (min: 1.0, avg: 72.8, max: 257.0) [2024-06-15 22:45:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:45:51,162][1653645] Updated weights for policy 0, policy_version 873123 (0.0233) [2024-06-15 22:45:52,608][1653645] Updated weights for policy 0, policy_version 873186 (0.0011) [2024-06-15 22:45:54,369][1653645] Updated weights for policy 0, policy_version 873249 (0.0012) [2024-06-15 22:45:55,959][1648982] Fps is (10 sec: 52423.8, 60 sec: 43690.0, 300 sec: 43986.7). Total num frames: 1788477440. Throughput: 0: 10638.2. Samples: 447157248. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:45:55,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:46:00,957][1648982] Fps is (10 sec: 36044.9, 60 sec: 42052.5, 300 sec: 43542.6). Total num frames: 1788477440. Throughput: 0: 10820.3. Samples: 447229440. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:46:01,675][1653645] Updated weights for policy 0, policy_version 873284 (0.0037) [2024-06-15 22:46:03,784][1653645] Updated weights for policy 0, policy_version 873376 (0.0013) [2024-06-15 22:46:05,694][1651596] Signal inference workers to stop experience collection... (45350 times) [2024-06-15 22:46:05,742][1653645] Updated weights for policy 0, policy_version 873445 (0.0012) [2024-06-15 22:46:05,807][1653645] InferenceWorker_p0-w0: stopping experience collection (45350 times) [2024-06-15 22:46:05,917][1651596] Signal inference workers to resume experience collection... 
(45350 times) [2024-06-15 22:46:05,917][1653645] InferenceWorker_p0-w0: resuming experience collection (45350 times) [2024-06-15 22:46:05,958][1648982] Fps is (10 sec: 36048.2, 60 sec: 40960.0, 300 sec: 43875.8). Total num frames: 1788837888. Throughput: 0: 10683.7. Samples: 447261184. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:46:07,099][1653645] Updated weights for policy 0, policy_version 873494 (0.0013) [2024-06-15 22:46:07,973][1653645] Updated weights for policy 0, policy_version 873536 (0.0016) [2024-06-15 22:46:10,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 43691.4, 300 sec: 43542.6). Total num frames: 1789001728. Throughput: 0: 10387.9. Samples: 447320064. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:46:15,601][1653645] Updated weights for policy 0, policy_version 873616 (0.0111) [2024-06-15 22:46:15,958][1648982] Fps is (10 sec: 36044.7, 60 sec: 41506.1, 300 sec: 43653.6). Total num frames: 1789198336. Throughput: 0: 10797.5. Samples: 447391744. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:46:17,744][1653645] Updated weights for policy 0, policy_version 873712 (0.0038) [2024-06-15 22:46:19,514][1653645] Updated weights for policy 0, policy_version 873785 (0.0126) [2024-06-15 22:46:20,958][1648982] Fps is (10 sec: 52426.5, 60 sec: 43690.4, 300 sec: 43986.8). Total num frames: 1789526016. Throughput: 0: 10399.2. Samples: 447411712. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:20,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:46:25,958][1648982] Fps is (10 sec: 36044.6, 60 sec: 41506.1, 300 sec: 43653.6). Total num frames: 1789558784. Throughput: 0: 10774.7. Samples: 447490048. 
Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:46:26,903][1653645] Updated weights for policy 0, policy_version 873856 (0.0012) [2024-06-15 22:46:28,867][1653645] Updated weights for policy 0, policy_version 873936 (0.0013) [2024-06-15 22:46:30,548][1653645] Updated weights for policy 0, policy_version 874016 (0.0013) [2024-06-15 22:46:30,958][1648982] Fps is (10 sec: 49153.3, 60 sec: 43144.5, 300 sec: 44097.9). Total num frames: 1790017536. Throughput: 0: 10558.5. Samples: 447546880. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:46:35,957][1648982] Fps is (10 sec: 49152.5, 60 sec: 43690.7, 300 sec: 43542.6). Total num frames: 1790050304. Throughput: 0: 10399.3. Samples: 447579648. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:35,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:46:37,752][1653645] Updated weights for policy 0, policy_version 874085 (0.0014) [2024-06-15 22:46:38,667][1653645] Updated weights for policy 0, policy_version 874113 (0.0134) [2024-06-15 22:46:39,890][1653645] Updated weights for policy 0, policy_version 874164 (0.0052) [2024-06-15 22:46:40,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42598.3, 300 sec: 43764.7). Total num frames: 1790377984. Throughput: 0: 10991.1. Samples: 447651840. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:40,959][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 22:46:41,144][1653645] Updated weights for policy 0, policy_version 874224 (0.0033) [2024-06-15 22:46:42,944][1653645] Updated weights for policy 0, policy_version 874304 (0.0209) [2024-06-15 22:46:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1790574592. Throughput: 0: 10786.1. Samples: 447714816. 
Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:46:48,868][1651596] Signal inference workers to stop experience collection... (45400 times) [2024-06-15 22:46:48,910][1653645] InferenceWorker_p0-w0: stopping experience collection (45400 times) [2024-06-15 22:46:49,162][1651596] Signal inference workers to resume experience collection... (45400 times) [2024-06-15 22:46:49,174][1653645] InferenceWorker_p0-w0: resuming experience collection (45400 times) [2024-06-15 22:46:50,343][1653645] Updated weights for policy 0, policy_version 874372 (0.0073) [2024-06-15 22:46:50,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 44236.8, 300 sec: 43875.8). Total num frames: 1790771200. Throughput: 0: 11059.2. Samples: 447758848. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:50,961][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:46:51,836][1653645] Updated weights for policy 0, policy_version 874448 (0.0015) [2024-06-15 22:46:53,966][1653645] Updated weights for policy 0, policy_version 874519 (0.0014) [2024-06-15 22:46:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43691.3, 300 sec: 43986.9). Total num frames: 1791098880. Throughput: 0: 10922.6. Samples: 447811584. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:46:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:46:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000874560_1791098880.pth... [2024-06-15 22:46:56,032][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000869440_1780613120.pth [2024-06-15 22:47:00,958][1648982] Fps is (10 sec: 32768.2, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1791098880. Throughput: 0: 11059.2. Samples: 447889408. 
Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:00,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:47:01,437][1653645] Updated weights for policy 0, policy_version 874564 (0.0012) [2024-06-15 22:47:03,653][1653645] Updated weights for policy 0, policy_version 874656 (0.0014) [2024-06-15 22:47:05,433][1653645] Updated weights for policy 0, policy_version 874720 (0.0013) [2024-06-15 22:47:05,958][1648982] Fps is (10 sec: 36044.1, 60 sec: 43690.5, 300 sec: 43875.8). Total num frames: 1791459328. Throughput: 0: 11309.5. Samples: 447920640. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:05,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:47:10,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43542.6). Total num frames: 1791623168. Throughput: 0: 10888.5. Samples: 447980032. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:47:13,110][1653645] Updated weights for policy 0, policy_version 874832 (0.0012) [2024-06-15 22:47:14,901][1653645] Updated weights for policy 0, policy_version 874896 (0.0019) [2024-06-15 22:47:15,958][1648982] Fps is (10 sec: 39322.4, 60 sec: 44236.7, 300 sec: 43431.5). Total num frames: 1791852544. Throughput: 0: 11082.0. Samples: 448045568. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:47:17,126][1653645] Updated weights for policy 0, policy_version 874976 (0.0246) [2024-06-15 22:47:19,223][1653645] Updated weights for policy 0, policy_version 875071 (0.0014) [2024-06-15 22:47:20,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 43691.0, 300 sec: 43986.9). Total num frames: 1792147456. Throughput: 0: 10786.1. Samples: 448065024. 
Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:20,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:47:25,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 44236.8, 300 sec: 43320.4). Total num frames: 1792212992. Throughput: 0: 10990.9. Samples: 448146432. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:25,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:47:27,128][1653645] Updated weights for policy 0, policy_version 875137 (0.0014) [2024-06-15 22:47:27,804][1651596] Signal inference workers to stop experience collection... (45450 times) [2024-06-15 22:47:27,855][1653645] InferenceWorker_p0-w0: stopping experience collection (45450 times) [2024-06-15 22:47:28,044][1651596] Signal inference workers to resume experience collection... (45450 times) [2024-06-15 22:47:28,045][1653645] InferenceWorker_p0-w0: resuming experience collection (45450 times) [2024-06-15 22:47:29,206][1653645] Updated weights for policy 0, policy_version 875232 (0.0015) [2024-06-15 22:47:30,958][1648982] Fps is (10 sec: 45873.6, 60 sec: 43144.4, 300 sec: 44102.0). Total num frames: 1792606208. Throughput: 0: 10660.9. Samples: 448194560. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:30,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:47:31,145][1653645] Updated weights for policy 0, policy_version 875317 (0.0014) [2024-06-15 22:47:35,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 43690.6, 300 sec: 43542.7). Total num frames: 1792671744. Throughput: 0: 10558.6. Samples: 448233984. 
Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:47:37,983][1653645] Updated weights for policy 0, policy_version 875360 (0.0012) [2024-06-15 22:47:38,991][1653645] Updated weights for policy 0, policy_version 875400 (0.0013) [2024-06-15 22:47:40,440][1653645] Updated weights for policy 0, policy_version 875472 (0.0013) [2024-06-15 22:47:40,958][1648982] Fps is (10 sec: 39319.8, 60 sec: 43690.2, 300 sec: 43767.7). Total num frames: 1792999424. Throughput: 0: 10968.0. Samples: 448305152. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:40,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:47:42,026][1653645] Updated weights for policy 0, policy_version 875524 (0.0011) [2024-06-15 22:47:43,130][1653645] Updated weights for policy 0, policy_version 875573 (0.0011) [2024-06-15 22:47:45,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 43690.5, 300 sec: 43655.5). Total num frames: 1793196032. Throughput: 0: 10649.5. Samples: 448368640. Policy #0 lag: (min: 118.0, avg: 165.5, max: 339.0) [2024-06-15 22:47:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:47:49,540][1653645] Updated weights for policy 0, policy_version 875604 (0.0016) [2024-06-15 22:47:50,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 43144.0, 300 sec: 43209.4). Total num frames: 1793359872. Throughput: 0: 10979.4. Samples: 448414720. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:47:50,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 22:47:51,594][1653645] Updated weights for policy 0, policy_version 875697 (0.0014) [2024-06-15 22:47:53,787][1653645] Updated weights for policy 0, policy_version 875792 (0.0012) [2024-06-15 22:47:55,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.5, 300 sec: 43986.9). Total num frames: 1793720320. Throughput: 0: 10797.5. Samples: 448465920. 
Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:47:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:48:00,307][1653645] Updated weights for policy 0, policy_version 875856 (0.0082) [2024-06-15 22:48:00,958][1648982] Fps is (10 sec: 45878.4, 60 sec: 45329.1, 300 sec: 43320.4). Total num frames: 1793818624. Throughput: 0: 11229.9. Samples: 448550912. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:48:01,940][1653645] Updated weights for policy 0, policy_version 875920 (0.0018) [2024-06-15 22:48:03,428][1653645] Updated weights for policy 0, policy_version 875970 (0.0013) [2024-06-15 22:48:04,744][1653645] Updated weights for policy 0, policy_version 876023 (0.0013) [2024-06-15 22:48:04,996][1651596] Signal inference workers to stop experience collection... (45500 times) [2024-06-15 22:48:05,078][1653645] InferenceWorker_p0-w0: stopping experience collection (45500 times) [2024-06-15 22:48:05,239][1651596] Signal inference workers to resume experience collection... (45500 times) [2024-06-15 22:48:05,240][1653645] InferenceWorker_p0-w0: resuming experience collection (45500 times) [2024-06-15 22:48:05,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 45329.1, 300 sec: 44209.0). Total num frames: 1794179072. Throughput: 0: 11366.3. Samples: 448576512. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:48:10,982][1648982] Fps is (10 sec: 42493.7, 60 sec: 43672.7, 300 sec: 43429.1). Total num frames: 1794244608. Throughput: 0: 11087.3. Samples: 448645632. 
Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:10,983][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:48:11,859][1653645] Updated weights for policy 0, policy_version 876101 (0.0279) [2024-06-15 22:48:12,840][1653645] Updated weights for policy 0, policy_version 876156 (0.0013) [2024-06-15 22:48:15,711][1653645] Updated weights for policy 0, policy_version 876241 (0.0014) [2024-06-15 22:48:15,959][1648982] Fps is (10 sec: 36041.8, 60 sec: 44782.2, 300 sec: 43653.5). Total num frames: 1794539520. Throughput: 0: 11468.6. Samples: 448710656. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:15,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:48:16,940][1653645] Updated weights for policy 0, policy_version 876290 (0.0014) [2024-06-15 22:48:18,178][1653645] Updated weights for policy 0, policy_version 876351 (0.0013) [2024-06-15 22:48:20,958][1648982] Fps is (10 sec: 52558.1, 60 sec: 43690.6, 300 sec: 43765.2). Total num frames: 1794768896. Throughput: 0: 11207.1. Samples: 448738304. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:48:24,049][1653645] Updated weights for policy 0, policy_version 876414 (0.0015) [2024-06-15 22:48:25,962][1648982] Fps is (10 sec: 42583.5, 60 sec: 45871.8, 300 sec: 43430.8). Total num frames: 1794965504. Throughput: 0: 11399.6. Samples: 448818176. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:25,963][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:48:26,852][1653645] Updated weights for policy 0, policy_version 876483 (0.0015) [2024-06-15 22:48:28,311][1653645] Updated weights for policy 0, policy_version 876546 (0.0033) [2024-06-15 22:48:29,455][1653645] Updated weights for policy 0, policy_version 876595 (0.0012) [2024-06-15 22:48:30,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 44783.1, 300 sec: 44320.1). Total num frames: 1795293184. 
Throughput: 0: 11366.5. Samples: 448880128. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:30,960][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:48:35,626][1653645] Updated weights for policy 0, policy_version 876656 (0.0011) [2024-06-15 22:48:35,957][1648982] Fps is (10 sec: 42618.2, 60 sec: 45329.1, 300 sec: 43542.6). Total num frames: 1795391488. Throughput: 0: 11218.7. Samples: 448919552. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:35,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:48:37,693][1653645] Updated weights for policy 0, policy_version 876689 (0.0046) [2024-06-15 22:48:39,050][1653645] Updated weights for policy 0, policy_version 876752 (0.0014) [2024-06-15 22:48:40,958][1648982] Fps is (10 sec: 39320.8, 60 sec: 44783.3, 300 sec: 43989.5). Total num frames: 1795686400. Throughput: 0: 11468.8. Samples: 448982016. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:40,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:48:41,463][1653645] Updated weights for policy 0, policy_version 876832 (0.0011) [2024-06-15 22:48:45,962][1648982] Fps is (10 sec: 42578.1, 60 sec: 43687.4, 300 sec: 43541.9). Total num frames: 1795817472. Throughput: 0: 10944.3. Samples: 449043456. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:45,963][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:48:47,171][1653645] Updated weights for policy 0, policy_version 876866 (0.0013) [2024-06-15 22:48:48,279][1653645] Updated weights for policy 0, policy_version 876919 (0.0013) [2024-06-15 22:48:49,612][1653645] Updated weights for policy 0, policy_version 876960 (0.0013) [2024-06-15 22:48:49,749][1651596] Signal inference workers to stop experience collection... 
(45550 times) [2024-06-15 22:48:49,780][1653645] InferenceWorker_p0-w0: stopping experience collection (45550 times) [2024-06-15 22:48:50,053][1651596] Signal inference workers to resume experience collection... (45550 times) [2024-06-15 22:48:50,054][1653645] InferenceWorker_p0-w0: resuming experience collection (45550 times) [2024-06-15 22:48:50,958][1648982] Fps is (10 sec: 39322.3, 60 sec: 45329.6, 300 sec: 43543.0). Total num frames: 1796079616. Throughput: 0: 11173.0. Samples: 449079296. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:50,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:48:51,848][1653645] Updated weights for policy 0, policy_version 877040 (0.0014) [2024-06-15 22:48:53,758][1653645] Updated weights for policy 0, policy_version 877108 (0.0015) [2024-06-15 22:48:55,962][1648982] Fps is (10 sec: 52428.7, 60 sec: 43687.4, 300 sec: 43986.2). Total num frames: 1796341760. Throughput: 0: 10756.8. Samples: 449129472. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:48:55,963][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:48:55,985][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000877120_1796341760.pth... [2024-06-15 22:48:56,085][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000872000_1785856000.pth [2024-06-15 22:49:00,588][1653645] Updated weights for policy 0, policy_version 877140 (0.0013) [2024-06-15 22:49:00,958][1648982] Fps is (10 sec: 32768.0, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1796407296. Throughput: 0: 11048.1. Samples: 449207808. 
Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:49:02,147][1653645] Updated weights for policy 0, policy_version 877206 (0.0013) [2024-06-15 22:49:04,317][1653645] Updated weights for policy 0, policy_version 877283 (0.0013) [2024-06-15 22:49:05,246][1653645] Updated weights for policy 0, policy_version 877338 (0.0011) [2024-06-15 22:49:05,885][1653645] Updated weights for policy 0, policy_version 877376 (0.0017) [2024-06-15 22:49:05,958][1648982] Fps is (10 sec: 52453.3, 60 sec: 44783.1, 300 sec: 44431.2). Total num frames: 1796866048. Throughput: 0: 10945.4. Samples: 449230848. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:05,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:49:10,958][1648982] Fps is (10 sec: 45875.0, 60 sec: 43708.6, 300 sec: 43320.4). Total num frames: 1796866048. Throughput: 0: 10741.7. Samples: 449301504. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:49:13,687][1653645] Updated weights for policy 0, policy_version 877472 (0.0142) [2024-06-15 22:49:15,958][1648982] Fps is (10 sec: 29491.4, 60 sec: 43691.5, 300 sec: 43653.6). Total num frames: 1797160960. Throughput: 0: 10695.1. Samples: 449361408. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:49:16,647][1653645] Updated weights for policy 0, policy_version 877554 (0.0014) [2024-06-15 22:49:20,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 43690.7, 300 sec: 43875.8). Total num frames: 1797390336. Throughput: 0: 10285.5. Samples: 449382400. 
Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:20,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:49:24,488][1653645] Updated weights for policy 0, policy_version 877633 (0.0015) [2024-06-15 22:49:25,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 43147.8, 300 sec: 43209.4). Total num frames: 1797554176. Throughput: 0: 10717.9. Samples: 449464320. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:25,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:49:26,223][1653645] Updated weights for policy 0, policy_version 877728 (0.0117) [2024-06-15 22:49:28,468][1653645] Updated weights for policy 0, policy_version 877794 (0.0034) [2024-06-15 22:49:29,686][1651596] Signal inference workers to stop experience collection... (45600 times) [2024-06-15 22:49:29,726][1653645] InferenceWorker_p0-w0: stopping experience collection (45600 times) [2024-06-15 22:49:29,866][1651596] Signal inference workers to resume experience collection... (45600 times) [2024-06-15 22:49:29,867][1653645] InferenceWorker_p0-w0: resuming experience collection (45600 times) [2024-06-15 22:49:30,261][1653645] Updated weights for policy 0, policy_version 877872 (0.0014) [2024-06-15 22:49:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43690.6, 300 sec: 44431.2). Total num frames: 1797914624. Throughput: 0: 10502.8. Samples: 449516032. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:30,961][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 22:49:35,958][1648982] Fps is (10 sec: 36044.8, 60 sec: 42052.2, 300 sec: 43209.3). Total num frames: 1797914624. Throughput: 0: 10535.8. Samples: 449553408. 
Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:35,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 22:49:37,520][1653645] Updated weights for policy 0, policy_version 877936 (0.0033) [2024-06-15 22:49:39,411][1653645] Updated weights for policy 0, policy_version 878016 (0.0131) [2024-06-15 22:49:40,958][1648982] Fps is (10 sec: 32767.9, 60 sec: 42598.5, 300 sec: 43764.7). Total num frames: 1798242304. Throughput: 0: 10923.8. Samples: 449620992. Policy #0 lag: (min: 15.0, avg: 69.5, max: 271.0) [2024-06-15 22:49:40,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:49:41,449][1653645] Updated weights for policy 0, policy_version 878080 (0.0012) [2024-06-15 22:49:42,888][1653645] Updated weights for policy 0, policy_version 878140 (0.0020) [2024-06-15 22:49:45,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 43694.1, 300 sec: 43653.6). Total num frames: 1798438912. Throughput: 0: 10399.3. Samples: 449675776. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:49:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:49:50,883][1653645] Updated weights for policy 0, policy_version 878211 (0.0012) [2024-06-15 22:49:50,957][1648982] Fps is (10 sec: 32768.5, 60 sec: 41506.2, 300 sec: 43098.3). Total num frames: 1798569984. Throughput: 0: 10888.6. Samples: 449720832. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:49:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:49:53,991][1653645] Updated weights for policy 0, policy_version 878326 (0.0109) [2024-06-15 22:49:55,566][1653645] Updated weights for policy 0, policy_version 878391 (0.0013) [2024-06-15 22:49:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 43694.0, 300 sec: 44098.0). Total num frames: 1798963200. Throughput: 0: 10262.7. Samples: 449763328. 
Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:49:55,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:50:00,958][1648982] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1798963200. Throughput: 0: 10717.9. Samples: 449843712. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:50:02,872][1653645] Updated weights for policy 0, policy_version 878448 (0.0014) [2024-06-15 22:50:04,747][1653645] Updated weights for policy 0, policy_version 878515 (0.0012) [2024-06-15 22:50:05,958][1648982] Fps is (10 sec: 32767.3, 60 sec: 40413.7, 300 sec: 43764.8). Total num frames: 1799290880. Throughput: 0: 10990.8. Samples: 449876992. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:05,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:50:06,709][1653645] Updated weights for policy 0, policy_version 878596 (0.0141) [2024-06-15 22:50:07,848][1653645] Updated weights for policy 0, policy_version 878652 (0.0024) [2024-06-15 22:50:10,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 43320.4). Total num frames: 1799487488. Throughput: 0: 10399.3. Samples: 449932288. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:50:15,215][1651596] Signal inference workers to stop experience collection... (45650 times) [2024-06-15 22:50:15,295][1653645] InferenceWorker_p0-w0: stopping experience collection (45650 times) [2024-06-15 22:50:15,532][1651596] Signal inference workers to resume experience collection... (45650 times) [2024-06-15 22:50:15,533][1653645] InferenceWorker_p0-w0: resuming experience collection (45650 times) [2024-06-15 22:50:15,537][1653645] Updated weights for policy 0, policy_version 878720 (0.0016) [2024-06-15 22:50:15,958][1648982] Fps is (10 sec: 32769.2, 60 sec: 40960.0, 300 sec: 43098.3). 
Total num frames: 1799618560. Throughput: 0: 10911.3. Samples: 450007040. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:15,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:50:17,708][1653645] Updated weights for policy 0, policy_version 878800 (0.0121) [2024-06-15 22:50:19,040][1653645] Updated weights for policy 0, policy_version 878864 (0.0013) [2024-06-15 22:50:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 43875.8). Total num frames: 1800011776. Throughput: 0: 10513.1. Samples: 450026496. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:20,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 22:50:25,773][1653645] Updated weights for policy 0, policy_version 878916 (0.0013) [2024-06-15 22:50:25,957][1648982] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 42765.0). Total num frames: 1800044544. Throughput: 0: 10717.9. Samples: 450103296. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:25,959][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 22:50:27,891][1653645] Updated weights for policy 0, policy_version 879008 (0.0013) [2024-06-15 22:50:29,720][1653645] Updated weights for policy 0, policy_version 879088 (0.0013) [2024-06-15 22:50:30,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 44209.0). Total num frames: 1800470528. Throughput: 0: 10808.9. Samples: 450162176. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:50:31,467][1653645] Updated weights for policy 0, policy_version 879168 (0.0012) [2024-06-15 22:50:35,957][1648982] Fps is (10 sec: 49152.1, 60 sec: 43690.7, 300 sec: 43098.3). Total num frames: 1800536064. Throughput: 0: 10535.8. Samples: 450194944. 
Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:50:39,274][1653645] Updated weights for policy 0, policy_version 879264 (0.0114) [2024-06-15 22:50:40,496][1653645] Updated weights for policy 0, policy_version 879328 (0.0013) [2024-06-15 22:50:40,975][1648982] Fps is (10 sec: 42524.0, 60 sec: 44224.0, 300 sec: 43873.2). Total num frames: 1800896512. Throughput: 0: 11282.4. Samples: 450271232. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:40,976][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:50:41,599][1653645] Updated weights for policy 0, policy_version 879376 (0.0021) [2024-06-15 22:50:45,957][1648982] Fps is (10 sec: 52428.8, 60 sec: 43690.8, 300 sec: 43875.8). Total num frames: 1801060352. Throughput: 0: 10899.9. Samples: 450334208. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:50:48,481][1653645] Updated weights for policy 0, policy_version 879442 (0.0014) [2024-06-15 22:50:50,756][1653645] Updated weights for policy 0, policy_version 879547 (0.0014) [2024-06-15 22:50:50,958][1648982] Fps is (10 sec: 42673.0, 60 sec: 45875.1, 300 sec: 43542.7). Total num frames: 1801322496. Throughput: 0: 11241.3. Samples: 450382848. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:50,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:50:51,666][1651596] Signal inference workers to stop experience collection... (45700 times) [2024-06-15 22:50:51,696][1653645] InferenceWorker_p0-w0: stopping experience collection (45700 times) [2024-06-15 22:50:51,909][1651596] Signal inference workers to resume experience collection... 
(45700 times) [2024-06-15 22:50:51,910][1653645] InferenceWorker_p0-w0: resuming experience collection (45700 times) [2024-06-15 22:50:52,542][1653645] Updated weights for policy 0, policy_version 879613 (0.0013) [2024-06-15 22:50:55,958][1648982] Fps is (10 sec: 52427.9, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1801584640. Throughput: 0: 11184.3. Samples: 450435584. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:50:55,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:50:55,963][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000879680_1801584640.pth... [2024-06-15 22:50:56,020][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000874560_1791098880.pth [2024-06-15 22:50:56,040][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000879680_1801584640.pth [2024-06-15 22:51:00,239][1653645] Updated weights for policy 0, policy_version 879696 (0.0014) [2024-06-15 22:51:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 45329.1, 300 sec: 43542.6). Total num frames: 1801682944. Throughput: 0: 11366.4. Samples: 450518528. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:51:00,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:51:01,615][1653645] Updated weights for policy 0, policy_version 879760 (0.0013) [2024-06-15 22:51:03,088][1653645] Updated weights for policy 0, policy_version 879824 (0.0014) [2024-06-15 22:51:04,652][1653645] Updated weights for policy 0, policy_version 879893 (0.0098) [2024-06-15 22:51:05,641][1653645] Updated weights for policy 0, policy_version 879935 (0.0015) [2024-06-15 22:51:05,958][1648982] Fps is (10 sec: 52429.4, 60 sec: 46967.7, 300 sec: 44431.2). Total num frames: 1802108928. Throughput: 0: 11571.2. Samples: 450547200. 
Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:51:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:51:10,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 43690.6, 300 sec: 43764.7). Total num frames: 1802108928. Throughput: 0: 11366.3. Samples: 450614784. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:51:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:51:12,482][1653645] Updated weights for policy 0, policy_version 879994 (0.0012) [2024-06-15 22:51:13,820][1653645] Updated weights for policy 0, policy_version 880049 (0.0027) [2024-06-15 22:51:15,443][1653645] Updated weights for policy 0, policy_version 880102 (0.0013) [2024-06-15 22:51:15,957][1648982] Fps is (10 sec: 36045.0, 60 sec: 47513.6, 300 sec: 43875.9). Total num frames: 1802469376. Throughput: 0: 11446.1. Samples: 450677248. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:51:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:51:17,476][1653645] Updated weights for policy 0, policy_version 880187 (0.0015) [2024-06-15 22:51:20,957][1648982] Fps is (10 sec: 52430.2, 60 sec: 43690.8, 300 sec: 44320.1). Total num frames: 1802633216. Throughput: 0: 11264.0. Samples: 450701824. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:51:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:51:24,530][1653645] Updated weights for policy 0, policy_version 880244 (0.0013) [2024-06-15 22:51:25,958][1648982] Fps is (10 sec: 39320.9, 60 sec: 46967.3, 300 sec: 43542.6). Total num frames: 1802862592. Throughput: 0: 11325.3. Samples: 450780672. 
Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:51:25,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:51:26,160][1653645] Updated weights for policy 0, policy_version 880314 (0.0099) [2024-06-15 22:51:28,585][1653645] Updated weights for policy 0, policy_version 880384 (0.0122) [2024-06-15 22:51:30,038][1653645] Updated weights for policy 0, policy_version 880444 (0.0012) [2024-06-15 22:51:30,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1803157504. Throughput: 0: 11047.8. Samples: 450831360. Policy #0 lag: (min: 47.0, avg: 200.0, max: 332.0) [2024-06-15 22:51:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:51:35,979][1648982] Fps is (10 sec: 32699.5, 60 sec: 44221.2, 300 sec: 43428.4). Total num frames: 1803190272. Throughput: 0: 10906.2. Samples: 450873856. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:51:35,979][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:51:36,797][1651596] Signal inference workers to stop experience collection... (45750 times) [2024-06-15 22:51:36,851][1653645] InferenceWorker_p0-w0: stopping experience collection (45750 times) [2024-06-15 22:51:37,086][1651596] Signal inference workers to resume experience collection... (45750 times) [2024-06-15 22:51:37,102][1653645] InferenceWorker_p0-w0: resuming experience collection (45750 times) [2024-06-15 22:51:37,536][1653645] Updated weights for policy 0, policy_version 880528 (0.0014) [2024-06-15 22:51:38,740][1653645] Updated weights for policy 0, policy_version 880576 (0.0021) [2024-06-15 22:51:40,958][1648982] Fps is (10 sec: 36044.0, 60 sec: 43703.3, 300 sec: 43875.8). Total num frames: 1803517952. Throughput: 0: 11081.9. Samples: 450934272. 
Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:51:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:51:41,724][1653645] Updated weights for policy 0, policy_version 880656 (0.0036) [2024-06-15 22:51:42,881][1653645] Updated weights for policy 0, policy_version 880700 (0.0014) [2024-06-15 22:51:45,958][1648982] Fps is (10 sec: 49255.3, 60 sec: 43690.5, 300 sec: 43764.7). Total num frames: 1803681792. Throughput: 0: 10649.6. Samples: 450997760. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:51:45,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:51:49,608][1653645] Updated weights for policy 0, policy_version 880770 (0.0025) [2024-06-15 22:51:50,958][1648982] Fps is (10 sec: 39322.2, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 1803911168. Throughput: 0: 10911.3. Samples: 451038208. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:51:50,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 22:51:52,134][1653645] Updated weights for policy 0, policy_version 880833 (0.0014) [2024-06-15 22:51:54,176][1653645] Updated weights for policy 0, policy_version 880929 (0.0012) [2024-06-15 22:51:55,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1804206080. Throughput: 0: 10501.7. Samples: 451087360. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:51:55,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:52:00,261][1653645] Updated weights for policy 0, policy_version 880976 (0.0020) [2024-06-15 22:52:00,958][1648982] Fps is (10 sec: 36044.9, 60 sec: 43144.5, 300 sec: 43431.5). Total num frames: 1804271616. Throughput: 0: 10990.9. Samples: 451171840. 
Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:00,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:52:02,242][1653645] Updated weights for policy 0, policy_version 881056 (0.0014) [2024-06-15 22:52:03,606][1653645] Updated weights for policy 0, policy_version 881104 (0.0014) [2024-06-15 22:52:05,757][1653645] Updated weights for policy 0, policy_version 881200 (0.0014) [2024-06-15 22:52:05,959][1648982] Fps is (10 sec: 49146.6, 60 sec: 43143.7, 300 sec: 44319.9). Total num frames: 1804697600. Throughput: 0: 10956.5. Samples: 451194880. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:52:10,958][1648982] Fps is (10 sec: 45875.2, 60 sec: 43690.8, 300 sec: 43653.6). Total num frames: 1804730368. Throughput: 0: 10763.4. Samples: 451265024. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:52:12,421][1653645] Updated weights for policy 0, policy_version 881248 (0.0012) [2024-06-15 22:52:14,458][1653645] Updated weights for policy 0, policy_version 881339 (0.0015) [2024-06-15 22:52:15,960][1648982] Fps is (10 sec: 39318.6, 60 sec: 43689.2, 300 sec: 43875.5). Total num frames: 1805090816. Throughput: 0: 11058.7. Samples: 451329024. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:15,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:52:16,330][1651596] Signal inference workers to stop experience collection... (45800 times) [2024-06-15 22:52:16,376][1653645] Updated weights for policy 0, policy_version 881409 (0.0013) [2024-06-15 22:52:16,390][1653645] InferenceWorker_p0-w0: stopping experience collection (45800 times) [2024-06-15 22:52:16,639][1651596] Signal inference workers to resume experience collection... 
(45800 times) [2024-06-15 22:52:16,640][1653645] InferenceWorker_p0-w0: resuming experience collection (45800 times) [2024-06-15 22:52:17,853][1653645] Updated weights for policy 0, policy_version 881472 (0.0012) [2024-06-15 22:52:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 43690.6, 300 sec: 44209.0). Total num frames: 1805254656. Throughput: 0: 10677.3. Samples: 451354112. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:52:25,115][1653645] Updated weights for policy 0, policy_version 881537 (0.0015) [2024-06-15 22:52:25,958][1648982] Fps is (10 sec: 36051.4, 60 sec: 43144.6, 300 sec: 43542.6). Total num frames: 1805451264. Throughput: 0: 11116.1. Samples: 451434496. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:52:26,760][1653645] Updated weights for policy 0, policy_version 881605 (0.0013) [2024-06-15 22:52:28,061][1653645] Updated weights for policy 0, policy_version 881661 (0.0012) [2024-06-15 22:52:30,963][1648982] Fps is (10 sec: 52398.5, 60 sec: 43686.4, 300 sec: 44430.3). Total num frames: 1805778944. Throughput: 0: 10932.6. Samples: 451489792. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:30,964][1648982] Avg episode reward: [(0, '37.160')] [2024-06-15 22:52:34,473][1653645] Updated weights for policy 0, policy_version 881731 (0.0012) [2024-06-15 22:52:35,958][1648982] Fps is (10 sec: 45874.6, 60 sec: 45344.8, 300 sec: 43764.8). Total num frames: 1805910016. Throughput: 0: 11127.4. Samples: 451538944. 
Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:35,958][1648982] Avg episode reward: [(0, '37.200')] [2024-06-15 22:52:36,026][1653645] Updated weights for policy 0, policy_version 881794 (0.0012) [2024-06-15 22:52:39,011][1653645] Updated weights for policy 0, policy_version 881888 (0.0015) [2024-06-15 22:52:40,958][1648982] Fps is (10 sec: 45901.9, 60 sec: 45329.2, 300 sec: 44209.1). Total num frames: 1806237696. Throughput: 0: 11389.2. Samples: 451599872. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:52:41,193][1653645] Updated weights for policy 0, policy_version 881968 (0.0013) [2024-06-15 22:52:45,965][1648982] Fps is (10 sec: 39292.2, 60 sec: 43685.1, 300 sec: 43874.8). Total num frames: 1806303232. Throughput: 0: 11273.5. Samples: 451679232. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:45,968][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 22:52:46,400][1653645] Updated weights for policy 0, policy_version 882016 (0.0014) [2024-06-15 22:52:47,864][1653645] Updated weights for policy 0, policy_version 882080 (0.0013) [2024-06-15 22:52:50,228][1653645] Updated weights for policy 0, policy_version 882135 (0.0011) [2024-06-15 22:52:50,958][1648982] Fps is (10 sec: 42598.5, 60 sec: 45875.2, 300 sec: 43875.8). Total num frames: 1806663680. Throughput: 0: 11400.8. Samples: 451707904. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:50,958][1648982] Avg episode reward: [(0, '36.420')] [2024-06-15 22:52:51,838][1653645] Updated weights for policy 0, policy_version 882208 (0.0089) [2024-06-15 22:52:55,958][1648982] Fps is (10 sec: 52467.3, 60 sec: 43690.5, 300 sec: 44097.9). Total num frames: 1806827520. Throughput: 0: 11332.2. Samples: 451774976. 
Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:52:55,959][1648982] Avg episode reward: [(0, '37.020')] [2024-06-15 22:52:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000882240_1806827520.pth... [2024-06-15 22:52:56,000][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000877120_1796341760.pth [2024-06-15 22:52:57,185][1653645] Updated weights for policy 0, policy_version 882241 (0.0020) [2024-06-15 22:52:59,319][1653645] Updated weights for policy 0, policy_version 882336 (0.0013) [2024-06-15 22:52:59,459][1651596] Signal inference workers to stop experience collection... (45850 times) [2024-06-15 22:52:59,491][1653645] InferenceWorker_p0-w0: stopping experience collection (45850 times) [2024-06-15 22:52:59,696][1651596] Signal inference workers to resume experience collection... (45850 times) [2024-06-15 22:52:59,697][1653645] InferenceWorker_p0-w0: resuming experience collection (45850 times) [2024-06-15 22:53:00,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 46967.4, 300 sec: 43764.7). Total num frames: 1807089664. Throughput: 0: 11503.4. Samples: 451846656. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:53:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:53:01,761][1653645] Updated weights for policy 0, policy_version 882391 (0.0014) [2024-06-15 22:53:02,863][1653645] Updated weights for policy 0, policy_version 882448 (0.0012) [2024-06-15 22:53:05,966][1648982] Fps is (10 sec: 52386.7, 60 sec: 44231.5, 300 sec: 44433.6). Total num frames: 1807351808. Throughput: 0: 11614.6. Samples: 451876864. 
Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:53:05,969][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:53:08,354][1653645] Updated weights for policy 0, policy_version 882501 (0.0012) [2024-06-15 22:53:10,276][1653645] Updated weights for policy 0, policy_version 882581 (0.0139) [2024-06-15 22:53:10,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 47513.6, 300 sec: 44209.2). Total num frames: 1807581184. Throughput: 0: 11696.4. Samples: 451960832. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:53:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:53:12,479][1653645] Updated weights for policy 0, policy_version 882628 (0.0012) [2024-06-15 22:53:14,156][1653645] Updated weights for policy 0, policy_version 882707 (0.0167) [2024-06-15 22:53:14,834][1653645] Updated weights for policy 0, policy_version 882752 (0.0013) [2024-06-15 22:53:15,958][1648982] Fps is (10 sec: 52471.9, 60 sec: 46422.7, 300 sec: 44431.2). Total num frames: 1807876096. Throughput: 0: 11754.7. Samples: 452018688. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:53:15,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:53:20,719][1653645] Updated weights for policy 0, policy_version 882805 (0.0012) [2024-06-15 22:53:20,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 45875.2, 300 sec: 44209.7). Total num frames: 1808007168. Throughput: 0: 11628.1. Samples: 452062208. Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:53:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:53:22,410][1653645] Updated weights for policy 0, policy_version 882877 (0.0015) [2024-06-15 22:53:24,968][1653645] Updated weights for policy 0, policy_version 882946 (0.0012) [2024-06-15 22:53:25,958][1648982] Fps is (10 sec: 45875.4, 60 sec: 48059.7, 300 sec: 44209.0). Total num frames: 1808334848. Throughput: 0: 11639.4. Samples: 452123648. 
Policy #0 lag: (min: 15.0, avg: 78.1, max: 271.0) [2024-06-15 22:53:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:53:26,254][1653645] Updated weights for policy 0, policy_version 883002 (0.0013) [2024-06-15 22:53:30,958][1648982] Fps is (10 sec: 42598.9, 60 sec: 44241.1, 300 sec: 44209.0). Total num frames: 1808433152. Throughput: 0: 11630.1. Samples: 452202496. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:53:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:53:31,149][1653645] Updated weights for policy 0, policy_version 883042 (0.0013) [2024-06-15 22:53:31,801][1653645] Updated weights for policy 0, policy_version 883072 (0.0010) [2024-06-15 22:53:35,490][1653645] Updated weights for policy 0, policy_version 883153 (0.0031) [2024-06-15 22:53:35,958][1648982] Fps is (10 sec: 39321.9, 60 sec: 46967.6, 300 sec: 44209.0). Total num frames: 1808728064. Throughput: 0: 11628.1. Samples: 452231168. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:53:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:53:36,720][1653645] Updated weights for policy 0, policy_version 883205 (0.0021) [2024-06-15 22:53:40,958][1648982] Fps is (10 sec: 49151.4, 60 sec: 44782.9, 300 sec: 44431.9). Total num frames: 1808924672. Throughput: 0: 11434.7. Samples: 452289536. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:53:40,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:53:42,096][1653645] Updated weights for policy 0, policy_version 883265 (0.0013) [2024-06-15 22:53:42,431][1651596] Signal inference workers to stop experience collection... (45900 times) [2024-06-15 22:53:42,463][1653645] InferenceWorker_p0-w0: stopping experience collection (45900 times) [2024-06-15 22:53:42,754][1651596] Signal inference workers to resume experience collection... 
(45900 times) [2024-06-15 22:53:42,755][1653645] InferenceWorker_p0-w0: resuming experience collection (45900 times) [2024-06-15 22:53:43,474][1653645] Updated weights for policy 0, policy_version 883326 (0.0011) [2024-06-15 22:53:45,827][1653645] Updated weights for policy 0, policy_version 883378 (0.0172) [2024-06-15 22:53:45,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 47519.7, 300 sec: 44320.1). Total num frames: 1809154048. Throughput: 0: 11582.6. Samples: 452367872. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:53:45,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:53:48,015][1653645] Updated weights for policy 0, policy_version 883446 (0.0012) [2024-06-15 22:53:49,288][1653645] Updated weights for policy 0, policy_version 883504 (0.0013) [2024-06-15 22:53:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 46421.3, 300 sec: 44431.9). Total num frames: 1809448960. Throughput: 0: 11493.7. Samples: 452393984. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:53:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:53:53,989][1653645] Updated weights for policy 0, policy_version 883568 (0.0091) [2024-06-15 22:53:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 45875.4, 300 sec: 44653.3). Total num frames: 1809580032. Throughput: 0: 11241.2. Samples: 452466688. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:53:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:53:56,936][1653645] Updated weights for policy 0, policy_version 883632 (0.0013) [2024-06-15 22:53:59,452][1653645] Updated weights for policy 0, policy_version 883681 (0.0018) [2024-06-15 22:54:00,543][1653645] Updated weights for policy 0, policy_version 883744 (0.0012) [2024-06-15 22:54:00,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 47513.6, 300 sec: 44320.1). Total num frames: 1809940480. Throughput: 0: 11468.8. Samples: 452534784. 
Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:54:05,416][1653645] Updated weights for policy 0, policy_version 883835 (0.0128) [2024-06-15 22:54:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 45881.6, 300 sec: 44875.5). Total num frames: 1810104320. Throughput: 0: 11377.8. Samples: 452574208. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:54:08,707][1653645] Updated weights for policy 0, policy_version 883896 (0.0012) [2024-06-15 22:54:10,958][1648982] Fps is (10 sec: 32768.1, 60 sec: 44783.0, 300 sec: 44431.2). Total num frames: 1810268160. Throughput: 0: 11377.8. Samples: 452635648. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:10,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:54:11,136][1653645] Updated weights for policy 0, policy_version 883936 (0.0013) [2024-06-15 22:54:13,267][1653645] Updated weights for policy 0, policy_version 884029 (0.0014) [2024-06-15 22:54:15,958][1648982] Fps is (10 sec: 39321.4, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1810497536. Throughput: 0: 11161.6. Samples: 452704768. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:15,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:54:17,679][1653645] Updated weights for policy 0, policy_version 884096 (0.0012) [2024-06-15 22:54:20,958][1648982] Fps is (10 sec: 49151.8, 60 sec: 45875.2, 300 sec: 44764.4). Total num frames: 1810759680. Throughput: 0: 11229.9. Samples: 452736512. 
Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:54:22,635][1653645] Updated weights for policy 0, policy_version 884166 (0.0015) [2024-06-15 22:54:24,781][1653645] Updated weights for policy 0, policy_version 884256 (0.0016) [2024-06-15 22:54:24,902][1651596] Signal inference workers to stop experience collection... (45950 times) [2024-06-15 22:54:24,936][1653645] InferenceWorker_p0-w0: stopping experience collection (45950 times) [2024-06-15 22:54:25,211][1651596] Signal inference workers to resume experience collection... (45950 times) [2024-06-15 22:54:25,211][1653645] InferenceWorker_p0-w0: resuming experience collection (45950 times) [2024-06-15 22:54:25,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 44782.9, 300 sec: 44431.2). Total num frames: 1811021824. Throughput: 0: 11241.2. Samples: 452795392. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:54:28,944][1653645] Updated weights for policy 0, policy_version 884304 (0.0012) [2024-06-15 22:54:30,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 45329.0, 300 sec: 44875.5). Total num frames: 1811152896. Throughput: 0: 10990.9. Samples: 452862464. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:54:31,972][1653645] Updated weights for policy 0, policy_version 884387 (0.0016) [2024-06-15 22:54:34,527][1653645] Updated weights for policy 0, policy_version 884432 (0.0054) [2024-06-15 22:54:35,958][1648982] Fps is (10 sec: 39322.1, 60 sec: 44782.9, 300 sec: 44653.3). Total num frames: 1811415040. Throughput: 0: 11355.0. Samples: 452904960. 
Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:54:36,191][1653645] Updated weights for policy 0, policy_version 884496 (0.0013) [2024-06-15 22:54:40,575][1653645] Updated weights for policy 0, policy_version 884563 (0.0015) [2024-06-15 22:54:40,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.0, 300 sec: 44653.3). Total num frames: 1811611648. Throughput: 0: 11116.1. Samples: 452966912. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:54:43,112][1653645] Updated weights for policy 0, policy_version 884640 (0.0037) [2024-06-15 22:54:45,958][1648982] Fps is (10 sec: 39321.8, 60 sec: 44236.8, 300 sec: 44875.5). Total num frames: 1811808256. Throughput: 0: 10991.0. Samples: 453029376. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:54:47,632][1653645] Updated weights for policy 0, policy_version 884705 (0.0014) [2024-06-15 22:54:49,787][1653645] Updated weights for policy 0, policy_version 884794 (0.0013) [2024-06-15 22:54:50,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 43690.7, 300 sec: 44431.2). Total num frames: 1812070400. Throughput: 0: 10808.9. Samples: 453060608. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:54:55,805][1653645] Updated weights for policy 0, policy_version 884896 (0.0013) [2024-06-15 22:54:55,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44783.0, 300 sec: 45097.6). Total num frames: 1812267008. Throughput: 0: 10843.0. Samples: 453123584. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:54:55,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:54:56,476][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000884928_1812332544.pth... 
[2024-06-15 22:54:56,532][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000879680_1801584640.pth [2024-06-15 22:55:00,268][1653645] Updated weights for policy 0, policy_version 884965 (0.0014) [2024-06-15 22:55:00,958][1648982] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 44653.4). Total num frames: 1812463616. Throughput: 0: 10752.0. Samples: 453188608. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:55:00,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:55:01,659][1653645] Updated weights for policy 0, policy_version 885024 (0.0026) [2024-06-15 22:55:05,042][1653645] Updated weights for policy 0, policy_version 885108 (0.0014) [2024-06-15 22:55:05,958][1648982] Fps is (10 sec: 45875.6, 60 sec: 43690.7, 300 sec: 44875.5). Total num frames: 1812725760. Throughput: 0: 10843.0. Samples: 453224448. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:55:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:55:06,950][1653645] Updated weights for policy 0, policy_version 885122 (0.0017) [2024-06-15 22:55:08,217][1653645] Updated weights for policy 0, policy_version 885182 (0.0012) [2024-06-15 22:55:10,957][1648982] Fps is (10 sec: 39322.0, 60 sec: 43144.6, 300 sec: 44875.5). Total num frames: 1812856832. Throughput: 0: 10979.6. Samples: 453289472. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:55:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:55:13,043][1653645] Updated weights for policy 0, policy_version 885254 (0.0012) [2024-06-15 22:55:13,608][1651596] Signal inference workers to stop experience collection... (46000 times) [2024-06-15 22:55:13,654][1653645] InferenceWorker_p0-w0: stopping experience collection (46000 times) [2024-06-15 22:55:13,808][1651596] Signal inference workers to resume experience collection... 
(46000 times) [2024-06-15 22:55:13,809][1653645] InferenceWorker_p0-w0: resuming experience collection (46000 times) [2024-06-15 22:55:15,544][1653645] Updated weights for policy 0, policy_version 885328 (0.0144) [2024-06-15 22:55:15,958][1648982] Fps is (10 sec: 42597.6, 60 sec: 44236.7, 300 sec: 44542.2). Total num frames: 1813151744. Throughput: 0: 10934.0. Samples: 453354496. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:55:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:55:16,681][1653645] Updated weights for policy 0, policy_version 885371 (0.0016) [2024-06-15 22:55:19,868][1653645] Updated weights for policy 0, policy_version 885408 (0.0012) [2024-06-15 22:55:20,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 43690.7, 300 sec: 45208.7). Total num frames: 1813381120. Throughput: 0: 10820.3. Samples: 453391872. Policy #0 lag: (min: 15.0, avg: 92.5, max: 271.0) [2024-06-15 22:55:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:55:22,821][1653645] Updated weights for policy 0, policy_version 885462 (0.0017) [2024-06-15 22:55:25,096][1653645] Updated weights for policy 0, policy_version 885536 (0.0012) [2024-06-15 22:55:25,958][1648982] Fps is (10 sec: 49151.2, 60 sec: 43690.5, 300 sec: 44653.3). Total num frames: 1813643264. Throughput: 0: 11093.3. Samples: 453466112. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:55:25,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:55:27,094][1653645] Updated weights for policy 0, policy_version 885600 (0.0016) [2024-06-15 22:55:30,958][1648982] Fps is (10 sec: 42598.4, 60 sec: 44236.8, 300 sec: 44986.6). Total num frames: 1813807104. Throughput: 0: 11093.3. Samples: 453528576. 
Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:55:30,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 22:55:30,972][1653645] Updated weights for policy 0, policy_version 885664 (0.0011) [2024-06-15 22:55:34,535][1653645] Updated weights for policy 0, policy_version 885729 (0.0013) [2024-06-15 22:55:35,958][1648982] Fps is (10 sec: 39322.7, 60 sec: 43690.6, 300 sec: 44544.9). Total num frames: 1814036480. Throughput: 0: 11184.3. Samples: 453563904. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:55:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:55:36,974][1653645] Updated weights for policy 0, policy_version 885792 (0.0101) [2024-06-15 22:55:39,148][1653645] Updated weights for policy 0, policy_version 885880 (0.0014) [2024-06-15 22:55:40,958][1648982] Fps is (10 sec: 49152.0, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1814298624. Throughput: 0: 11070.6. Samples: 453621760. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:55:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:55:43,324][1653645] Updated weights for policy 0, policy_version 885936 (0.0048) [2024-06-15 22:55:45,958][1648982] Fps is (10 sec: 42598.3, 60 sec: 44236.7, 300 sec: 44542.3). Total num frames: 1814462464. Throughput: 0: 11264.0. Samples: 453695488. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:55:45,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 22:55:46,572][1653645] Updated weights for policy 0, policy_version 886000 (0.0011) [2024-06-15 22:55:48,780][1653645] Updated weights for policy 0, policy_version 886048 (0.0013) [2024-06-15 22:55:50,958][1648982] Fps is (10 sec: 49152.1, 60 sec: 45329.1, 300 sec: 44764.4). Total num frames: 1814790144. Throughput: 0: 11332.3. Samples: 453734400. 
Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:55:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:55:50,995][1653645] Updated weights for policy 0, policy_version 886135 (0.0013) [2024-06-15 22:55:54,608][1653645] Updated weights for policy 0, policy_version 886200 (0.0128) [2024-06-15 22:55:55,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 44783.0, 300 sec: 44986.6). Total num frames: 1814953984. Throughput: 0: 11207.1. Samples: 453793792. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:55:55,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:55:58,338][1653645] Updated weights for policy 0, policy_version 886266 (0.0016) [2024-06-15 22:56:00,569][1651596] Signal inference workers to stop experience collection... (46050 times) [2024-06-15 22:56:00,629][1653645] InferenceWorker_p0-w0: stopping experience collection (46050 times) [2024-06-15 22:56:00,785][1651596] Signal inference workers to resume experience collection... (46050 times) [2024-06-15 22:56:00,785][1653645] InferenceWorker_p0-w0: resuming experience collection (46050 times) [2024-06-15 22:56:00,960][1648982] Fps is (10 sec: 39321.2, 60 sec: 45329.0, 300 sec: 44320.1). Total num frames: 1815183360. Throughput: 0: 11468.8. Samples: 453870592. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:00,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 22:56:01,120][1653645] Updated weights for policy 0, policy_version 886336 (0.0013) [2024-06-15 22:56:05,228][1653645] Updated weights for policy 0, policy_version 886401 (0.0021) [2024-06-15 22:56:05,958][1648982] Fps is (10 sec: 45874.8, 60 sec: 44782.8, 300 sec: 45097.7). Total num frames: 1815412736. Throughput: 0: 11184.3. Samples: 453895168. 
Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:05,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 22:56:06,168][1653645] Updated weights for policy 0, policy_version 886452 (0.0013) [2024-06-15 22:56:09,983][1653645] Updated weights for policy 0, policy_version 886496 (0.0015) [2024-06-15 22:56:10,958][1648982] Fps is (10 sec: 42599.2, 60 sec: 45875.2, 300 sec: 44542.3). Total num frames: 1815609344. Throughput: 0: 11264.1. Samples: 453972992. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:56:11,698][1653645] Updated weights for policy 0, policy_version 886531 (0.0013) [2024-06-15 22:56:13,144][1653645] Updated weights for policy 0, policy_version 886592 (0.0013) [2024-06-15 22:56:15,958][1648982] Fps is (10 sec: 45875.5, 60 sec: 45329.2, 300 sec: 44875.5). Total num frames: 1815871488. Throughput: 0: 11229.9. Samples: 454033920. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:15,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:56:16,168][1653645] Updated weights for policy 0, policy_version 886658 (0.0016) [2024-06-15 22:56:20,618][1653645] Updated weights for policy 0, policy_version 886721 (0.0013) [2024-06-15 22:56:20,958][1648982] Fps is (10 sec: 42597.8, 60 sec: 44236.8, 300 sec: 44653.4). Total num frames: 1816035328. Throughput: 0: 11173.0. Samples: 454066688. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:56:21,849][1653645] Updated weights for policy 0, policy_version 886779 (0.0013) [2024-06-15 22:56:24,538][1653645] Updated weights for policy 0, policy_version 886833 (0.0014) [2024-06-15 22:56:25,958][1648982] Fps is (10 sec: 45875.1, 60 sec: 44783.1, 300 sec: 44653.3). Total num frames: 1816330240. Throughput: 0: 11423.3. Samples: 454135808. 
Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:25,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 22:56:27,995][1653645] Updated weights for policy 0, policy_version 886913 (0.0015) [2024-06-15 22:56:29,463][1653645] Updated weights for policy 0, policy_version 886973 (0.0012) [2024-06-15 22:56:30,958][1648982] Fps is (10 sec: 49151.9, 60 sec: 45329.0, 300 sec: 45211.9). Total num frames: 1816526848. Throughput: 0: 11184.4. Samples: 454198784. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:56:33,370][1653645] Updated weights for policy 0, policy_version 887010 (0.0012) [2024-06-15 22:56:35,958][1648982] Fps is (10 sec: 36043.9, 60 sec: 44236.6, 300 sec: 44653.3). Total num frames: 1816690688. Throughput: 0: 11127.4. Samples: 454235136. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:56:36,262][1653645] Updated weights for policy 0, policy_version 887072 (0.0012) [2024-06-15 22:56:38,318][1653645] Updated weights for policy 0, policy_version 887164 (0.0013) [2024-06-15 22:56:40,958][1648982] Fps is (10 sec: 49152.3, 60 sec: 45329.1, 300 sec: 45208.7). Total num frames: 1817018368. Throughput: 0: 11184.4. Samples: 454297088. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 22:56:40,987][1653645] Updated weights for policy 0, policy_version 887226 (0.0013) [2024-06-15 22:56:45,848][1653645] Updated weights for policy 0, policy_version 887288 (0.0017) [2024-06-15 22:56:45,958][1648982] Fps is (10 sec: 45876.2, 60 sec: 44782.9, 300 sec: 44875.5). Total num frames: 1817149440. Throughput: 0: 10945.4. Samples: 454363136. 
Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:45,958][1648982] Avg episode reward: [(0, '37.250')] [2024-06-15 22:56:47,723][1651596] Signal inference workers to stop experience collection... (46100 times) [2024-06-15 22:56:47,774][1653645] InferenceWorker_p0-w0: stopping experience collection (46100 times) [2024-06-15 22:56:47,973][1651596] Signal inference workers to resume experience collection... (46100 times) [2024-06-15 22:56:47,974][1653645] InferenceWorker_p0-w0: resuming experience collection (46100 times) [2024-06-15 22:56:48,508][1653645] Updated weights for policy 0, policy_version 887335 (0.0013) [2024-06-15 22:56:49,714][1653645] Updated weights for policy 0, policy_version 887392 (0.0016) [2024-06-15 22:56:50,958][1648982] Fps is (10 sec: 42597.0, 60 sec: 44236.5, 300 sec: 44875.5). Total num frames: 1817444352. Throughput: 0: 11218.4. Samples: 454400000. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:56:52,714][1653645] Updated weights for policy 0, policy_version 887449 (0.0013) [2024-06-15 22:56:55,958][1648982] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 45097.6). Total num frames: 1817575424. Throughput: 0: 11013.6. Samples: 454468608. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:56:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:56:55,996][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000887488_1817575424.pth... 
[2024-06-15 22:56:56,288][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000882240_1806827520.pth [2024-06-15 22:56:56,352][1653645] Updated weights for policy 0, policy_version 887489 (0.0017) [2024-06-15 22:56:59,118][1653645] Updated weights for policy 0, policy_version 887568 (0.0118) [2024-06-15 22:57:00,356][1653645] Updated weights for policy 0, policy_version 887632 (0.0015) [2024-06-15 22:57:00,958][1648982] Fps is (10 sec: 45876.6, 60 sec: 45329.1, 300 sec: 44764.6). Total num frames: 1817903104. Throughput: 0: 11070.6. Samples: 454532096. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:57:00,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:57:01,297][1653645] Updated weights for policy 0, policy_version 887675 (0.0017) [2024-06-15 22:57:04,559][1653645] Updated weights for policy 0, policy_version 887713 (0.0015) [2024-06-15 22:57:05,958][1648982] Fps is (10 sec: 52429.6, 60 sec: 44783.0, 300 sec: 45319.8). Total num frames: 1818099712. Throughput: 0: 11218.5. Samples: 454571520. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:57:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:57:07,833][1653645] Updated weights for policy 0, policy_version 887761 (0.0013) [2024-06-15 22:57:10,464][1653645] Updated weights for policy 0, policy_version 887840 (0.0013) [2024-06-15 22:57:10,958][1648982] Fps is (10 sec: 42596.2, 60 sec: 45328.6, 300 sec: 44875.7). Total num frames: 1818329088. Throughput: 0: 11286.6. Samples: 454643712. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:57:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:57:11,501][1653645] Updated weights for policy 0, policy_version 887891 (0.0081) [2024-06-15 22:57:14,178][1653645] Updated weights for policy 0, policy_version 887952 (0.0067) [2024-06-15 22:57:15,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 45875.2, 300 sec: 45319.8). Total num frames: 1818624000. 
Throughput: 0: 11673.6. Samples: 454724096. Policy #0 lag: (min: 34.0, avg: 125.7, max: 290.0) [2024-06-15 22:57:15,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:57:18,142][1653645] Updated weights for policy 0, policy_version 888032 (0.0014) [2024-06-15 22:57:20,032][1653645] Updated weights for policy 0, policy_version 888082 (0.0036) [2024-06-15 22:57:20,958][1648982] Fps is (10 sec: 55708.6, 60 sec: 47513.6, 300 sec: 45542.0). Total num frames: 1818886144. Throughput: 0: 11810.2. Samples: 454766592. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:57:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 22:57:22,236][1653645] Updated weights for policy 0, policy_version 888192 (0.0014) [2024-06-15 22:57:24,955][1653645] Updated weights for policy 0, policy_version 888248 (0.0015) [2024-06-15 22:57:25,958][1648982] Fps is (10 sec: 52426.9, 60 sec: 46967.3, 300 sec: 45320.7). Total num frames: 1819148288. Throughput: 0: 12049.0. Samples: 454839296. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:57:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:57:27,501][1651596] Signal inference workers to stop experience collection... (46150 times) [2024-06-15 22:57:27,528][1653645] InferenceWorker_p0-w0: stopping experience collection (46150 times) [2024-06-15 22:57:27,649][1651596] Signal inference workers to resume experience collection... (46150 times) [2024-06-15 22:57:27,649][1653645] InferenceWorker_p0-w0: resuming experience collection (46150 times) [2024-06-15 22:57:28,029][1653645] Updated weights for policy 0, policy_version 888311 (0.0015) [2024-06-15 22:57:29,042][1653645] Updated weights for policy 0, policy_version 888339 (0.0012) [2024-06-15 22:57:30,670][1653645] Updated weights for policy 0, policy_version 888421 (0.0102) [2024-06-15 22:57:30,958][1648982] Fps is (10 sec: 62259.1, 60 sec: 49698.2, 300 sec: 46097.4). Total num frames: 1819508736. Throughput: 0: 12572.5. 
Samples: 454928896. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:57:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:57:33,492][1653645] Updated weights for policy 0, policy_version 888466 (0.0019) [2024-06-15 22:57:35,958][1648982] Fps is (10 sec: 52430.6, 60 sec: 49698.4, 300 sec: 45542.0). Total num frames: 1819672576. Throughput: 0: 12618.0. Samples: 454967808. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:57:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 22:57:36,199][1653645] Updated weights for policy 0, policy_version 888528 (0.0013) [2024-06-15 22:57:38,424][1653645] Updated weights for policy 0, policy_version 888592 (0.0208) [2024-06-15 22:57:40,097][1653645] Updated weights for policy 0, policy_version 888673 (0.0039) [2024-06-15 22:57:40,958][1648982] Fps is (10 sec: 55705.5, 60 sec: 50790.4, 300 sec: 46654.0). Total num frames: 1820065792. Throughput: 0: 12959.3. Samples: 455051776. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:57:40,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:57:43,245][1653645] Updated weights for policy 0, policy_version 888765 (0.0018) [2024-06-15 22:57:45,958][1648982] Fps is (10 sec: 55704.4, 60 sec: 51336.4, 300 sec: 45986.2). Total num frames: 1820229632. Throughput: 0: 13528.1. Samples: 455140864. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:57:45,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:57:46,356][1653645] Updated weights for policy 0, policy_version 888816 (0.0012) [2024-06-15 22:57:47,902][1653645] Updated weights for policy 0, policy_version 888892 (0.0011) [2024-06-15 22:57:50,094][1653645] Updated weights for policy 0, policy_version 888947 (0.0021) [2024-06-15 22:57:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 52429.1, 300 sec: 46652.8). Total num frames: 1820590080. Throughput: 0: 13596.5. Samples: 455183360. 
Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:57:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:57:52,121][1653645] Updated weights for policy 0, policy_version 888978 (0.0049) [2024-06-15 22:57:54,957][1653645] Updated weights for policy 0, policy_version 889040 (0.0013) [2024-06-15 22:57:55,958][1648982] Fps is (10 sec: 62258.2, 60 sec: 54613.1, 300 sec: 46652.7). Total num frames: 1820852224. Throughput: 0: 13915.1. Samples: 455269888. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:57:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:57:56,972][1653645] Updated weights for policy 0, policy_version 889127 (0.0078) [2024-06-15 22:57:59,351][1653645] Updated weights for policy 0, policy_version 889187 (0.0013) [2024-06-15 22:58:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 53521.1, 300 sec: 46654.1). Total num frames: 1821114368. Throughput: 0: 13880.9. Samples: 455348736. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 22:58:02,088][1653645] Updated weights for policy 0, policy_version 889248 (0.0012) [2024-06-15 22:58:04,743][1651596] Signal inference workers to stop experience collection... (46200 times) [2024-06-15 22:58:04,810][1653645] InferenceWorker_p0-w0: stopping experience collection (46200 times) [2024-06-15 22:58:04,923][1651596] Signal inference workers to resume experience collection... (46200 times) [2024-06-15 22:58:04,924][1653645] InferenceWorker_p0-w0: resuming experience collection (46200 times) [2024-06-15 22:58:04,927][1653645] Updated weights for policy 0, policy_version 889328 (0.0014) [2024-06-15 22:58:05,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 54613.3, 300 sec: 46763.8). Total num frames: 1821376512. Throughput: 0: 13801.2. Samples: 455387648. 
Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:58:06,853][1653645] Updated weights for policy 0, policy_version 889377 (0.0031) [2024-06-15 22:58:09,230][1653645] Updated weights for policy 0, policy_version 889466 (0.0012) [2024-06-15 22:58:10,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 55159.9, 300 sec: 46652.8). Total num frames: 1821638656. Throughput: 0: 14028.9. Samples: 455470592. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 22:58:12,401][1653645] Updated weights for policy 0, policy_version 889529 (0.0014) [2024-06-15 22:58:14,825][1653645] Updated weights for policy 0, policy_version 889572 (0.0012) [2024-06-15 22:58:15,958][1648982] Fps is (10 sec: 52429.2, 60 sec: 54613.3, 300 sec: 47097.1). Total num frames: 1821900800. Throughput: 0: 13664.7. Samples: 455543808. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:15,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:58:16,637][1653645] Updated weights for policy 0, policy_version 889621 (0.0015) [2024-06-15 22:58:17,208][1653645] Updated weights for policy 0, policy_version 889661 (0.0011) [2024-06-15 22:58:18,999][1653645] Updated weights for policy 0, policy_version 889723 (0.0012) [2024-06-15 22:58:20,958][1648982] Fps is (10 sec: 55704.6, 60 sec: 55159.2, 300 sec: 46986.0). Total num frames: 1822195712. Throughput: 0: 13880.8. Samples: 455592448. 
Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:20,959][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 22:58:21,398][1653645] Updated weights for policy 0, policy_version 889766 (0.0012) [2024-06-15 22:58:23,380][1653645] Updated weights for policy 0, policy_version 889808 (0.0012) [2024-06-15 22:58:25,528][1653645] Updated weights for policy 0, policy_version 889888 (0.0098) [2024-06-15 22:58:25,958][1648982] Fps is (10 sec: 62258.4, 60 sec: 56251.9, 300 sec: 47763.5). Total num frames: 1822523392. Throughput: 0: 13949.1. Samples: 455679488. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:25,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:58:27,941][1653645] Updated weights for policy 0, policy_version 889925 (0.0015) [2024-06-15 22:58:28,874][1653645] Updated weights for policy 0, policy_version 889984 (0.0017) [2024-06-15 22:58:30,646][1653645] Updated weights for policy 0, policy_version 890039 (0.0011) [2024-06-15 22:58:30,958][1648982] Fps is (10 sec: 62260.9, 60 sec: 55159.5, 300 sec: 47763.5). Total num frames: 1822818304. Throughput: 0: 13744.4. Samples: 455759360. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:58:33,815][1653645] Updated weights for policy 0, policy_version 890096 (0.0034) [2024-06-15 22:58:35,508][1653645] Updated weights for policy 0, policy_version 890176 (0.0013) [2024-06-15 22:58:35,958][1648982] Fps is (10 sec: 55705.6, 60 sec: 56797.7, 300 sec: 47985.7). Total num frames: 1823080448. Throughput: 0: 13824.0. Samples: 455805440. 
Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:35,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 22:58:38,943][1653645] Updated weights for policy 0, policy_version 890231 (0.0015) [2024-06-15 22:58:40,067][1653645] Updated weights for policy 0, policy_version 890272 (0.0013) [2024-06-15 22:58:40,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 54613.3, 300 sec: 48096.7). Total num frames: 1823342592. Throughput: 0: 13585.1. Samples: 455881216. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 22:58:42,687][1653645] Updated weights for policy 0, policy_version 890307 (0.0011) [2024-06-15 22:58:43,684][1653645] Updated weights for policy 0, policy_version 890361 (0.0024) [2024-06-15 22:58:44,550][1651596] Signal inference workers to stop experience collection... (46250 times) [2024-06-15 22:58:44,575][1653645] InferenceWorker_p0-w0: stopping experience collection (46250 times) [2024-06-15 22:58:44,697][1651596] Signal inference workers to resume experience collection... (46250 times) [2024-06-15 22:58:44,698][1653645] InferenceWorker_p0-w0: resuming experience collection (46250 times) [2024-06-15 22:58:45,363][1653645] Updated weights for policy 0, policy_version 890419 (0.0111) [2024-06-15 22:58:45,957][1648982] Fps is (10 sec: 52430.3, 60 sec: 56252.0, 300 sec: 47985.7). Total num frames: 1823604736. Throughput: 0: 13528.2. Samples: 455957504. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:58:48,490][1653645] Updated weights for policy 0, policy_version 890469 (0.0013) [2024-06-15 22:58:49,492][1653645] Updated weights for policy 0, policy_version 890515 (0.0013) [2024-06-15 22:58:50,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 54613.0, 300 sec: 48430.0). Total num frames: 1823866880. Throughput: 0: 13607.8. Samples: 456000000. 
Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:50,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 22:58:52,671][1653645] Updated weights for policy 0, policy_version 890561 (0.0070) [2024-06-15 22:58:53,669][1653645] Updated weights for policy 0, policy_version 890624 (0.0016) [2024-06-15 22:58:54,902][1653645] Updated weights for policy 0, policy_version 890686 (0.0100) [2024-06-15 22:58:55,975][1648982] Fps is (10 sec: 52336.4, 60 sec: 54597.7, 300 sec: 48093.9). Total num frames: 1824129024. Throughput: 0: 13466.1. Samples: 456076800. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:58:55,976][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 22:58:55,981][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000890688_1824129024.pth... [2024-06-15 22:58:56,036][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000884928_1812332544.pth [2024-06-15 22:58:59,245][1653645] Updated weights for policy 0, policy_version 890753 (0.0099) [2024-06-15 22:59:00,093][1653645] Updated weights for policy 0, policy_version 890809 (0.0010) [2024-06-15 22:59:00,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 54613.3, 300 sec: 48430.0). Total num frames: 1824391168. Throughput: 0: 13687.5. Samples: 456159744. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:59:00,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 22:59:03,004][1653645] Updated weights for policy 0, policy_version 890866 (0.0011) [2024-06-15 22:59:04,895][1653645] Updated weights for policy 0, policy_version 890915 (0.0014) [2024-06-15 22:59:05,958][1648982] Fps is (10 sec: 52521.0, 60 sec: 54613.4, 300 sec: 48763.2). Total num frames: 1824653312. Throughput: 0: 13482.8. Samples: 456199168. 
Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:59:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:59:07,397][1653645] Updated weights for policy 0, policy_version 890964 (0.0012) [2024-06-15 22:59:08,481][1653645] Updated weights for policy 0, policy_version 891024 (0.0011) [2024-06-15 22:59:10,958][1648982] Fps is (10 sec: 52427.7, 60 sec: 54613.2, 300 sec: 48874.3). Total num frames: 1824915456. Throughput: 0: 13471.3. Samples: 456285696. Policy #0 lag: (min: 1.0, avg: 96.3, max: 257.0) [2024-06-15 22:59:10,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:59:11,279][1653645] Updated weights for policy 0, policy_version 891090 (0.0013) [2024-06-15 22:59:12,734][1653645] Updated weights for policy 0, policy_version 891138 (0.0021) [2024-06-15 22:59:13,778][1653645] Updated weights for policy 0, policy_version 891200 (0.0012) [2024-06-15 22:59:15,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 54613.4, 300 sec: 48874.3). Total num frames: 1825177600. Throughput: 0: 13619.2. Samples: 456372224. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:15,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 22:59:18,153][1653645] Updated weights for policy 0, policy_version 891280 (0.0012) [2024-06-15 22:59:20,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 54067.3, 300 sec: 48874.3). Total num frames: 1825439744. Throughput: 0: 13471.3. Samples: 456411648. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:20,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 22:59:21,097][1653645] Updated weights for policy 0, policy_version 891333 (0.0013) [2024-06-15 22:59:21,954][1651596] Signal inference workers to stop experience collection... (46300 times) [2024-06-15 22:59:21,994][1653645] InferenceWorker_p0-w0: stopping experience collection (46300 times) [2024-06-15 22:59:22,210][1651596] Signal inference workers to resume experience collection... 
(46300 times) [2024-06-15 22:59:22,211][1653645] InferenceWorker_p0-w0: resuming experience collection (46300 times) [2024-06-15 22:59:22,357][1653645] Updated weights for policy 0, policy_version 891394 (0.0013) [2024-06-15 22:59:23,586][1653645] Updated weights for policy 0, policy_version 891456 (0.0013) [2024-06-15 22:59:25,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 52974.9, 300 sec: 49318.6). Total num frames: 1825701888. Throughput: 0: 13437.1. Samples: 456485888. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:59:28,246][1653645] Updated weights for policy 0, policy_version 891536 (0.0011) [2024-06-15 22:59:30,958][1648982] Fps is (10 sec: 52430.2, 60 sec: 52428.9, 300 sec: 49318.6). Total num frames: 1825964032. Throughput: 0: 13539.5. Samples: 456566784. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:59:31,766][1653645] Updated weights for policy 0, policy_version 891600 (0.0012) [2024-06-15 22:59:32,991][1653645] Updated weights for policy 0, policy_version 891656 (0.0010) [2024-06-15 22:59:35,958][1648982] Fps is (10 sec: 52429.8, 60 sec: 52428.9, 300 sec: 49540.8). Total num frames: 1826226176. Throughput: 0: 13551.0. Samples: 456609792. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 22:59:36,498][1653645] Updated weights for policy 0, policy_version 891715 (0.0013) [2024-06-15 22:59:38,336][1653645] Updated weights for policy 0, policy_version 891792 (0.0015) [2024-06-15 22:59:38,989][1653645] Updated weights for policy 0, policy_version 891837 (0.0012) [2024-06-15 22:59:40,958][1648982] Fps is (10 sec: 55705.2, 60 sec: 52975.0, 300 sec: 49874.0). Total num frames: 1826521088. Throughput: 0: 13624.5. Samples: 456689664. 
Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:40,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 22:59:41,674][1653645] Updated weights for policy 0, policy_version 891889 (0.0012) [2024-06-15 22:59:43,107][1653645] Updated weights for policy 0, policy_version 891967 (0.0012) [2024-06-15 22:59:45,958][1648982] Fps is (10 sec: 62259.3, 60 sec: 54067.1, 300 sec: 50096.2). Total num frames: 1826848768. Throughput: 0: 13710.2. Samples: 456776704. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:45,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 22:59:47,952][1653645] Updated weights for policy 0, policy_version 892035 (0.0012) [2024-06-15 22:59:49,167][1653645] Updated weights for policy 0, policy_version 892096 (0.0011) [2024-06-15 22:59:50,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 52975.2, 300 sec: 50096.2). Total num frames: 1827045376. Throughput: 0: 13630.6. Samples: 456812544. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 22:59:51,443][1653645] Updated weights for policy 0, policy_version 892158 (0.0012) [2024-06-15 22:59:53,103][1653645] Updated weights for policy 0, policy_version 892222 (0.0015) [2024-06-15 22:59:55,784][1653645] Updated weights for policy 0, policy_version 892275 (0.0014) [2024-06-15 22:59:55,958][1648982] Fps is (10 sec: 55705.1, 60 sec: 54629.2, 300 sec: 50651.6). Total num frames: 1827405824. Throughput: 0: 13585.1. Samples: 456897024. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 22:59:55,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 22:59:58,195][1653645] Updated weights for policy 0, policy_version 892321 (0.0012) [2024-06-15 22:59:59,802][1653645] Updated weights for policy 0, policy_version 892368 (0.0019) [2024-06-15 22:59:59,925][1651596] Signal inference workers to stop experience collection... 
(46350 times) [2024-06-15 22:59:59,962][1653645] InferenceWorker_p0-w0: stopping experience collection (46350 times) [2024-06-15 23:00:00,105][1651596] Signal inference workers to resume experience collection... (46350 times) [2024-06-15 23:00:00,106][1653645] InferenceWorker_p0-w0: resuming experience collection (46350 times) [2024-06-15 23:00:00,958][1648982] Fps is (10 sec: 62259.6, 60 sec: 54613.4, 300 sec: 50651.6). Total num frames: 1827667968. Throughput: 0: 13414.4. Samples: 456975872. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:00:02,070][1653645] Updated weights for policy 0, policy_version 892433 (0.0015) [2024-06-15 23:00:04,563][1653645] Updated weights for policy 0, policy_version 892496 (0.0015) [2024-06-15 23:00:05,305][1653645] Updated weights for policy 0, policy_version 892540 (0.0013) [2024-06-15 23:00:05,957][1648982] Fps is (10 sec: 52429.8, 60 sec: 54613.4, 300 sec: 51095.9). Total num frames: 1827930112. Throughput: 0: 13505.5. Samples: 457019392. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:00:08,150][1653645] Updated weights for policy 0, policy_version 892592 (0.0021) [2024-06-15 23:00:09,746][1653645] Updated weights for policy 0, policy_version 892656 (0.0013) [2024-06-15 23:00:10,958][1648982] Fps is (10 sec: 52428.2, 60 sec: 54613.5, 300 sec: 50984.8). Total num frames: 1828192256. Throughput: 0: 13755.8. Samples: 457104896. 
Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:00:11,709][1653645] Updated weights for policy 0, policy_version 892707 (0.0010) [2024-06-15 23:00:12,144][1653645] Updated weights for policy 0, policy_version 892734 (0.0010) [2024-06-15 23:00:14,047][1653645] Updated weights for policy 0, policy_version 892784 (0.0014) [2024-06-15 23:00:15,958][1648982] Fps is (10 sec: 52427.2, 60 sec: 54613.1, 300 sec: 51095.8). Total num frames: 1828454400. Throughput: 0: 13755.7. Samples: 457185792. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:15,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 23:00:18,306][1653645] Updated weights for policy 0, policy_version 892863 (0.0012) [2024-06-15 23:00:19,708][1653645] Updated weights for policy 0, policy_version 892918 (0.0022) [2024-06-15 23:00:20,958][1648982] Fps is (10 sec: 58982.9, 60 sec: 55705.8, 300 sec: 51318.1). Total num frames: 1828782080. Throughput: 0: 13664.7. Samples: 457224704. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:00:20,967][1653645] Updated weights for policy 0, policy_version 892964 (0.0014) [2024-06-15 23:00:23,656][1653645] Updated weights for policy 0, policy_version 893026 (0.0011) [2024-06-15 23:00:25,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 54613.0, 300 sec: 51429.0). Total num frames: 1828978688. Throughput: 0: 13812.5. Samples: 457311232. 
Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:25,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:00:27,273][1653645] Updated weights for policy 0, policy_version 893093 (0.0011) [2024-06-15 23:00:29,249][1653645] Updated weights for policy 0, policy_version 893157 (0.0012) [2024-06-15 23:00:30,768][1653645] Updated weights for policy 0, policy_version 893242 (0.0013) [2024-06-15 23:00:30,958][1648982] Fps is (10 sec: 58982.4, 60 sec: 56797.8, 300 sec: 51984.5). Total num frames: 1829371904. Throughput: 0: 13539.6. Samples: 457385984. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:30,958][1648982] Avg episode reward: [(0, '37.240')] [2024-06-15 23:00:33,809][1653645] Updated weights for policy 0, policy_version 893312 (0.0012) [2024-06-15 23:00:35,958][1648982] Fps is (10 sec: 52431.5, 60 sec: 54613.3, 300 sec: 51540.2). Total num frames: 1829502976. Throughput: 0: 13573.7. Samples: 457423360. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:35,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:00:37,395][1653645] Updated weights for policy 0, policy_version 893353 (0.0033) [2024-06-15 23:00:38,445][1653645] Updated weights for policy 0, policy_version 893384 (0.0011) [2024-06-15 23:00:38,648][1651596] Signal inference workers to stop experience collection... (46400 times) [2024-06-15 23:00:38,692][1653645] InferenceWorker_p0-w0: stopping experience collection (46400 times) [2024-06-15 23:00:38,793][1651596] Signal inference workers to resume experience collection... (46400 times) [2024-06-15 23:00:38,793][1653645] InferenceWorker_p0-w0: resuming experience collection (46400 times) [2024-06-15 23:00:39,698][1653645] Updated weights for policy 0, policy_version 893441 (0.0014) [2024-06-15 23:00:40,670][1653645] Updated weights for policy 0, policy_version 893494 (0.0023) [2024-06-15 23:00:40,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 56251.5, 300 sec: 52317.7). 
Total num frames: 1829896192. Throughput: 0: 13687.4. Samples: 457512960. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:00:42,461][1653645] Updated weights for policy 0, policy_version 893545 (0.0012) [2024-06-15 23:00:45,952][1653645] Updated weights for policy 0, policy_version 893577 (0.0012) [2024-06-15 23:00:45,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 52974.9, 300 sec: 51651.3). Total num frames: 1830027264. Throughput: 0: 13892.3. Samples: 457601024. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:00:47,632][1653645] Updated weights for policy 0, policy_version 893648 (0.0011) [2024-06-15 23:00:49,108][1653645] Updated weights for policy 0, policy_version 893703 (0.0076) [2024-06-15 23:00:50,958][1648982] Fps is (10 sec: 52429.9, 60 sec: 56251.7, 300 sec: 52428.8). Total num frames: 1830420480. Throughput: 0: 13732.9. Samples: 457637376. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:00:51,521][1653645] Updated weights for policy 0, policy_version 893762 (0.0014) [2024-06-15 23:00:55,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 52428.8, 300 sec: 52095.6). Total num frames: 1830551552. Throughput: 0: 13619.2. Samples: 457717760. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:00:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:00:56,049][1653645] Updated weights for policy 0, policy_version 893840 (0.0013) [2024-06-15 23:00:56,454][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000893856_1830617088.pth... 
[2024-06-15 23:00:56,565][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000887488_1817575424.pth [2024-06-15 23:00:57,625][1653645] Updated weights for policy 0, policy_version 893920 (0.0012) [2024-06-15 23:00:59,084][1653645] Updated weights for policy 0, policy_version 893968 (0.0011) [2024-06-15 23:01:00,960][1648982] Fps is (10 sec: 52416.2, 60 sec: 54611.1, 300 sec: 52650.5). Total num frames: 1830944768. Throughput: 0: 13584.4. Samples: 457797120. Policy #0 lag: (min: 15.0, avg: 123.8, max: 271.0) [2024-06-15 23:01:00,961][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:01:01,359][1653645] Updated weights for policy 0, policy_version 894018 (0.0015) [2024-06-15 23:01:02,597][1653645] Updated weights for policy 0, policy_version 894080 (0.0013) [2024-06-15 23:01:05,958][1648982] Fps is (10 sec: 55706.1, 60 sec: 52974.8, 300 sec: 52539.9). Total num frames: 1831108608. Throughput: 0: 13607.8. Samples: 457837056. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:01:06,565][1653645] Updated weights for policy 0, policy_version 894140 (0.0148) [2024-06-15 23:01:08,459][1653645] Updated weights for policy 0, policy_version 894208 (0.0014) [2024-06-15 23:01:10,670][1653645] Updated weights for policy 0, policy_version 894273 (0.0013) [2024-06-15 23:01:10,958][1648982] Fps is (10 sec: 55718.4, 60 sec: 55159.4, 300 sec: 52984.2). Total num frames: 1831501824. Throughput: 0: 13551.0. Samples: 457921024. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:01:14,660][1653645] Updated weights for policy 0, policy_version 894337 (0.0012) [2024-06-15 23:01:15,649][1653645] Updated weights for policy 0, policy_version 894393 (0.0030) [2024-06-15 23:01:15,958][1648982] Fps is (10 sec: 62258.8, 60 sec: 54613.5, 300 sec: 53206.3). Total num frames: 1831731200. 
Throughput: 0: 13698.8. Samples: 458002432. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 23:01:17,793][1651596] Signal inference workers to stop experience collection... (46450 times) [2024-06-15 23:01:17,818][1653645] InferenceWorker_p0-w0: stopping experience collection (46450 times) [2024-06-15 23:01:18,031][1651596] Signal inference workers to resume experience collection... (46450 times) [2024-06-15 23:01:18,032][1653645] InferenceWorker_p0-w0: resuming experience collection (46450 times) [2024-06-15 23:01:18,166][1653645] Updated weights for policy 0, policy_version 894458 (0.0092) [2024-06-15 23:01:19,512][1653645] Updated weights for policy 0, policy_version 894519 (0.0011) [2024-06-15 23:01:20,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 54067.1, 300 sec: 53206.3). Total num frames: 1832026112. Throughput: 0: 13698.8. Samples: 458039808. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:20,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:01:21,159][1653645] Updated weights for policy 0, policy_version 894560 (0.0020) [2024-06-15 23:01:21,753][1653645] Updated weights for policy 0, policy_version 894592 (0.0016) [2024-06-15 23:01:25,575][1653645] Updated weights for policy 0, policy_version 894648 (0.0014) [2024-06-15 23:01:25,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 54613.7, 300 sec: 53317.4). Total num frames: 1832255488. Throughput: 0: 13528.2. Samples: 458121728. 
Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:01:27,087][1653645] Updated weights for policy 0, policy_version 894689 (0.0010) [2024-06-15 23:01:28,202][1653645] Updated weights for policy 0, policy_version 894724 (0.0039) [2024-06-15 23:01:29,160][1653645] Updated weights for policy 0, policy_version 894784 (0.0012) [2024-06-15 23:01:30,958][1648982] Fps is (10 sec: 58981.8, 60 sec: 54067.0, 300 sec: 53983.9). Total num frames: 1832615936. Throughput: 0: 13425.7. Samples: 458205184. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:30,959][1648982] Avg episode reward: [(0, '37.510')] [2024-06-15 23:01:31,088][1653645] Updated weights for policy 0, policy_version 894848 (0.0144) [2024-06-15 23:01:35,260][1653645] Updated weights for policy 0, policy_version 894907 (0.0012) [2024-06-15 23:01:35,958][1648982] Fps is (10 sec: 52427.6, 60 sec: 54613.0, 300 sec: 53428.4). Total num frames: 1832779776. Throughput: 0: 13596.3. Samples: 458249216. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:01:36,842][1653645] Updated weights for policy 0, policy_version 894960 (0.0033) [2024-06-15 23:01:39,614][1653645] Updated weights for policy 0, policy_version 895032 (0.0118) [2024-06-15 23:01:40,728][1653645] Updated weights for policy 0, policy_version 895088 (0.0014) [2024-06-15 23:01:40,958][1648982] Fps is (10 sec: 52429.5, 60 sec: 54067.3, 300 sec: 54206.1). Total num frames: 1833140224. Throughput: 0: 13403.0. Samples: 458320896. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:01:45,426][1653645] Updated weights for policy 0, policy_version 895136 (0.0011) [2024-06-15 23:01:45,994][1648982] Fps is (10 sec: 48974.7, 60 sec: 54034.3, 300 sec: 53644.1). Total num frames: 1833271296. 
Throughput: 0: 13461.1. Samples: 458403328. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:45,995][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:01:46,732][1653645] Updated weights for policy 0, policy_version 895200 (0.0012) [2024-06-15 23:01:48,933][1653645] Updated weights for policy 0, policy_version 895253 (0.0013) [2024-06-15 23:01:50,243][1653645] Updated weights for policy 0, policy_version 895299 (0.0012) [2024-06-15 23:01:50,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 54067.2, 300 sec: 54539.3). Total num frames: 1833664512. Throughput: 0: 13391.6. Samples: 458439680. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:50,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:01:51,151][1653645] Updated weights for policy 0, policy_version 895360 (0.0115) [2024-06-15 23:01:55,958][1648982] Fps is (10 sec: 49332.1, 60 sec: 53521.1, 300 sec: 53761.7). Total num frames: 1833762816. Throughput: 0: 13300.6. Samples: 458519552. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:01:55,962][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:01:56,681][1653645] Updated weights for policy 0, policy_version 895440 (0.0013) [2024-06-15 23:01:56,754][1651596] Signal inference workers to stop experience collection... (46500 times) [2024-06-15 23:01:56,812][1653645] InferenceWorker_p0-w0: stopping experience collection (46500 times) [2024-06-15 23:01:56,984][1651596] Signal inference workers to resume experience collection... (46500 times) [2024-06-15 23:01:56,994][1653645] InferenceWorker_p0-w0: resuming experience collection (46500 times) [2024-06-15 23:01:58,485][1653645] Updated weights for policy 0, policy_version 895506 (0.0011) [2024-06-15 23:02:00,618][1653645] Updated weights for policy 0, policy_version 895584 (0.0075) [2024-06-15 23:02:00,958][1648982] Fps is (10 sec: 49152.2, 60 sec: 53523.2, 300 sec: 54428.2). Total num frames: 1834156032. Throughput: 0: 13118.6. 
Samples: 458592768. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:02:05,382][1653645] Updated weights for policy 0, policy_version 895619 (0.0011) [2024-06-15 23:02:05,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 52975.0, 300 sec: 54095.1). Total num frames: 1834287104. Throughput: 0: 13209.6. Samples: 458634240. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:02:06,985][1653645] Updated weights for policy 0, policy_version 895698 (0.0029) [2024-06-15 23:02:08,213][1653645] Updated weights for policy 0, policy_version 895762 (0.0014) [2024-06-15 23:02:10,606][1653645] Updated weights for policy 0, policy_version 895840 (0.0121) [2024-06-15 23:02:10,958][1648982] Fps is (10 sec: 55705.5, 60 sec: 53521.2, 300 sec: 54539.3). Total num frames: 1834713088. Throughput: 0: 13221.0. Samples: 458716672. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:10,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:02:14,996][1653645] Updated weights for policy 0, policy_version 895878 (0.0013) [2024-06-15 23:02:15,958][1648982] Fps is (10 sec: 58981.5, 60 sec: 52428.8, 300 sec: 54206.0). Total num frames: 1834876928. Throughput: 0: 13277.9. Samples: 458802688. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:02:16,070][1653645] Updated weights for policy 0, policy_version 895940 (0.0013) [2024-06-15 23:02:16,935][1653645] Updated weights for policy 0, policy_version 895994 (0.0014) [2024-06-15 23:02:17,810][1653645] Updated weights for policy 0, policy_version 896035 (0.0012) [2024-06-15 23:02:19,665][1653645] Updated weights for policy 0, policy_version 896082 (0.0011) [2024-06-15 23:02:20,958][1648982] Fps is (10 sec: 55705.2, 60 sec: 54067.2, 300 sec: 54650.4). 
Total num frames: 1835270144. Throughput: 0: 13164.2. Samples: 458841600. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:02:23,725][1653645] Updated weights for policy 0, policy_version 896129 (0.0012) [2024-06-15 23:02:25,006][1653645] Updated weights for policy 0, policy_version 896193 (0.0015) [2024-06-15 23:02:25,958][1648982] Fps is (10 sec: 62259.3, 60 sec: 54067.2, 300 sec: 54206.0). Total num frames: 1835499520. Throughput: 0: 13676.1. Samples: 458936320. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:02:26,298][1653645] Updated weights for policy 0, policy_version 896262 (0.0013) [2024-06-15 23:02:28,666][1653645] Updated weights for policy 0, policy_version 896336 (0.0012) [2024-06-15 23:02:30,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 52975.1, 300 sec: 54650.4). Total num frames: 1835794432. Throughput: 0: 13561.9. Samples: 459013120. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:02:33,317][1653645] Updated weights for policy 0, policy_version 896390 (0.0012) [2024-06-15 23:02:33,535][1651596] Signal inference workers to stop experience collection... (46550 times) [2024-06-15 23:02:33,602][1653645] InferenceWorker_p0-w0: stopping experience collection (46550 times) [2024-06-15 23:02:33,700][1651596] Signal inference workers to resume experience collection... (46550 times) [2024-06-15 23:02:33,701][1653645] InferenceWorker_p0-w0: resuming experience collection (46550 times) [2024-06-15 23:02:34,358][1653645] Updated weights for policy 0, policy_version 896449 (0.0095) [2024-06-15 23:02:35,217][1653645] Updated weights for policy 0, policy_version 896508 (0.0013) [2024-06-15 23:02:35,958][1648982] Fps is (10 sec: 58983.1, 60 sec: 55159.8, 300 sec: 54317.1). Total num frames: 1836089344. 
Throughput: 0: 13824.0. Samples: 459061760. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:02:36,623][1653645] Updated weights for policy 0, policy_version 896574 (0.0013) [2024-06-15 23:02:38,338][1653645] Updated weights for policy 0, policy_version 896637 (0.0011) [2024-06-15 23:02:40,958][1648982] Fps is (10 sec: 52428.1, 60 sec: 52974.9, 300 sec: 54539.3). Total num frames: 1836318720. Throughput: 0: 13892.2. Samples: 459144704. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:02:43,420][1653645] Updated weights for policy 0, policy_version 896691 (0.0011) [2024-06-15 23:02:44,769][1653645] Updated weights for policy 0, policy_version 896761 (0.0079) [2024-06-15 23:02:45,909][1653645] Updated weights for policy 0, policy_version 896803 (0.0016) [2024-06-15 23:02:45,958][1648982] Fps is (10 sec: 55705.1, 60 sec: 56286.0, 300 sec: 54428.2). Total num frames: 1836646400. Throughput: 0: 14006.0. Samples: 459223040. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:02:47,617][1653645] Updated weights for policy 0, policy_version 896866 (0.0015) [2024-06-15 23:02:50,958][1648982] Fps is (10 sec: 52429.1, 60 sec: 52974.9, 300 sec: 54206.1). Total num frames: 1836843008. Throughput: 0: 13915.0. Samples: 459260416. 
Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:50,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:02:52,118][1653645] Updated weights for policy 0, policy_version 896912 (0.0012) [2024-06-15 23:02:53,526][1653645] Updated weights for policy 0, policy_version 896977 (0.0013) [2024-06-15 23:02:54,356][1653645] Updated weights for policy 0, policy_version 897022 (0.0011) [2024-06-15 23:02:55,679][1653645] Updated weights for policy 0, policy_version 897082 (0.0013) [2024-06-15 23:02:55,958][1648982] Fps is (10 sec: 58982.0, 60 sec: 57890.0, 300 sec: 54650.3). Total num frames: 1837236224. Throughput: 0: 13994.6. Samples: 459346432. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:02:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:02:55,964][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000897088_1837236224.pth... [2024-06-15 23:02:56,030][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000890688_1824129024.pth [2024-06-15 23:02:57,717][1653645] Updated weights for policy 0, policy_version 897136 (0.0015) [2024-06-15 23:03:00,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 53521.0, 300 sec: 54206.1). Total num frames: 1837367296. Throughput: 0: 13994.7. Samples: 459432448. Policy #0 lag: (min: 11.0, avg: 93.8, max: 267.0) [2024-06-15 23:03:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:03:02,164][1653645] Updated weights for policy 0, policy_version 897184 (0.0010) [2024-06-15 23:03:03,789][1653645] Updated weights for policy 0, policy_version 897255 (0.0013) [2024-06-15 23:03:04,833][1653645] Updated weights for policy 0, policy_version 897307 (0.0079) [2024-06-15 23:03:05,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 57890.0, 300 sec: 54650.4). Total num frames: 1837760512. Throughput: 0: 13926.4. Samples: 459468288. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:05,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:03:07,298][1653645] Updated weights for policy 0, policy_version 897345 (0.0013) [2024-06-15 23:03:07,857][1651596] Signal inference workers to stop experience collection... (46600 times) [2024-06-15 23:03:07,913][1653645] InferenceWorker_p0-w0: stopping experience collection (46600 times) [2024-06-15 23:03:08,043][1651596] Signal inference workers to resume experience collection... (46600 times) [2024-06-15 23:03:08,044][1653645] InferenceWorker_p0-w0: resuming experience collection (46600 times) [2024-06-15 23:03:08,347][1653645] Updated weights for policy 0, policy_version 897408 (0.0010) [2024-06-15 23:03:10,958][1648982] Fps is (10 sec: 52427.1, 60 sec: 52974.6, 300 sec: 54206.0). Total num frames: 1837891584. Throughput: 0: 13630.5. Samples: 459549696. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:03:12,514][1653645] Updated weights for policy 0, policy_version 897472 (0.0026) [2024-06-15 23:03:13,615][1653645] Updated weights for policy 0, policy_version 897531 (0.0097) [2024-06-15 23:03:14,743][1653645] Updated weights for policy 0, policy_version 897584 (0.0013) [2024-06-15 23:03:15,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 56797.7, 300 sec: 54539.3). Total num frames: 1838284800. Throughput: 0: 13573.6. Samples: 459623936. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:15,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:03:17,692][1653645] Updated weights for policy 0, policy_version 897637 (0.0151) [2024-06-15 23:03:20,958][1648982] Fps is (10 sec: 52431.0, 60 sec: 52428.9, 300 sec: 53872.9). Total num frames: 1838415872. Throughput: 0: 13505.4. Samples: 459669504. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:03:21,018][1653645] Updated weights for policy 0, policy_version 897680 (0.0013) [2024-06-15 23:03:22,496][1653645] Updated weights for policy 0, policy_version 897760 (0.0012) [2024-06-15 23:03:23,910][1653645] Updated weights for policy 0, policy_version 897824 (0.0012) [2024-06-15 23:03:25,958][1648982] Fps is (10 sec: 52430.1, 60 sec: 55159.5, 300 sec: 54206.0). Total num frames: 1838809088. Throughput: 0: 13494.1. Samples: 459751936. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:03:27,234][1653645] Updated weights for policy 0, policy_version 897891 (0.0087) [2024-06-15 23:03:30,958][1648982] Fps is (10 sec: 58981.1, 60 sec: 53521.0, 300 sec: 53983.9). Total num frames: 1839005696. Throughput: 0: 13755.7. Samples: 459842048. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:30,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:03:30,997][1653645] Updated weights for policy 0, policy_version 897968 (0.0012) [2024-06-15 23:03:32,452][1653645] Updated weights for policy 0, policy_version 898048 (0.0014) [2024-06-15 23:03:34,293][1653645] Updated weights for policy 0, policy_version 898096 (0.0012) [2024-06-15 23:03:35,958][1648982] Fps is (10 sec: 52429.0, 60 sec: 54067.2, 300 sec: 54206.1). Total num frames: 1839333376. Throughput: 0: 13698.9. Samples: 459876864. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:35,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:03:36,539][1653645] Updated weights for policy 0, policy_version 898130 (0.0013) [2024-06-15 23:03:39,979][1653645] Updated weights for policy 0, policy_version 898192 (0.0074) [2024-06-15 23:03:40,957][1648982] Fps is (10 sec: 58983.6, 60 sec: 54613.5, 300 sec: 54206.0). Total num frames: 1839595520. 
Throughput: 0: 13642.0. Samples: 459960320. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:40,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:03:41,192][1653645] Updated weights for policy 0, policy_version 898258 (0.0011) [2024-06-15 23:03:43,416][1653645] Updated weights for policy 0, policy_version 898320 (0.0100) [2024-06-15 23:03:43,750][1651596] Signal inference workers to stop experience collection... (46650 times) [2024-06-15 23:03:43,806][1653645] InferenceWorker_p0-w0: stopping experience collection (46650 times) [2024-06-15 23:03:44,056][1651596] Signal inference workers to resume experience collection... (46650 times) [2024-06-15 23:03:44,056][1653645] InferenceWorker_p0-w0: resuming experience collection (46650 times) [2024-06-15 23:03:45,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 53521.0, 300 sec: 54206.1). Total num frames: 1839857664. Throughput: 0: 13459.9. Samples: 460038144. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:45,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 23:03:45,997][1653645] Updated weights for policy 0, policy_version 898371 (0.0014) [2024-06-15 23:03:47,114][1653645] Updated weights for policy 0, policy_version 898430 (0.0012) [2024-06-15 23:03:50,805][1653645] Updated weights for policy 0, policy_version 898499 (0.0012) [2024-06-15 23:03:50,958][1648982] Fps is (10 sec: 52428.3, 60 sec: 54613.4, 300 sec: 54209.3). Total num frames: 1840119808. Throughput: 0: 13539.6. Samples: 460077568. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:50,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:03:53,944][1653645] Updated weights for policy 0, policy_version 898592 (0.0012) [2024-06-15 23:03:55,958][1648982] Fps is (10 sec: 52428.6, 60 sec: 52428.8, 300 sec: 54206.0). Total num frames: 1840381952. Throughput: 0: 13460.0. Samples: 460155392. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:03:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:03:59,922][1653645] Updated weights for policy 0, policy_version 898692 (0.0013) [2024-06-15 23:04:00,958][1648982] Fps is (10 sec: 52428.9, 60 sec: 54613.3, 300 sec: 54206.0). Total num frames: 1840644096. Throughput: 0: 13551.0. Samples: 460233728. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:04:01,359][1653645] Updated weights for policy 0, policy_version 898773 (0.0083) [2024-06-15 23:04:03,561][1653645] Updated weights for policy 0, policy_version 898821 (0.0012) [2024-06-15 23:04:05,958][1648982] Fps is (10 sec: 52428.4, 60 sec: 52428.7, 300 sec: 54206.1). Total num frames: 1840906240. Throughput: 0: 13448.5. Samples: 460274688. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:05,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:04:06,278][1653645] Updated weights for policy 0, policy_version 898896 (0.0032) [2024-06-15 23:04:09,743][1653645] Updated weights for policy 0, policy_version 898976 (0.0013) [2024-06-15 23:04:10,974][1648982] Fps is (10 sec: 58885.5, 60 sec: 55690.6, 300 sec: 54425.2). Total num frames: 1841233920. Throughput: 0: 13511.9. Samples: 460360192. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:10,974][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:04:11,128][1653645] Updated weights for policy 0, policy_version 899065 (0.0013) [2024-06-15 23:04:13,336][1653645] Updated weights for policy 0, policy_version 899105 (0.0022) [2024-06-15 23:04:15,532][1653645] Updated weights for policy 0, policy_version 899168 (0.0016) [2024-06-15 23:04:15,958][1648982] Fps is (10 sec: 62260.3, 60 sec: 54067.4, 300 sec: 54539.3). Total num frames: 1841528832. Throughput: 0: 13323.4. Samples: 460441600. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:04:18,821][1653645] Updated weights for policy 0, policy_version 899216 (0.0014) [2024-06-15 23:04:20,074][1653645] Updated weights for policy 0, policy_version 899266 (0.0013) [2024-06-15 23:04:20,958][1648982] Fps is (10 sec: 55797.5, 60 sec: 56251.7, 300 sec: 54539.3). Total num frames: 1841790976. Throughput: 0: 13437.2. Samples: 460481536. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:04:21,013][1653645] Updated weights for policy 0, policy_version 899320 (0.0112) [2024-06-15 23:04:22,336][1651596] Signal inference workers to stop experience collection... (46700 times) [2024-06-15 23:04:22,356][1653645] InferenceWorker_p0-w0: stopping experience collection (46700 times) [2024-06-15 23:04:22,582][1651596] Signal inference workers to resume experience collection... (46700 times) [2024-06-15 23:04:22,584][1653645] InferenceWorker_p0-w0: resuming experience collection (46700 times) [2024-06-15 23:04:24,099][1653645] Updated weights for policy 0, policy_version 899396 (0.0011) [2024-06-15 23:04:25,027][1653645] Updated weights for policy 0, policy_version 899454 (0.0012) [2024-06-15 23:04:25,958][1648982] Fps is (10 sec: 55705.7, 60 sec: 54613.4, 300 sec: 54650.4). Total num frames: 1842085888. Throughput: 0: 13494.0. Samples: 460567552. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:25,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:04:28,621][1653645] Updated weights for policy 0, policy_version 899513 (0.0014) [2024-06-15 23:04:30,599][1653645] Updated weights for policy 0, policy_version 899572 (0.0011) [2024-06-15 23:04:30,958][1648982] Fps is (10 sec: 55705.6, 60 sec: 55705.7, 300 sec: 54650.4). Total num frames: 1842348032. Throughput: 0: 13551.0. Samples: 460647936. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:04:32,146][1653645] Updated weights for policy 0, policy_version 899616 (0.0012) [2024-06-15 23:04:33,562][1653645] Updated weights for policy 0, policy_version 899654 (0.0010) [2024-06-15 23:04:34,385][1653645] Updated weights for policy 0, policy_version 899705 (0.0013) [2024-06-15 23:04:35,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 54613.3, 300 sec: 54539.3). Total num frames: 1842610176. Throughput: 0: 13607.8. Samples: 460689920. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:35,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:04:38,336][1653645] Updated weights for policy 0, policy_version 899768 (0.0014) [2024-06-15 23:04:40,222][1653645] Updated weights for policy 0, policy_version 899832 (0.0012) [2024-06-15 23:04:40,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 54613.2, 300 sec: 54317.1). Total num frames: 1842872320. Throughput: 0: 13744.4. Samples: 460773888. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:40,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:04:41,901][1653645] Updated weights for policy 0, policy_version 899872 (0.0013) [2024-06-15 23:04:43,356][1653645] Updated weights for policy 0, policy_version 899936 (0.0012) [2024-06-15 23:04:45,958][1648982] Fps is (10 sec: 52428.8, 60 sec: 54613.4, 300 sec: 54539.3). Total num frames: 1843134464. Throughput: 0: 13937.8. Samples: 460860928. 
Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:45,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:04:47,042][1653645] Updated weights for policy 0, policy_version 899970 (0.0013) [2024-06-15 23:04:48,040][1653645] Updated weights for policy 0, policy_version 900031 (0.0010) [2024-06-15 23:04:49,667][1653645] Updated weights for policy 0, policy_version 900091 (0.0012) [2024-06-15 23:04:50,957][1648982] Fps is (10 sec: 52429.5, 60 sec: 54613.4, 300 sec: 54206.1). Total num frames: 1843396608. Throughput: 0: 13881.0. Samples: 460899328. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:04:51,727][1653645] Updated weights for policy 0, policy_version 900160 (0.0106) [2024-06-15 23:04:52,617][1653645] Updated weights for policy 0, policy_version 900216 (0.0010) [2024-06-15 23:04:55,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 54613.4, 300 sec: 54206.0). Total num frames: 1843658752. Throughput: 0: 13851.8. Samples: 460983296. Policy #0 lag: (min: 15.0, avg: 88.8, max: 271.0) [2024-06-15 23:04:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:04:55,970][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000900224_1843658752.pth... [2024-06-15 23:04:56,054][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000893856_1830617088.pth [2024-06-15 23:04:57,321][1653645] Updated weights for policy 0, policy_version 900258 (0.0010) [2024-06-15 23:04:58,640][1653645] Updated weights for policy 0, policy_version 900291 (0.0016) [2024-06-15 23:05:00,340][1653645] Updated weights for policy 0, policy_version 900358 (0.0012) [2024-06-15 23:05:00,800][1651596] Signal inference workers to stop experience collection... 
(46750 times) [2024-06-15 23:05:00,857][1653645] InferenceWorker_p0-w0: stopping experience collection (46750 times) [2024-06-15 23:05:00,946][1651596] Signal inference workers to resume experience collection... (46750 times) [2024-06-15 23:05:00,952][1653645] InferenceWorker_p0-w0: resuming experience collection (46750 times) [2024-06-15 23:05:00,958][1648982] Fps is (10 sec: 58981.0, 60 sec: 55705.5, 300 sec: 54428.1). Total num frames: 1843986432. Throughput: 0: 13801.2. Samples: 461062656. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:05:01,139][1653645] Updated weights for policy 0, policy_version 900416 (0.0013) [2024-06-15 23:05:02,314][1653645] Updated weights for policy 0, policy_version 900480 (0.0014) [2024-06-15 23:05:05,958][1648982] Fps is (10 sec: 58982.6, 60 sec: 55705.8, 300 sec: 54428.2). Total num frames: 1844248576. Throughput: 0: 13937.8. Samples: 461108736. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:05:06,590][1653645] Updated weights for policy 0, policy_version 900544 (0.0013) [2024-06-15 23:05:09,256][1653645] Updated weights for policy 0, policy_version 900597 (0.0047) [2024-06-15 23:05:10,279][1653645] Updated weights for policy 0, policy_version 900656 (0.0013) [2024-06-15 23:05:10,958][1648982] Fps is (10 sec: 62260.2, 60 sec: 56267.2, 300 sec: 54761.5). Total num frames: 1844609024. Throughput: 0: 13937.8. Samples: 461194752. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:10,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:05:11,686][1653645] Updated weights for policy 0, policy_version 900729 (0.0015) [2024-06-15 23:05:15,306][1653645] Updated weights for policy 0, policy_version 900800 (0.0012) [2024-06-15 23:05:15,958][1648982] Fps is (10 sec: 58982.6, 60 sec: 55159.5, 300 sec: 54428.2). Total num frames: 1844838400. 
Throughput: 0: 14142.6. Samples: 461284352. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:05:18,125][1653645] Updated weights for policy 0, policy_version 900851 (0.0096) [2024-06-15 23:05:19,340][1653645] Updated weights for policy 0, policy_version 900928 (0.0102) [2024-06-15 23:05:20,895][1653645] Updated weights for policy 0, policy_version 900992 (0.0012) [2024-06-15 23:05:20,958][1648982] Fps is (10 sec: 62257.5, 60 sec: 57343.7, 300 sec: 55094.7). Total num frames: 1845231616. Throughput: 0: 14131.1. Samples: 461325824. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:20,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:05:24,984][1653645] Updated weights for policy 0, policy_version 901054 (0.0011) [2024-06-15 23:05:25,958][1648982] Fps is (10 sec: 52427.8, 60 sec: 54613.2, 300 sec: 54206.0). Total num frames: 1845362688. Throughput: 0: 14165.3. Samples: 461411328. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:05:27,125][1653645] Updated weights for policy 0, policy_version 901106 (0.0026) [2024-06-15 23:05:28,316][1653645] Updated weights for policy 0, policy_version 901178 (0.0012) [2024-06-15 23:05:29,741][1653645] Updated weights for policy 0, policy_version 901232 (0.0013) [2024-06-15 23:05:30,958][1648982] Fps is (10 sec: 52430.5, 60 sec: 56797.9, 300 sec: 55094.7). Total num frames: 1845755904. Throughput: 0: 14097.1. Samples: 461495296. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:05:33,005][1653645] Updated weights for policy 0, policy_version 901266 (0.0034) [2024-06-15 23:05:34,459][1653645] Updated weights for policy 0, policy_version 901328 (0.0010) [2024-06-15 23:05:34,990][1651596] Signal inference workers to stop experience collection... 
(46800 times) [2024-06-15 23:05:35,057][1653645] InferenceWorker_p0-w0: stopping experience collection (46800 times) [2024-06-15 23:05:35,136][1651596] Signal inference workers to resume experience collection... (46800 times) [2024-06-15 23:05:35,136][1653645] InferenceWorker_p0-w0: resuming experience collection (46800 times) [2024-06-15 23:05:35,264][1653645] Updated weights for policy 0, policy_version 901380 (0.0010) [2024-06-15 23:05:35,929][1653645] Updated weights for policy 0, policy_version 901436 (0.0011) [2024-06-15 23:05:35,957][1648982] Fps is (10 sec: 78645.2, 60 sec: 58982.5, 300 sec: 55094.7). Total num frames: 1846149120. Throughput: 0: 14438.4. Samples: 461549056. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:05:37,030][1653645] Updated weights for policy 0, policy_version 901488 (0.0096) [2024-06-15 23:05:39,719][1653645] Updated weights for policy 0, policy_version 901525 (0.0010) [2024-06-15 23:05:40,958][1648982] Fps is (10 sec: 65535.5, 60 sec: 58982.4, 300 sec: 55539.0). Total num frames: 1846411264. Throughput: 0: 15166.6. Samples: 461665792. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:05:41,552][1653645] Updated weights for policy 0, policy_version 901584 (0.0063) [2024-06-15 23:05:42,547][1653645] Updated weights for policy 0, policy_version 901650 (0.0019) [2024-06-15 23:05:43,872][1653645] Updated weights for policy 0, policy_version 901712 (0.0010) [2024-06-15 23:05:45,957][1648982] Fps is (10 sec: 65535.9, 60 sec: 61167.0, 300 sec: 55539.0). Total num frames: 1846804480. Throughput: 0: 15860.7. Samples: 461776384. 
Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:45,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:05:46,650][1653645] Updated weights for policy 0, policy_version 901764 (0.0010) [2024-06-15 23:05:47,512][1653645] Updated weights for policy 0, policy_version 901824 (0.0013) [2024-06-15 23:05:49,779][1653645] Updated weights for policy 0, policy_version 901908 (0.0080) [2024-06-15 23:05:50,957][1648982] Fps is (10 sec: 78644.6, 60 sec: 63351.5, 300 sec: 56427.6). Total num frames: 1847197696. Throughput: 0: 15951.7. Samples: 461826560. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:05:51,855][1653645] Updated weights for policy 0, policy_version 901984 (0.0011) [2024-06-15 23:05:54,039][1653645] Updated weights for policy 0, policy_version 902032 (0.0012) [2024-06-15 23:05:55,958][1648982] Fps is (10 sec: 65530.9, 60 sec: 63350.7, 300 sec: 55983.6). Total num frames: 1847459840. Throughput: 0: 16338.2. Samples: 461929984. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:05:55,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:05:56,088][1653645] Updated weights for policy 0, policy_version 902085 (0.0013) [2024-06-15 23:05:56,851][1653645] Updated weights for policy 0, policy_version 902135 (0.0075) [2024-06-15 23:05:57,770][1653645] Updated weights for policy 0, policy_version 902193 (0.0012) [2024-06-15 23:05:58,672][1653645] Updated weights for policy 0, policy_version 902224 (0.0010) [2024-06-15 23:05:59,213][1653645] Updated weights for policy 0, policy_version 902266 (0.0066) [2024-06-15 23:06:00,959][1648982] Fps is (10 sec: 65527.8, 60 sec: 64442.7, 300 sec: 56760.6). Total num frames: 1847853056. Throughput: 0: 16804.6. Samples: 462040576. 
Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:06:01,631][1653645] Updated weights for policy 0, policy_version 902333 (0.0011) [2024-06-15 23:06:03,576][1651596] Signal inference workers to stop experience collection... (46850 times) [2024-06-15 23:06:03,614][1653645] InferenceWorker_p0-w0: stopping experience collection (46850 times) [2024-06-15 23:06:03,784][1651596] Signal inference workers to resume experience collection... (46850 times) [2024-06-15 23:06:03,785][1653645] InferenceWorker_p0-w0: resuming experience collection (46850 times) [2024-06-15 23:06:04,201][1653645] Updated weights for policy 0, policy_version 902400 (0.0011) [2024-06-15 23:06:05,006][1653645] Updated weights for policy 0, policy_version 902459 (0.0012) [2024-06-15 23:06:05,958][1648982] Fps is (10 sec: 78646.7, 60 sec: 66628.0, 300 sec: 56760.8). Total num frames: 1848246272. Throughput: 0: 17112.2. Samples: 462095872. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:05,959][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:06:06,712][1653645] Updated weights for policy 0, policy_version 902512 (0.0013) [2024-06-15 23:06:09,218][1653645] Updated weights for policy 0, policy_version 902579 (0.0011) [2024-06-15 23:06:10,958][1648982] Fps is (10 sec: 65542.9, 60 sec: 64989.8, 300 sec: 56871.9). Total num frames: 1848508416. Throughput: 0: 17339.8. Samples: 462191616. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:06:11,537][1653645] Updated weights for policy 0, policy_version 902624 (0.0013) [2024-06-15 23:06:12,639][1653645] Updated weights for policy 0, policy_version 902689 (0.0011) [2024-06-15 23:06:14,305][1653645] Updated weights for policy 0, policy_version 902752 (0.0012) [2024-06-15 23:06:15,958][1648982] Fps is (10 sec: 65536.9, 60 sec: 67720.3, 300 sec: 57205.1). 
Total num frames: 1848901632. Throughput: 0: 17783.4. Samples: 462295552. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:15,959][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 23:06:16,383][1653645] Updated weights for policy 0, policy_version 902800 (0.0012) [2024-06-15 23:06:18,948][1653645] Updated weights for policy 0, policy_version 902880 (0.0012) [2024-06-15 23:06:19,855][1653645] Updated weights for policy 0, policy_version 902944 (0.0013) [2024-06-15 23:06:20,958][1648982] Fps is (10 sec: 78643.5, 60 sec: 67720.9, 300 sec: 57760.6). Total num frames: 1849294848. Throughput: 0: 17817.6. Samples: 462350848. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:20,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:06:21,381][1653645] Updated weights for policy 0, policy_version 902992 (0.0029) [2024-06-15 23:06:22,070][1653645] Updated weights for policy 0, policy_version 903040 (0.0011) [2024-06-15 23:06:24,410][1653645] Updated weights for policy 0, policy_version 903104 (0.0011) [2024-06-15 23:06:25,958][1648982] Fps is (10 sec: 65536.1, 60 sec: 69905.1, 300 sec: 57427.3). Total num frames: 1849556992. Throughput: 0: 17521.7. Samples: 462454272. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:25,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:06:27,276][1653645] Updated weights for policy 0, policy_version 903174 (0.0076) [2024-06-15 23:06:27,865][1653645] Updated weights for policy 0, policy_version 903219 (0.0020) [2024-06-15 23:06:28,940][1653645] Updated weights for policy 0, policy_version 903251 (0.0010) [2024-06-15 23:06:29,456][1653645] Updated weights for policy 0, policy_version 903296 (0.0010) [2024-06-15 23:06:30,958][1648982] Fps is (10 sec: 65536.1, 60 sec: 69905.0, 300 sec: 58204.9). Total num frames: 1849950208. Throughput: 0: 17533.1. Samples: 462565376. 
Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:30,958][1648982] Avg episode reward: [(0, '37.590')] [2024-06-15 23:06:31,319][1651596] Saving new best policy, reward=37.590! [2024-06-15 23:06:31,918][1653645] Updated weights for policy 0, policy_version 903360 (0.0016) [2024-06-15 23:06:33,539][1651596] Signal inference workers to stop experience collection... (46900 times) [2024-06-15 23:06:33,573][1653645] InferenceWorker_p0-w0: stopping experience collection (46900 times) [2024-06-15 23:06:33,699][1651596] Signal inference workers to resume experience collection... (46900 times) [2024-06-15 23:06:33,699][1653645] InferenceWorker_p0-w0: resuming experience collection (46900 times) [2024-06-15 23:06:34,363][1653645] Updated weights for policy 0, policy_version 903424 (0.0014) [2024-06-15 23:06:35,958][1648982] Fps is (10 sec: 78643.9, 60 sec: 69904.9, 300 sec: 58315.9). Total num frames: 1850343424. Throughput: 0: 17578.6. Samples: 462617600. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:35,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:06:36,149][1653645] Updated weights for policy 0, policy_version 903504 (0.0014) [2024-06-15 23:06:38,818][1653645] Updated weights for policy 0, policy_version 903568 (0.0011) [2024-06-15 23:06:40,958][1648982] Fps is (10 sec: 65535.5, 60 sec: 69905.0, 300 sec: 58767.5). Total num frames: 1850605568. Throughput: 0: 17590.3. Samples: 462721536. 
Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:40,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:06:41,123][1653645] Updated weights for policy 0, policy_version 903634 (0.0013) [2024-06-15 23:06:41,858][1653645] Updated weights for policy 0, policy_version 903684 (0.0012) [2024-06-15 23:06:42,715][1653645] Updated weights for policy 0, policy_version 903741 (0.0012) [2024-06-15 23:06:43,706][1653645] Updated weights for policy 0, policy_version 903801 (0.0038) [2024-06-15 23:06:45,958][1648982] Fps is (10 sec: 65536.4, 60 sec: 69905.0, 300 sec: 58760.3). Total num frames: 1850998784. Throughput: 0: 17340.2. Samples: 462820864. Policy #0 lag: (min: 15.0, avg: 89.3, max: 271.0) [2024-06-15 23:06:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:06:46,898][1653645] Updated weights for policy 0, policy_version 903856 (0.0011) [2024-06-15 23:06:48,192][1653645] Updated weights for policy 0, policy_version 903904 (0.0013) [2024-06-15 23:06:48,925][1653645] Updated weights for policy 0, policy_version 903938 (0.0011) [2024-06-15 23:06:49,640][1653645] Updated weights for policy 0, policy_version 904000 (0.0011) [2024-06-15 23:06:50,959][1648982] Fps is (10 sec: 91737.9, 60 sec: 72087.7, 300 sec: 60204.0). Total num frames: 1851523072. Throughput: 0: 17555.5. Samples: 462885888. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:06:50,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:06:53,463][1653645] Updated weights for policy 0, policy_version 904083 (0.0014) [2024-06-15 23:06:54,846][1653645] Updated weights for policy 0, policy_version 904136 (0.0012) [2024-06-15 23:06:55,639][1653645] Updated weights for policy 0, policy_version 904192 (0.0012) [2024-06-15 23:06:55,958][1648982] Fps is (10 sec: 78642.2, 60 sec: 72090.4, 300 sec: 59759.9). Total num frames: 1851785216. Throughput: 0: 17794.8. Samples: 462992384. 
Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:06:55,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 23:06:55,972][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000904192_1851785216.pth... [2024-06-15 23:06:56,041][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000897088_1837236224.pth [2024-06-15 23:06:57,397][1653645] Updated weights for policy 0, policy_version 904248 (0.0010) [2024-06-15 23:06:58,242][1653645] Updated weights for policy 0, policy_version 904304 (0.0012) [2024-06-15 23:07:00,127][1653645] Updated weights for policy 0, policy_version 904338 (0.0010) [2024-06-15 23:07:00,958][1648982] Fps is (10 sec: 65544.8, 60 sec: 72090.8, 300 sec: 60648.5). Total num frames: 1852178432. Throughput: 0: 18170.3. Samples: 463113216. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:07:01,211][1651596] Signal inference workers to stop experience collection... (46950 times) [2024-06-15 23:07:01,247][1653645] InferenceWorker_p0-w0: stopping experience collection (46950 times) [2024-06-15 23:07:01,359][1651596] Signal inference workers to resume experience collection... (46950 times) [2024-06-15 23:07:01,367][1653645] InferenceWorker_p0-w0: resuming experience collection (46950 times) [2024-06-15 23:07:01,631][1653645] Updated weights for policy 0, policy_version 904416 (0.0095) [2024-06-15 23:07:04,119][1653645] Updated weights for policy 0, policy_version 904449 (0.0016) [2024-06-15 23:07:04,957][1653645] Updated weights for policy 0, policy_version 904512 (0.0012) [2024-06-15 23:07:05,958][1648982] Fps is (10 sec: 75367.0, 60 sec: 71543.8, 300 sec: 60426.4). Total num frames: 1852538880. Throughput: 0: 17942.8. Samples: 463158272. 
Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:05,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:07:06,084][1653645] Updated weights for policy 0, policy_version 904572 (0.0013) [2024-06-15 23:07:07,990][1653645] Updated weights for policy 0, policy_version 904614 (0.0062) [2024-06-15 23:07:09,140][1653645] Updated weights for policy 0, policy_version 904698 (0.0011) [2024-06-15 23:07:10,958][1648982] Fps is (10 sec: 65535.6, 60 sec: 72089.5, 300 sec: 60870.7). Total num frames: 1852833792. Throughput: 0: 18124.8. Samples: 463269888. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:10,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:07:11,500][1653645] Updated weights for policy 0, policy_version 904738 (0.0011) [2024-06-15 23:07:12,405][1653645] Updated weights for policy 0, policy_version 904787 (0.0012) [2024-06-15 23:07:15,012][1653645] Updated weights for policy 0, policy_version 904848 (0.0017) [2024-06-15 23:07:15,864][1653645] Updated weights for policy 0, policy_version 904901 (0.0011) [2024-06-15 23:07:15,958][1648982] Fps is (10 sec: 68813.1, 60 sec: 72089.8, 300 sec: 60870.7). Total num frames: 1853227008. Throughput: 0: 17988.3. Samples: 463374848. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:07:18,257][1653645] Updated weights for policy 0, policy_version 904961 (0.0011) [2024-06-15 23:07:19,114][1653645] Updated weights for policy 0, policy_version 905024 (0.0010) [2024-06-15 23:07:20,469][1653645] Updated weights for policy 0, policy_version 905087 (0.0011) [2024-06-15 23:07:20,958][1648982] Fps is (10 sec: 78645.1, 60 sec: 72089.7, 300 sec: 61426.1). Total num frames: 1853620224. Throughput: 0: 18147.6. Samples: 463434240. 
Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:20,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:07:23,060][1653645] Updated weights for policy 0, policy_version 905140 (0.0011) [2024-06-15 23:07:25,531][1653645] Updated weights for policy 0, policy_version 905219 (0.0011) [2024-06-15 23:07:25,958][1648982] Fps is (10 sec: 72088.9, 60 sec: 73181.9, 300 sec: 61537.2). Total num frames: 1853947904. Throughput: 0: 17988.3. Samples: 463531008. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:07:26,288][1653645] Updated weights for policy 0, policy_version 905278 (0.0010) [2024-06-15 23:07:28,201][1653645] Updated weights for policy 0, policy_version 905337 (0.0013) [2024-06-15 23:07:30,071][1653645] Updated weights for policy 0, policy_version 905392 (0.0012) [2024-06-15 23:07:30,454][1651596] Signal inference workers to stop experience collection... (47000 times) [2024-06-15 23:07:30,513][1653645] InferenceWorker_p0-w0: stopping experience collection (47000 times) [2024-06-15 23:07:30,538][1653645] Updated weights for policy 0, policy_version 905414 (0.0010) [2024-06-15 23:07:30,605][1651596] Signal inference workers to resume experience collection... (47000 times) [2024-06-15 23:07:30,606][1653645] InferenceWorker_p0-w0: resuming experience collection (47000 times) [2024-06-15 23:07:30,958][1648982] Fps is (10 sec: 72087.0, 60 sec: 73181.5, 300 sec: 61870.4). Total num frames: 1854341120. Throughput: 0: 18261.2. Samples: 463642624. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:30,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:07:32,659][1653645] Updated weights for policy 0, policy_version 905475 (0.0011) [2024-06-15 23:07:33,355][1653645] Updated weights for policy 0, policy_version 905534 (0.0011) [2024-06-15 23:07:35,958][1648982] Fps is (10 sec: 72090.2, 60 sec: 72089.7, 300 sec: 62203.7). 
Total num frames: 1854668800. Throughput: 0: 18000.2. Samples: 463695872. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:35,960][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:07:37,136][1653645] Updated weights for policy 0, policy_version 905620 (0.0095) [2024-06-15 23:07:37,595][1653645] Updated weights for policy 0, policy_version 905660 (0.0011) [2024-06-15 23:07:38,697][1653645] Updated weights for policy 0, policy_version 905723 (0.0012) [2024-06-15 23:07:40,842][1653645] Updated weights for policy 0, policy_version 905788 (0.0011) [2024-06-15 23:07:40,958][1648982] Fps is (10 sec: 72091.4, 60 sec: 74274.2, 300 sec: 62425.8). Total num frames: 1855062016. Throughput: 0: 17965.5. Samples: 463800832. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:07:43,018][1653645] Updated weights for policy 0, policy_version 905827 (0.0030) [2024-06-15 23:07:43,868][1653645] Updated weights for policy 0, policy_version 905872 (0.0092) [2024-06-15 23:07:44,454][1653645] Updated weights for policy 0, policy_version 905913 (0.0011) [2024-06-15 23:07:45,958][1648982] Fps is (10 sec: 75364.3, 60 sec: 73727.7, 300 sec: 62981.2). Total num frames: 1855422464. Throughput: 0: 17772.0. Samples: 463912960. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:45,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:07:46,033][1653645] Updated weights for policy 0, policy_version 905984 (0.0010) [2024-06-15 23:07:48,416][1653645] Updated weights for policy 0, policy_version 906048 (0.0011) [2024-06-15 23:07:50,958][1648982] Fps is (10 sec: 65536.3, 60 sec: 69906.7, 300 sec: 62648.0). Total num frames: 1855717376. Throughput: 0: 17840.4. Samples: 463961088. 
Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:50,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:07:51,015][1653645] Updated weights for policy 0, policy_version 906117 (0.0009) [2024-06-15 23:07:51,686][1653645] Updated weights for policy 0, policy_version 906166 (0.0013) [2024-06-15 23:07:53,546][1653645] Updated weights for policy 0, policy_version 906213 (0.0010) [2024-06-15 23:07:55,487][1653645] Updated weights for policy 0, policy_version 906272 (0.0011) [2024-06-15 23:07:55,920][1653645] Updated weights for policy 0, policy_version 906304 (0.0011) [2024-06-15 23:07:55,958][1648982] Fps is (10 sec: 68812.5, 60 sec: 72089.4, 300 sec: 63536.5). Total num frames: 1856110592. Throughput: 0: 17840.3. Samples: 464072704. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:07:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:07:57,932][1653645] Updated weights for policy 0, policy_version 906368 (0.0012) [2024-06-15 23:07:58,803][1653645] Updated weights for policy 0, policy_version 906432 (0.0012) [2024-06-15 23:08:00,279][1651596] Signal inference workers to stop experience collection... (47050 times) [2024-06-15 23:08:00,333][1653645] InferenceWorker_p0-w0: stopping experience collection (47050 times) [2024-06-15 23:08:00,510][1651596] Signal inference workers to resume experience collection... (47050 times) [2024-06-15 23:08:00,510][1653645] InferenceWorker_p0-w0: resuming experience collection (47050 times) [2024-06-15 23:08:00,958][1648982] Fps is (10 sec: 72088.7, 60 sec: 70997.3, 300 sec: 63314.4). Total num frames: 1856438272. Throughput: 0: 17760.7. Samples: 464174080. 
Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:08:01,155][1653645] Updated weights for policy 0, policy_version 906490 (0.0093) [2024-06-15 23:08:03,534][1653645] Updated weights for policy 0, policy_version 906528 (0.0011) [2024-06-15 23:08:03,948][1653645] Updated weights for policy 0, policy_version 906560 (0.0011) [2024-06-15 23:08:05,655][1653645] Updated weights for policy 0, policy_version 906624 (0.0013) [2024-06-15 23:08:05,958][1648982] Fps is (10 sec: 68814.5, 60 sec: 70997.3, 300 sec: 64092.0). Total num frames: 1856798720. Throughput: 0: 17635.5. Samples: 464227840. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:05,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:08:07,816][1653645] Updated weights for policy 0, policy_version 906689 (0.0012) [2024-06-15 23:08:08,505][1653645] Updated weights for policy 0, policy_version 906743 (0.0013) [2024-06-15 23:08:10,958][1648982] Fps is (10 sec: 62254.1, 60 sec: 70450.3, 300 sec: 63647.5). Total num frames: 1857060864. Throughput: 0: 17805.9. Samples: 464332288. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:10,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:08:11,390][1653645] Updated weights for policy 0, policy_version 906807 (0.0015) [2024-06-15 23:08:13,318][1653645] Updated weights for policy 0, policy_version 906880 (0.0011) [2024-06-15 23:08:14,088][1653645] Updated weights for policy 0, policy_version 906939 (0.0012) [2024-06-15 23:08:15,960][1648982] Fps is (10 sec: 68798.8, 60 sec: 70994.8, 300 sec: 64646.9). Total num frames: 1857486848. Throughput: 0: 17509.7. Samples: 464430592. 
Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:15,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:08:16,212][1653645] Updated weights for policy 0, policy_version 907003 (0.0011) [2024-06-15 23:08:18,911][1653645] Updated weights for policy 0, policy_version 907068 (0.0012) [2024-06-15 23:08:20,259][1653645] Updated weights for policy 0, policy_version 907104 (0.0015) [2024-06-15 23:08:20,958][1648982] Fps is (10 sec: 75373.6, 60 sec: 69905.0, 300 sec: 64425.2). Total num frames: 1857814528. Throughput: 0: 17567.3. Samples: 464486400. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:08:21,263][1653645] Updated weights for policy 0, policy_version 907168 (0.0074) [2024-06-15 23:08:23,358][1653645] Updated weights for policy 0, policy_version 907248 (0.0070) [2024-06-15 23:08:25,958][1648982] Fps is (10 sec: 58994.8, 60 sec: 68812.9, 300 sec: 64647.4). Total num frames: 1858076672. Throughput: 0: 17464.9. Samples: 464586752. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:08:26,224][1653645] Updated weights for policy 0, policy_version 907280 (0.0012) [2024-06-15 23:08:27,749][1653645] Updated weights for policy 0, policy_version 907347 (0.0011) [2024-06-15 23:08:28,632][1651596] Signal inference workers to stop experience collection... (47100 times) [2024-06-15 23:08:28,646][1653645] Updated weights for policy 0, policy_version 907409 (0.0011) [2024-06-15 23:08:28,704][1653645] InferenceWorker_p0-w0: stopping experience collection (47100 times) [2024-06-15 23:08:28,805][1651596] Signal inference workers to resume experience collection... 
(47100 times) [2024-06-15 23:08:28,806][1653645] InferenceWorker_p0-w0: resuming experience collection (47100 times) [2024-06-15 23:08:29,182][1653645] Updated weights for policy 0, policy_version 907449 (0.0012) [2024-06-15 23:08:30,275][1653645] Updated weights for policy 0, policy_version 907491 (0.0012) [2024-06-15 23:08:30,958][1648982] Fps is (10 sec: 78643.1, 60 sec: 70997.7, 300 sec: 65313.8). Total num frames: 1858600960. Throughput: 0: 17294.3. Samples: 464691200. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:30,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:08:33,929][1653645] Updated weights for policy 0, policy_version 907559 (0.0012) [2024-06-15 23:08:34,980][1653645] Updated weights for policy 0, policy_version 907591 (0.0011) [2024-06-15 23:08:35,958][1648982] Fps is (10 sec: 78642.8, 60 sec: 69905.0, 300 sec: 65313.8). Total num frames: 1858863104. Throughput: 0: 17681.0. Samples: 464756736. Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:08:36,014][1653645] Updated weights for policy 0, policy_version 907656 (0.0079) [2024-06-15 23:08:37,313][1653645] Updated weights for policy 0, policy_version 907719 (0.0011) [2024-06-15 23:08:38,089][1653645] Updated weights for policy 0, policy_version 907776 (0.0011) [2024-06-15 23:08:40,958][1648982] Fps is (10 sec: 58980.9, 60 sec: 68812.6, 300 sec: 65536.0). Total num frames: 1859190784. Throughput: 0: 17499.0. Samples: 464860160. 
Policy #0 lag: (min: 48.0, avg: 197.0, max: 304.0) [2024-06-15 23:08:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:08:42,279][1653645] Updated weights for policy 0, policy_version 907845 (0.0013) [2024-06-15 23:08:42,929][1653645] Updated weights for policy 0, policy_version 907890 (0.0010) [2024-06-15 23:08:43,837][1653645] Updated weights for policy 0, policy_version 907959 (0.0071) [2024-06-15 23:08:44,482][1653645] Updated weights for policy 0, policy_version 908000 (0.0034) [2024-06-15 23:08:45,958][1648982] Fps is (10 sec: 78643.5, 60 sec: 70451.5, 300 sec: 66202.5). Total num frames: 1859649536. Throughput: 0: 17544.6. Samples: 464963584. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:08:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:08:48,142][1653645] Updated weights for policy 0, policy_version 908048 (0.0016) [2024-06-15 23:08:49,670][1653645] Updated weights for policy 0, policy_version 908100 (0.0032) [2024-06-15 23:08:50,585][1653645] Updated weights for policy 0, policy_version 908176 (0.0011) [2024-06-15 23:08:50,961][1648982] Fps is (10 sec: 78618.9, 60 sec: 70993.4, 300 sec: 66423.9). Total num frames: 1859977216. Throughput: 0: 17679.8. Samples: 465023488. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:08:50,962][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:08:51,521][1653645] Updated weights for policy 0, policy_version 908240 (0.0011) [2024-06-15 23:08:55,958][1648982] Fps is (10 sec: 55703.8, 60 sec: 68266.6, 300 sec: 66313.5). Total num frames: 1860206592. Throughput: 0: 17681.3. Samples: 465127936. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:08:55,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 23:08:56,152][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000908320_1860239360.pth... 
[2024-06-15 23:08:56,268][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000900224_1843658752.pth [2024-06-15 23:08:56,464][1653645] Updated weights for policy 0, policy_version 908336 (0.0013) [2024-06-15 23:08:57,926][1651596] Signal inference workers to stop experience collection... (47150 times) [2024-06-15 23:08:57,946][1653645] InferenceWorker_p0-w0: stopping experience collection (47150 times) [2024-06-15 23:08:58,045][1651596] Signal inference workers to resume experience collection... (47150 times) [2024-06-15 23:08:58,045][1653645] InferenceWorker_p0-w0: resuming experience collection (47150 times) [2024-06-15 23:08:58,380][1653645] Updated weights for policy 0, policy_version 908403 (0.0012) [2024-06-15 23:08:59,416][1653645] Updated weights for policy 0, policy_version 908480 (0.0013) [2024-06-15 23:09:00,187][1653645] Updated weights for policy 0, policy_version 908542 (0.0009) [2024-06-15 23:09:00,958][1648982] Fps is (10 sec: 72112.5, 60 sec: 70997.3, 300 sec: 67091.1). Total num frames: 1860698112. Throughput: 0: 17590.8. Samples: 465222144. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:09:03,867][1653645] Updated weights for policy 0, policy_version 908603 (0.0027) [2024-06-15 23:09:05,567][1653645] Updated weights for policy 0, policy_version 908659 (0.0013) [2024-06-15 23:09:05,958][1648982] Fps is (10 sec: 78645.7, 60 sec: 69905.1, 300 sec: 66983.8). Total num frames: 1860993024. Throughput: 0: 17840.3. Samples: 465289216. 
Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:09:06,505][1653645] Updated weights for policy 0, policy_version 908720 (0.0104) [2024-06-15 23:09:07,446][1653645] Updated weights for policy 0, policy_version 908792 (0.0011) [2024-06-15 23:09:10,625][1653645] Updated weights for policy 0, policy_version 908864 (0.0011) [2024-06-15 23:09:10,958][1648982] Fps is (10 sec: 65534.8, 60 sec: 71544.2, 300 sec: 67202.1). Total num frames: 1861353472. Throughput: 0: 18067.8. Samples: 465399808. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:10,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:09:12,216][1653645] Updated weights for policy 0, policy_version 908913 (0.0010) [2024-06-15 23:09:13,197][1653645] Updated weights for policy 0, policy_version 908992 (0.0010) [2024-06-15 23:09:14,102][1653645] Updated weights for policy 0, policy_version 909056 (0.0012) [2024-06-15 23:09:15,958][1648982] Fps is (10 sec: 75366.0, 60 sec: 70999.7, 300 sec: 67646.5). Total num frames: 1861746688. Throughput: 0: 17965.5. Samples: 465499648. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:15,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 23:09:19,270][1653645] Updated weights for policy 0, policy_version 909140 (0.0025) [2024-06-15 23:09:20,143][1653645] Updated weights for policy 0, policy_version 909205 (0.0011) [2024-06-15 23:09:20,958][1648982] Fps is (10 sec: 81922.2, 60 sec: 72635.6, 300 sec: 68090.8). Total num frames: 1862172672. Throughput: 0: 17817.6. Samples: 465558528. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:09:20,967][1651596] Signal inference workers to stop experience collection... 
(47200 times) [2024-06-15 23:09:21,013][1653645] InferenceWorker_p0-w0: stopping experience collection (47200 times) [2024-06-15 23:09:21,078][1651596] Signal inference workers to resume experience collection... (47200 times) [2024-06-15 23:09:21,079][1653645] InferenceWorker_p0-w0: resuming experience collection (47200 times) [2024-06-15 23:09:21,080][1653645] Updated weights for policy 0, policy_version 909280 (0.0012) [2024-06-15 23:09:25,358][1653645] Updated weights for policy 0, policy_version 909329 (0.0011) [2024-06-15 23:09:25,958][1648982] Fps is (10 sec: 65535.7, 60 sec: 72089.5, 300 sec: 67979.7). Total num frames: 1862402048. Throughput: 0: 17840.4. Samples: 465662976. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:09:26,291][1653645] Updated weights for policy 0, policy_version 909392 (0.0011) [2024-06-15 23:09:26,930][1653645] Updated weights for policy 0, policy_version 909440 (0.0012) [2024-06-15 23:09:28,277][1653645] Updated weights for policy 0, policy_version 909504 (0.0012) [2024-06-15 23:09:29,160][1653645] Updated weights for policy 0, policy_version 909563 (0.0013) [2024-06-15 23:09:30,958][1648982] Fps is (10 sec: 62258.8, 60 sec: 69904.9, 300 sec: 68424.0). Total num frames: 1862795264. Throughput: 0: 17863.1. Samples: 465767424. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:30,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:09:32,960][1653645] Updated weights for policy 0, policy_version 909623 (0.0011) [2024-06-15 23:09:33,745][1653645] Updated weights for policy 0, policy_version 909667 (0.0010) [2024-06-15 23:09:34,795][1653645] Updated weights for policy 0, policy_version 909716 (0.0013) [2024-06-15 23:09:35,827][1653645] Updated weights for policy 0, policy_version 909780 (0.0012) [2024-06-15 23:09:35,958][1648982] Fps is (10 sec: 85197.7, 60 sec: 73181.9, 300 sec: 69090.5). Total num frames: 1863254016. 
Throughput: 0: 17864.4. Samples: 465827328. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:35,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:09:40,286][1653645] Updated weights for policy 0, policy_version 909856 (0.0010) [2024-06-15 23:09:40,957][1648982] Fps is (10 sec: 68814.1, 60 sec: 71543.8, 300 sec: 68979.4). Total num frames: 1863483392. Throughput: 0: 17942.9. Samples: 465935360. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:40,963][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:09:41,169][1653645] Updated weights for policy 0, policy_version 909920 (0.0012) [2024-06-15 23:09:42,320][1653645] Updated weights for policy 0, policy_version 909969 (0.0011) [2024-06-15 23:09:43,243][1653645] Updated weights for policy 0, policy_version 910037 (0.0012) [2024-06-15 23:09:45,958][1648982] Fps is (10 sec: 58982.5, 60 sec: 69905.1, 300 sec: 69312.6). Total num frames: 1863843840. Throughput: 0: 17988.3. Samples: 466031616. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:45,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 23:09:47,809][1653645] Updated weights for policy 0, policy_version 910102 (0.0012) [2024-06-15 23:09:48,836][1653645] Updated weights for policy 0, policy_version 910176 (0.0081) [2024-06-15 23:09:50,076][1651596] Signal inference workers to stop experience collection... (47250 times) [2024-06-15 23:09:50,103][1653645] InferenceWorker_p0-w0: stopping experience collection (47250 times) [2024-06-15 23:09:50,217][1651596] Signal inference workers to resume experience collection... (47250 times) [2024-06-15 23:09:50,218][1653645] InferenceWorker_p0-w0: resuming experience collection (47250 times) [2024-06-15 23:09:50,221][1653645] Updated weights for policy 0, policy_version 910240 (0.0013) [2024-06-15 23:09:50,958][1648982] Fps is (10 sec: 78642.5, 60 sec: 71547.4, 300 sec: 69868.0). Total num frames: 1864269824. Throughput: 0: 17749.3. 
Samples: 466087936. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:09:51,356][1653645] Updated weights for policy 0, policy_version 910310 (0.0163) [2024-06-15 23:09:55,061][1653645] Updated weights for policy 0, policy_version 910340 (0.0011) [2024-06-15 23:09:55,903][1653645] Updated weights for policy 0, policy_version 910403 (0.0010) [2024-06-15 23:09:55,958][1648982] Fps is (10 sec: 65535.5, 60 sec: 71543.8, 300 sec: 69534.8). Total num frames: 1864499200. Throughput: 0: 17647.0. Samples: 466193920. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:09:55,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:09:56,731][1653645] Updated weights for policy 0, policy_version 910464 (0.0012) [2024-06-15 23:09:57,880][1653645] Updated weights for policy 0, policy_version 910507 (0.0011) [2024-06-15 23:09:58,915][1653645] Updated weights for policy 0, policy_version 910576 (0.0011) [2024-06-15 23:10:00,958][1648982] Fps is (10 sec: 62258.6, 60 sec: 69905.1, 300 sec: 69979.1). Total num frames: 1864892416. Throughput: 0: 17578.6. Samples: 466290688. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:10:00,959][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:10:02,574][1653645] Updated weights for policy 0, policy_version 910612 (0.0013) [2024-06-15 23:10:03,368][1653645] Updated weights for policy 0, policy_version 910672 (0.0011) [2024-06-15 23:10:04,668][1653645] Updated weights for policy 0, policy_version 910727 (0.0011) [2024-06-15 23:10:05,422][1653645] Updated weights for policy 0, policy_version 910786 (0.0009) [2024-06-15 23:10:05,958][1648982] Fps is (10 sec: 85197.4, 60 sec: 72635.8, 300 sec: 70312.4). Total num frames: 1865351168. Throughput: 0: 17612.8. Samples: 466351104. 
Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:10:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:10:06,136][1653645] Updated weights for policy 0, policy_version 910838 (0.0013) [2024-06-15 23:10:10,029][1653645] Updated weights for policy 0, policy_version 910896 (0.0024) [2024-06-15 23:10:10,958][1648982] Fps is (10 sec: 68813.2, 60 sec: 70451.5, 300 sec: 70312.3). Total num frames: 1865580544. Throughput: 0: 17692.5. Samples: 466459136. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:10:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:10:11,085][1653645] Updated weights for policy 0, policy_version 910944 (0.0011) [2024-06-15 23:10:12,234][1653645] Updated weights for policy 0, policy_version 910997 (0.0018) [2024-06-15 23:10:13,165][1653645] Updated weights for policy 0, policy_version 911072 (0.0012) [2024-06-15 23:10:15,958][1648982] Fps is (10 sec: 58982.3, 60 sec: 69905.1, 300 sec: 70201.3). Total num frames: 1865940992. Throughput: 0: 17681.1. Samples: 466563072. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:10:15,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:10:16,873][1653645] Updated weights for policy 0, policy_version 911120 (0.0012) [2024-06-15 23:10:18,230][1651596] Signal inference workers to stop experience collection... (47300 times) [2024-06-15 23:10:18,274][1653645] Updated weights for policy 0, policy_version 911189 (0.0015) [2024-06-15 23:10:18,288][1653645] InferenceWorker_p0-w0: stopping experience collection (47300 times) [2024-06-15 23:10:18,382][1651596] Signal inference workers to resume experience collection... 
(47300 times) [2024-06-15 23:10:18,383][1653645] InferenceWorker_p0-w0: resuming experience collection (47300 times) [2024-06-15 23:10:18,847][1653645] Updated weights for policy 0, policy_version 911232 (0.0107) [2024-06-15 23:10:20,340][1653645] Updated weights for policy 0, policy_version 911312 (0.0064) [2024-06-15 23:10:20,958][1648982] Fps is (10 sec: 88474.4, 60 sec: 71543.6, 300 sec: 71534.3). Total num frames: 1866465280. Throughput: 0: 17555.9. Samples: 466617344. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:10:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:10:24,695][1653645] Updated weights for policy 0, policy_version 911393 (0.0011) [2024-06-15 23:10:25,452][1653645] Updated weights for policy 0, policy_version 911427 (0.0010) [2024-06-15 23:10:25,958][1648982] Fps is (10 sec: 72089.5, 60 sec: 70997.4, 300 sec: 70867.7). Total num frames: 1866661888. Throughput: 0: 17544.5. Samples: 466724864. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:10:25,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:10:26,818][1653645] Updated weights for policy 0, policy_version 911489 (0.0011) [2024-06-15 23:10:27,643][1653645] Updated weights for policy 0, policy_version 911552 (0.0010) [2024-06-15 23:10:28,423][1653645] Updated weights for policy 0, policy_version 911610 (0.0013) [2024-06-15 23:10:30,958][1648982] Fps is (10 sec: 52428.0, 60 sec: 69905.1, 300 sec: 70645.5). Total num frames: 1866989568. Throughput: 0: 17817.5. Samples: 466833408. 
Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:10:30,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:10:31,724][1653645] Updated weights for policy 0, policy_version 911664 (0.0011) [2024-06-15 23:10:33,165][1653645] Updated weights for policy 0, policy_version 911704 (0.0011) [2024-06-15 23:10:33,666][1653645] Updated weights for policy 0, policy_version 911744 (0.0018) [2024-06-15 23:10:35,199][1653645] Updated weights for policy 0, policy_version 911824 (0.0142) [2024-06-15 23:10:35,729][1653645] Updated weights for policy 0, policy_version 911865 (0.0012) [2024-06-15 23:10:35,958][1648982] Fps is (10 sec: 85195.7, 60 sec: 70997.2, 300 sec: 71534.2). Total num frames: 1867513856. Throughput: 0: 17737.9. Samples: 466886144. Policy #0 lag: (min: 63.0, avg: 210.0, max: 303.0) [2024-06-15 23:10:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:10:39,639][1653645] Updated weights for policy 0, policy_version 911924 (0.0011) [2024-06-15 23:10:40,943][1653645] Updated weights for policy 0, policy_version 911968 (0.0012) [2024-06-15 23:10:40,958][1648982] Fps is (10 sec: 72090.3, 60 sec: 70451.1, 300 sec: 70867.7). Total num frames: 1867710464. Throughput: 0: 17726.6. Samples: 466991616. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:10:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:10:41,823][1653645] Updated weights for policy 0, policy_version 912032 (0.0071) [2024-06-15 23:10:42,902][1653645] Updated weights for policy 0, policy_version 912098 (0.0060) [2024-06-15 23:10:45,958][1648982] Fps is (10 sec: 52429.7, 60 sec: 69905.1, 300 sec: 70645.6). Total num frames: 1868038144. Throughput: 0: 17885.9. Samples: 467095552. 
Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:10:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:10:46,896][1653645] Updated weights for policy 0, policy_version 912144 (0.0011) [2024-06-15 23:10:46,964][1651596] Signal inference workers to stop experience collection... (47350 times) [2024-06-15 23:10:47,021][1653645] InferenceWorker_p0-w0: stopping experience collection (47350 times) [2024-06-15 23:10:47,109][1651596] Signal inference workers to resume experience collection... (47350 times) [2024-06-15 23:10:47,109][1653645] InferenceWorker_p0-w0: resuming experience collection (47350 times) [2024-06-15 23:10:48,253][1653645] Updated weights for policy 0, policy_version 912208 (0.0054) [2024-06-15 23:10:49,217][1653645] Updated weights for policy 0, policy_version 912272 (0.0011) [2024-06-15 23:10:49,953][1653645] Updated weights for policy 0, policy_version 912321 (0.0009) [2024-06-15 23:10:50,523][1653645] Updated weights for policy 0, policy_version 912370 (0.0013) [2024-06-15 23:10:50,958][1648982] Fps is (10 sec: 85194.2, 60 sec: 71543.1, 300 sec: 71534.3). Total num frames: 1868562432. Throughput: 0: 17726.4. Samples: 467148800. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:10:50,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:10:53,557][1653645] Updated weights for policy 0, policy_version 912403 (0.0010) [2024-06-15 23:10:55,589][1653645] Updated weights for policy 0, policy_version 912494 (0.0130) [2024-06-15 23:10:55,958][1648982] Fps is (10 sec: 78640.0, 60 sec: 72089.2, 300 sec: 71090.1). Total num frames: 1868824576. Throughput: 0: 17885.7. Samples: 467264000. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:10:55,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:10:56,275][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000912544_1868890112.pth... 
[2024-06-15 23:10:56,380][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000904192_1851785216.pth [2024-06-15 23:10:56,383][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000912544_1868890112.pth [2024-06-15 23:10:56,708][1653645] Updated weights for policy 0, policy_version 912562 (0.0013) [2024-06-15 23:10:57,691][1653645] Updated weights for policy 0, policy_version 912630 (0.0011) [2024-06-15 23:11:00,958][1648982] Fps is (10 sec: 52429.3, 60 sec: 69904.9, 300 sec: 70645.6). Total num frames: 1869086720. Throughput: 0: 17931.3. Samples: 467369984. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:00,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:11:01,528][1653645] Updated weights for policy 0, policy_version 912688 (0.0013) [2024-06-15 23:11:02,410][1653645] Updated weights for policy 0, policy_version 912738 (0.0012) [2024-06-15 23:11:03,337][1653645] Updated weights for policy 0, policy_version 912802 (0.0010) [2024-06-15 23:11:04,197][1653645] Updated weights for policy 0, policy_version 912864 (0.0011) [2024-06-15 23:11:05,958][1648982] Fps is (10 sec: 78646.4, 60 sec: 70997.4, 300 sec: 71534.2). Total num frames: 1869611008. Throughput: 0: 17874.5. Samples: 467421696. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:05,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:11:09,192][1653645] Updated weights for policy 0, policy_version 912932 (0.0016) [2024-06-15 23:11:09,949][1653645] Updated weights for policy 0, policy_version 912977 (0.0009) [2024-06-15 23:11:10,913][1653645] Updated weights for policy 0, policy_version 913040 (0.0012) [2024-06-15 23:11:10,958][1648982] Fps is (10 sec: 81921.9, 60 sec: 72089.7, 300 sec: 71201.0). Total num frames: 1869905920. Throughput: 0: 17738.0. Samples: 467523072. 
Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:10,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 23:11:11,167][1651596] Signal inference workers to stop experience collection... (47400 times) [2024-06-15 23:11:11,213][1653645] InferenceWorker_p0-w0: stopping experience collection (47400 times) [2024-06-15 23:11:11,303][1651596] Signal inference workers to resume experience collection... (47400 times) [2024-06-15 23:11:11,304][1653645] InferenceWorker_p0-w0: resuming experience collection (47400 times) [2024-06-15 23:11:11,956][1653645] Updated weights for policy 0, policy_version 913120 (0.0009) [2024-06-15 23:11:15,958][1648982] Fps is (10 sec: 52428.5, 60 sec: 69905.0, 300 sec: 70645.6). Total num frames: 1870135296. Throughput: 0: 17897.3. Samples: 467638784. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:15,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 23:11:16,387][1653645] Updated weights for policy 0, policy_version 913184 (0.0013) [2024-06-15 23:11:17,589][1653645] Updated weights for policy 0, policy_version 913264 (0.0011) [2024-06-15 23:11:18,672][1653645] Updated weights for policy 0, policy_version 913318 (0.0011) [2024-06-15 23:11:19,160][1653645] Updated weights for policy 0, policy_version 913345 (0.0011) [2024-06-15 23:11:20,958][1648982] Fps is (10 sec: 75366.2, 60 sec: 69905.0, 300 sec: 71534.2). Total num frames: 1870659584. Throughput: 0: 17578.7. Samples: 467677184. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:20,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:11:23,604][1653645] Updated weights for policy 0, policy_version 913428 (0.0012) [2024-06-15 23:11:24,449][1653645] Updated weights for policy 0, policy_version 913494 (0.0009) [2024-06-15 23:11:25,592][1653645] Updated weights for policy 0, policy_version 913572 (0.0012) [2024-06-15 23:11:25,958][1648982] Fps is (10 sec: 91749.8, 60 sec: 73181.8, 300 sec: 71534.2). 
Total num frames: 1871052800. Throughput: 0: 17965.5. Samples: 467800064. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:25,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:11:26,529][1653645] Updated weights for policy 0, policy_version 913636 (0.0013) [2024-06-15 23:11:30,688][1653645] Updated weights for policy 0, policy_version 913680 (0.0011) [2024-06-15 23:11:30,958][1648982] Fps is (10 sec: 58982.5, 60 sec: 70997.5, 300 sec: 70867.8). Total num frames: 1871249408. Throughput: 0: 17988.3. Samples: 467905024. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:11:31,463][1653645] Updated weights for policy 0, policy_version 913730 (0.0021) [2024-06-15 23:11:32,208][1653645] Updated weights for policy 0, policy_version 913788 (0.0010) [2024-06-15 23:11:32,880][1653645] Updated weights for policy 0, policy_version 913840 (0.0011) [2024-06-15 23:11:33,844][1653645] Updated weights for policy 0, policy_version 913911 (0.0022) [2024-06-15 23:11:35,958][1648982] Fps is (10 sec: 65536.2, 60 sec: 69905.1, 300 sec: 71534.2). Total num frames: 1871708160. Throughput: 0: 18056.6. Samples: 467961344. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:35,969][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:11:38,020][1653645] Updated weights for policy 0, policy_version 913952 (0.0012) [2024-06-15 23:11:38,382][1651596] Signal inference workers to stop experience collection... (47450 times) [2024-06-15 23:11:38,414][1653645] InferenceWorker_p0-w0: stopping experience collection (47450 times) [2024-06-15 23:11:38,525][1651596] Signal inference workers to resume experience collection... 
(47450 times) [2024-06-15 23:11:38,526][1653645] InferenceWorker_p0-w0: resuming experience collection (47450 times) [2024-06-15 23:11:39,087][1653645] Updated weights for policy 0, policy_version 914019 (0.0019) [2024-06-15 23:11:40,119][1653645] Updated weights for policy 0, policy_version 914085 (0.0027) [2024-06-15 23:11:40,958][1648982] Fps is (10 sec: 88473.2, 60 sec: 73728.0, 300 sec: 71645.3). Total num frames: 1872134144. Throughput: 0: 17795.0. Samples: 468064768. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:40,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:11:41,025][1653645] Updated weights for policy 0, policy_version 914131 (0.0010) [2024-06-15 23:11:41,557][1653645] Updated weights for policy 0, policy_version 914176 (0.0009) [2024-06-15 23:11:45,567][1653645] Updated weights for policy 0, policy_version 914227 (0.0010) [2024-06-15 23:11:45,958][1648982] Fps is (10 sec: 68812.9, 60 sec: 72635.6, 300 sec: 70757.0). Total num frames: 1872396288. Throughput: 0: 17920.1. Samples: 468176384. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:11:46,397][1653645] Updated weights for policy 0, policy_version 914288 (0.0011) [2024-06-15 23:11:47,224][1653645] Updated weights for policy 0, policy_version 914328 (0.0010) [2024-06-15 23:11:47,718][1653645] Updated weights for policy 0, policy_version 914368 (0.0011) [2024-06-15 23:11:49,062][1653645] Updated weights for policy 0, policy_version 914426 (0.0011) [2024-06-15 23:11:50,958][1648982] Fps is (10 sec: 62258.4, 60 sec: 69905.2, 300 sec: 71089.9). Total num frames: 1872756736. Throughput: 0: 17840.3. Samples: 468224512. 
Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:50,960][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:11:52,489][1653645] Updated weights for policy 0, policy_version 914483 (0.0011) [2024-06-15 23:11:54,002][1653645] Updated weights for policy 0, policy_version 914561 (0.0012) [2024-06-15 23:11:54,757][1653645] Updated weights for policy 0, policy_version 914619 (0.0010) [2024-06-15 23:11:55,958][1648982] Fps is (10 sec: 81920.2, 60 sec: 73182.3, 300 sec: 71312.1). Total num frames: 1873215488. Throughput: 0: 18022.4. Samples: 468334080. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:11:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:11:56,174][1653645] Updated weights for policy 0, policy_version 914681 (0.0011) [2024-06-15 23:12:00,542][1653645] Updated weights for policy 0, policy_version 914752 (0.0071) [2024-06-15 23:12:00,958][1648982] Fps is (10 sec: 68812.4, 60 sec: 72635.7, 300 sec: 70867.7). Total num frames: 1873444864. Throughput: 0: 17851.7. Samples: 468442112. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:12:00,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:12:01,284][1653645] Updated weights for policy 0, policy_version 914801 (0.0012) [2024-06-15 23:12:02,314][1653645] Updated weights for policy 0, policy_version 914873 (0.0014) [2024-06-15 23:12:03,643][1651596] Signal inference workers to stop experience collection... (47500 times) [2024-06-15 23:12:03,655][1653645] Updated weights for policy 0, policy_version 914913 (0.0012) [2024-06-15 23:12:03,668][1653645] InferenceWorker_p0-w0: stopping experience collection (47500 times) [2024-06-15 23:12:03,774][1651596] Signal inference workers to resume experience collection... (47500 times) [2024-06-15 23:12:03,774][1653645] InferenceWorker_p0-w0: resuming experience collection (47500 times) [2024-06-15 23:12:05,958][1648982] Fps is (10 sec: 58982.6, 60 sec: 69905.0, 300 sec: 71089.9). 
Total num frames: 1873805312. Throughput: 0: 18079.3. Samples: 468490752. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:12:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:12:07,736][1653645] Updated weights for policy 0, policy_version 914992 (0.0013) [2024-06-15 23:12:08,733][1653645] Updated weights for policy 0, policy_version 915066 (0.0012) [2024-06-15 23:12:10,147][1653645] Updated weights for policy 0, policy_version 915134 (0.0025) [2024-06-15 23:12:10,958][1648982] Fps is (10 sec: 75367.9, 60 sec: 71543.4, 300 sec: 71089.9). Total num frames: 1874198528. Throughput: 0: 17715.2. Samples: 468597248. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:12:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:12:11,739][1653645] Updated weights for policy 0, policy_version 915191 (0.0011) [2024-06-15 23:12:14,442][1653645] Updated weights for policy 0, policy_version 915236 (0.0010) [2024-06-15 23:12:15,232][1653645] Updated weights for policy 0, policy_version 915296 (0.0025) [2024-06-15 23:12:15,958][1648982] Fps is (10 sec: 78643.0, 60 sec: 74274.1, 300 sec: 71089.9). Total num frames: 1874591744. Throughput: 0: 17874.5. Samples: 468709376. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:12:15,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:12:16,759][1653645] Updated weights for policy 0, policy_version 915344 (0.0011) [2024-06-15 23:12:17,365][1653645] Updated weights for policy 0, policy_version 915387 (0.0010) [2024-06-15 23:12:18,616][1653645] Updated weights for policy 0, policy_version 915428 (0.0010) [2024-06-15 23:12:20,957][1648982] Fps is (10 sec: 65536.5, 60 sec: 69905.1, 300 sec: 70867.8). Total num frames: 1874853888. Throughput: 0: 17840.4. Samples: 468764160. 
Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:12:20,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:12:21,335][1653645] Updated weights for policy 0, policy_version 915472 (0.0012) [2024-06-15 23:12:22,268][1653645] Updated weights for policy 0, policy_version 915536 (0.0011) [2024-06-15 23:12:22,836][1653645] Updated weights for policy 0, policy_version 915580 (0.0015) [2024-06-15 23:12:24,808][1653645] Updated weights for policy 0, policy_version 915641 (0.0012) [2024-06-15 23:12:25,610][1653645] Updated weights for policy 0, policy_version 915683 (0.0009) [2024-06-15 23:12:25,958][1648982] Fps is (10 sec: 78643.0, 60 sec: 72089.7, 300 sec: 71312.1). Total num frames: 1875378176. Throughput: 0: 18022.4. Samples: 468875776. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:12:25,960][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:12:28,328][1653645] Updated weights for policy 0, policy_version 915728 (0.0050) [2024-06-15 23:12:29,333][1653645] Updated weights for policy 0, policy_version 915792 (0.0076) [2024-06-15 23:12:29,985][1653645] Updated weights for policy 0, policy_version 915832 (0.0012) [2024-06-15 23:12:30,957][1648982] Fps is (10 sec: 78643.4, 60 sec: 73181.9, 300 sec: 71089.9). Total num frames: 1875640320. Throughput: 0: 17726.6. Samples: 468974080. Policy #0 lag: (min: 15.0, avg: 92.9, max: 271.0) [2024-06-15 23:12:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:12:31,915][1651596] Signal inference workers to stop experience collection... (47550 times) [2024-06-15 23:12:31,964][1653645] InferenceWorker_p0-w0: stopping experience collection (47550 times) [2024-06-15 23:12:32,111][1651596] Signal inference workers to resume experience collection... 
(47550 times) [2024-06-15 23:12:32,112][1653645] InferenceWorker_p0-w0: resuming experience collection (47550 times) [2024-06-15 23:12:32,114][1653645] Updated weights for policy 0, policy_version 915888 (0.0011) [2024-06-15 23:12:33,273][1653645] Updated weights for policy 0, policy_version 915960 (0.0012) [2024-06-15 23:12:35,557][1653645] Updated weights for policy 0, policy_version 916000 (0.0011) [2024-06-15 23:12:35,982][1648982] Fps is (10 sec: 62106.4, 60 sec: 71514.2, 300 sec: 70972.9). Total num frames: 1876000768. Throughput: 0: 17898.9. Samples: 469030400. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:12:35,983][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:12:36,357][1653645] Updated weights for policy 0, policy_version 916049 (0.0011) [2024-06-15 23:12:38,425][1653645] Updated weights for policy 0, policy_version 916099 (0.0011) [2024-06-15 23:12:39,321][1653645] Updated weights for policy 0, policy_version 916160 (0.0010) [2024-06-15 23:12:40,958][1648982] Fps is (10 sec: 78642.7, 60 sec: 71543.5, 300 sec: 71201.0). Total num frames: 1876426752. Throughput: 0: 17920.0. Samples: 469140480. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:12:40,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:12:42,317][1653645] Updated weights for policy 0, policy_version 916240 (0.0010) [2024-06-15 23:12:43,550][1653645] Updated weights for policy 0, policy_version 916309 (0.0009) [2024-06-15 23:12:45,533][1653645] Updated weights for policy 0, policy_version 916368 (0.0012) [2024-06-15 23:12:45,958][1648982] Fps is (10 sec: 75552.0, 60 sec: 72635.7, 300 sec: 71312.0). Total num frames: 1876754432. Throughput: 0: 18295.5. Samples: 469265408. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:12:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:12:46,855][1653645] Updated weights for policy 0, policy_version 916437 (0.0013) [2024-06-15 23:12:49,548][1653645] Updated weights for policy 0, policy_version 916496 (0.0013) [2024-06-15 23:12:50,853][1653645] Updated weights for policy 0, policy_version 916560 (0.0010) [2024-06-15 23:12:50,958][1648982] Fps is (10 sec: 68812.0, 60 sec: 72635.8, 300 sec: 71201.0). Total num frames: 1877114880. Throughput: 0: 18238.5. Samples: 469311488. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:12:50,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:12:51,518][1653645] Updated weights for policy 0, policy_version 916607 (0.0009) [2024-06-15 23:12:54,011][1653645] Updated weights for policy 0, policy_version 916657 (0.0009) [2024-06-15 23:12:54,774][1653645] Updated weights for policy 0, policy_version 916720 (0.0009) [2024-06-15 23:12:55,958][1648982] Fps is (10 sec: 75366.4, 60 sec: 71543.4, 300 sec: 71423.1). Total num frames: 1877508096. Throughput: 0: 18329.6. Samples: 469422080. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:12:55,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:12:56,138][1653645] Updated weights for policy 0, policy_version 916774 (0.0010) [2024-06-15 23:12:56,240][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000916784_1877573632.pth... [2024-06-15 23:12:56,284][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000908320_1860239360.pth [2024-06-15 23:12:57,348][1653645] Updated weights for policy 0, policy_version 916816 (0.0012) [2024-06-15 23:13:00,958][1648982] Fps is (10 sec: 62258.2, 60 sec: 71543.4, 300 sec: 70978.8). Total num frames: 1877737472. Throughput: 0: 18147.4. Samples: 469526016. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:00,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:13:01,134][1653645] Updated weights for policy 0, policy_version 916881 (0.0012) [2024-06-15 23:13:01,355][1651596] Signal inference workers to stop experience collection... (47600 times) [2024-06-15 23:13:01,389][1653645] InferenceWorker_p0-w0: stopping experience collection (47600 times) [2024-06-15 23:13:01,516][1651596] Signal inference workers to resume experience collection... (47600 times) [2024-06-15 23:13:01,517][1653645] InferenceWorker_p0-w0: resuming experience collection (47600 times) [2024-06-15 23:13:01,998][1653645] Updated weights for policy 0, policy_version 916944 (0.0010) [2024-06-15 23:13:03,655][1653645] Updated weights for policy 0, policy_version 917008 (0.0073) [2024-06-15 23:13:04,331][1653645] Updated weights for policy 0, policy_version 917057 (0.0011) [2024-06-15 23:13:05,958][1648982] Fps is (10 sec: 75366.8, 60 sec: 74274.1, 300 sec: 71867.7). Total num frames: 1878261760. Throughput: 0: 18204.4. Samples: 469583360. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:05,960][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 23:13:07,509][1653645] Updated weights for policy 0, policy_version 917122 (0.0012) [2024-06-15 23:13:08,671][1653645] Updated weights for policy 0, policy_version 917207 (0.0011) [2024-06-15 23:13:09,907][1653645] Updated weights for policy 0, policy_version 917252 (0.0009) [2024-06-15 23:13:10,519][1653645] Updated weights for policy 0, policy_version 917308 (0.0011) [2024-06-15 23:13:10,958][1648982] Fps is (10 sec: 91752.5, 60 sec: 74274.1, 300 sec: 71756.9). Total num frames: 1878654976. Throughput: 0: 18250.0. Samples: 469697024. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:13:11,460][1653645] Updated weights for policy 0, policy_version 917360 (0.0010) [2024-06-15 23:13:14,364][1653645] Updated weights for policy 0, policy_version 917410 (0.0011) [2024-06-15 23:13:15,046][1653645] Updated weights for policy 0, policy_version 917463 (0.0020) [2024-06-15 23:13:15,958][1648982] Fps is (10 sec: 78642.1, 60 sec: 74273.9, 300 sec: 71978.5). Total num frames: 1879048192. Throughput: 0: 18909.8. Samples: 469825024. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:15,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:13:16,348][1653645] Updated weights for policy 0, policy_version 917507 (0.0016) [2024-06-15 23:13:17,072][1653645] Updated weights for policy 0, policy_version 917566 (0.0012) [2024-06-15 23:13:20,292][1653645] Updated weights for policy 0, policy_version 917633 (0.0013) [2024-06-15 23:13:20,958][1648982] Fps is (10 sec: 75366.9, 60 sec: 75912.5, 300 sec: 72311.8). Total num frames: 1879408640. Throughput: 0: 18863.3. Samples: 469878784. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:13:21,463][1653645] Updated weights for policy 0, policy_version 917712 (0.0015) [2024-06-15 23:13:22,026][1653645] Updated weights for policy 0, policy_version 917758 (0.0011) [2024-06-15 23:13:23,321][1653645] Updated weights for policy 0, policy_version 917821 (0.0074) [2024-06-15 23:13:24,409][1653645] Updated weights for policy 0, policy_version 917888 (0.0011) [2024-06-15 23:13:25,958][1648982] Fps is (10 sec: 78643.6, 60 sec: 74274.0, 300 sec: 71978.5). Total num frames: 1879834624. Throughput: 0: 19126.0. Samples: 470001152. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:25,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:13:26,992][1651596] Signal inference workers to stop experience collection... (47650 times) [2024-06-15 23:13:27,019][1653645] InferenceWorker_p0-w0: stopping experience collection (47650 times) [2024-06-15 23:13:27,121][1651596] Signal inference workers to resume experience collection... (47650 times) [2024-06-15 23:13:27,122][1653645] InferenceWorker_p0-w0: resuming experience collection (47650 times) [2024-06-15 23:13:27,639][1653645] Updated weights for policy 0, policy_version 917947 (0.0013) [2024-06-15 23:13:28,553][1653645] Updated weights for policy 0, policy_version 918011 (0.0012) [2024-06-15 23:13:30,185][1653645] Updated weights for policy 0, policy_version 918085 (0.0013) [2024-06-15 23:13:30,854][1653645] Updated weights for policy 0, policy_version 918141 (0.0013) [2024-06-15 23:13:30,958][1648982] Fps is (10 sec: 95026.7, 60 sec: 78643.1, 300 sec: 72867.1). Total num frames: 1880358912. Throughput: 0: 19046.4. Samples: 470122496. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:13:34,146][1653645] Updated weights for policy 0, policy_version 918203 (0.0010) [2024-06-15 23:13:35,547][1653645] Updated weights for policy 0, policy_version 918270 (0.0013) [2024-06-15 23:13:35,958][1648982] Fps is (10 sec: 78643.7, 60 sec: 77036.4, 300 sec: 72645.0). Total num frames: 1880621056. Throughput: 0: 19512.9. Samples: 470189568. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:35,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:13:36,668][1653645] Updated weights for policy 0, policy_version 918336 (0.0015) [2024-06-15 23:13:37,540][1653645] Updated weights for policy 0, policy_version 918398 (0.0086) [2024-06-15 23:13:40,958][1648982] Fps is (10 sec: 58981.6, 60 sec: 75366.2, 300 sec: 72200.6). 
Total num frames: 1880948736. Throughput: 0: 19547.0. Samples: 470301696. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:40,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:13:41,224][1653645] Updated weights for policy 0, policy_version 918456 (0.0012) [2024-06-15 23:13:41,741][1653645] Updated weights for policy 0, policy_version 918484 (0.0011) [2024-06-15 23:13:42,205][1653645] Updated weights for policy 0, policy_version 918525 (0.0011) [2024-06-15 23:13:43,267][1653645] Updated weights for policy 0, policy_version 918592 (0.0011) [2024-06-15 23:13:44,079][1653645] Updated weights for policy 0, policy_version 918650 (0.0009) [2024-06-15 23:13:45,958][1648982] Fps is (10 sec: 78643.8, 60 sec: 77551.1, 300 sec: 72645.8). Total num frames: 1881407488. Throughput: 0: 19854.4. Samples: 470419456. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:13:47,817][1653645] Updated weights for policy 0, policy_version 918704 (0.0010) [2024-06-15 23:13:48,580][1653645] Updated weights for policy 0, policy_version 918753 (0.0012) [2024-06-15 23:13:49,775][1653645] Updated weights for policy 0, policy_version 918806 (0.0013) [2024-06-15 23:13:50,596][1651596] Signal inference workers to stop experience collection... (47700 times) [2024-06-15 23:13:50,646][1653645] InferenceWorker_p0-w0: stopping experience collection (47700 times) [2024-06-15 23:13:50,648][1653645] Updated weights for policy 0, policy_version 918867 (0.0068) [2024-06-15 23:13:50,761][1651596] Signal inference workers to resume experience collection... (47700 times) [2024-06-15 23:13:50,762][1653645] InferenceWorker_p0-w0: resuming experience collection (47700 times) [2024-06-15 23:13:50,958][1648982] Fps is (10 sec: 95028.1, 60 sec: 79735.5, 300 sec: 73533.7). Total num frames: 1881899008. Throughput: 0: 19922.5. Samples: 470479872. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:50,959][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 23:13:53,729][1653645] Updated weights for policy 0, policy_version 918914 (0.0010) [2024-06-15 23:13:54,437][1653645] Updated weights for policy 0, policy_version 918975 (0.0010) [2024-06-15 23:13:55,705][1653645] Updated weights for policy 0, policy_version 919034 (0.0149) [2024-06-15 23:13:55,958][1648982] Fps is (10 sec: 78642.9, 60 sec: 78097.1, 300 sec: 72867.2). Total num frames: 1882193920. Throughput: 0: 19968.0. Samples: 470595584. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:13:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:13:56,871][1653645] Updated weights for policy 0, policy_version 919088 (0.0011) [2024-06-15 23:13:57,786][1653645] Updated weights for policy 0, policy_version 919152 (0.0012) [2024-06-15 23:14:00,450][1653645] Updated weights for policy 0, policy_version 919184 (0.0010) [2024-06-15 23:14:00,958][1648982] Fps is (10 sec: 65535.2, 60 sec: 80281.7, 300 sec: 73089.3). Total num frames: 1882554368. Throughput: 0: 19842.8. Samples: 470717952. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:14:00,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:14:01,058][1653645] Updated weights for policy 0, policy_version 919227 (0.0015) [2024-06-15 23:14:02,136][1653645] Updated weights for policy 0, policy_version 919291 (0.0010) [2024-06-15 23:14:03,522][1653645] Updated weights for policy 0, policy_version 919345 (0.0011) [2024-06-15 23:14:04,184][1653645] Updated weights for policy 0, policy_version 919395 (0.0011) [2024-06-15 23:14:05,957][1648982] Fps is (10 sec: 78644.1, 60 sec: 78643.3, 300 sec: 73311.6). Total num frames: 1882980352. Throughput: 0: 19865.6. Samples: 470772736. 
Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:14:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:14:06,838][1653645] Updated weights for policy 0, policy_version 919430 (0.0009) [2024-06-15 23:14:07,662][1653645] Updated weights for policy 0, policy_version 919488 (0.0013) [2024-06-15 23:14:08,374][1653645] Updated weights for policy 0, policy_version 919536 (0.0011) [2024-06-15 23:14:09,741][1653645] Updated weights for policy 0, policy_version 919587 (0.0012) [2024-06-15 23:14:10,391][1653645] Updated weights for policy 0, policy_version 919637 (0.0009) [2024-06-15 23:14:10,958][1648982] Fps is (10 sec: 95029.4, 60 sec: 80827.8, 300 sec: 73755.8). Total num frames: 1883504640. Throughput: 0: 19899.8. Samples: 470896640. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:14:10,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:14:13,303][1653645] Updated weights for policy 0, policy_version 919696 (0.0013) [2024-06-15 23:14:14,160][1653645] Updated weights for policy 0, policy_version 919760 (0.0012) [2024-06-15 23:14:15,880][1653645] Updated weights for policy 0, policy_version 919810 (0.0012) [2024-06-15 23:14:15,958][1648982] Fps is (10 sec: 78642.8, 60 sec: 78643.5, 300 sec: 73200.4). Total num frames: 1883766784. Throughput: 0: 19945.3. Samples: 471020032. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:14:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:14:16,068][1651596] Signal inference workers to stop experience collection... (47750 times) [2024-06-15 23:14:16,133][1653645] InferenceWorker_p0-w0: stopping experience collection (47750 times) [2024-06-15 23:14:16,206][1651596] Signal inference workers to resume experience collection... 
(47750 times) [2024-06-15 23:14:16,207][1653645] InferenceWorker_p0-w0: resuming experience collection (47750 times) [2024-06-15 23:14:16,788][1653645] Updated weights for policy 0, policy_version 919874 (0.0012) [2024-06-15 23:14:17,352][1653645] Updated weights for policy 0, policy_version 919923 (0.0010) [2024-06-15 23:14:19,958][1653645] Updated weights for policy 0, policy_version 919952 (0.0010) [2024-06-15 23:14:20,787][1653645] Updated weights for policy 0, policy_version 920004 (0.0035) [2024-06-15 23:14:20,958][1648982] Fps is (10 sec: 68812.2, 60 sec: 79735.4, 300 sec: 73866.9). Total num frames: 1884192768. Throughput: 0: 19729.1. Samples: 471077376. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:14:20,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 23:14:21,480][1653645] Updated weights for policy 0, policy_version 920060 (0.0012) [2024-06-15 23:14:22,761][1653645] Updated weights for policy 0, policy_version 920113 (0.0011) [2024-06-15 23:14:23,266][1653645] Updated weights for policy 0, policy_version 920147 (0.0029) [2024-06-15 23:14:25,958][1648982] Fps is (10 sec: 78641.7, 60 sec: 78643.2, 300 sec: 73755.8). Total num frames: 1884553216. Throughput: 0: 19865.6. Samples: 471195648. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:14:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:14:26,543][1653645] Updated weights for policy 0, policy_version 920208 (0.0011) [2024-06-15 23:14:27,250][1653645] Updated weights for policy 0, policy_version 920260 (0.0011) [2024-06-15 23:14:28,277][1653645] Updated weights for policy 0, policy_version 920321 (0.0011) [2024-06-15 23:14:29,081][1653645] Updated weights for policy 0, policy_version 920384 (0.0012) [2024-06-15 23:14:29,767][1653645] Updated weights for policy 0, policy_version 920439 (0.0010) [2024-06-15 23:14:30,958][1648982] Fps is (10 sec: 88475.0, 60 sec: 78643.4, 300 sec: 73977.9). Total num frames: 1885077504. 
Throughput: 0: 20002.2. Samples: 471319552. Policy #0 lag: (min: 15.0, avg: 101.4, max: 271.0) [2024-06-15 23:14:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:14:33,371][1653645] Updated weights for policy 0, policy_version 920481 (0.0010) [2024-06-15 23:14:34,032][1653645] Updated weights for policy 0, policy_version 920532 (0.0011) [2024-06-15 23:14:35,136][1653645] Updated weights for policy 0, policy_version 920608 (0.0012) [2024-06-15 23:14:35,958][1648982] Fps is (10 sec: 95026.7, 60 sec: 81373.7, 300 sec: 74644.3). Total num frames: 1885503488. Throughput: 0: 20241.0. Samples: 471390720. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:14:35,959][1648982] Avg episode reward: [(0, '37.530')] [2024-06-15 23:14:36,352][1653645] Updated weights for policy 0, policy_version 920702 (0.0012) [2024-06-15 23:14:40,342][1651596] Signal inference workers to stop experience collection... (47800 times) [2024-06-15 23:14:40,438][1653645] InferenceWorker_p0-w0: stopping experience collection (47800 times) [2024-06-15 23:14:40,493][1651596] Signal inference workers to resume experience collection... (47800 times) [2024-06-15 23:14:40,494][1653645] InferenceWorker_p0-w0: resuming experience collection (47800 times) [2024-06-15 23:14:40,725][1653645] Updated weights for policy 0, policy_version 920768 (0.0085) [2024-06-15 23:14:40,958][1648982] Fps is (10 sec: 68806.4, 60 sec: 80280.7, 300 sec: 74310.9). Total num frames: 1885765632. Throughput: 0: 20286.2. Samples: 471508480. 
Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:14:40,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:14:41,468][1653645] Updated weights for policy 0, policy_version 920828 (0.0011) [2024-06-15 23:14:42,293][1653645] Updated weights for policy 0, policy_version 920880 (0.0009) [2024-06-15 23:14:43,177][1653645] Updated weights for policy 0, policy_version 920944 (0.0011) [2024-06-15 23:14:45,958][1648982] Fps is (10 sec: 62259.1, 60 sec: 78642.9, 300 sec: 74089.0). Total num frames: 1886126080. Throughput: 0: 20138.7. Samples: 471624192. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:14:45,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:14:46,978][1653645] Updated weights for policy 0, policy_version 921002 (0.0092) [2024-06-15 23:14:47,652][1653645] Updated weights for policy 0, policy_version 921048 (0.0012) [2024-06-15 23:14:48,631][1653645] Updated weights for policy 0, policy_version 921104 (0.0080) [2024-06-15 23:14:49,535][1653645] Updated weights for policy 0, policy_version 921170 (0.0011) [2024-06-15 23:14:50,958][1648982] Fps is (10 sec: 88481.4, 60 sec: 79189.5, 300 sec: 75088.7). Total num frames: 1886650368. Throughput: 0: 20161.4. Samples: 471680000. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:14:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:14:52,871][1653645] Updated weights for policy 0, policy_version 921219 (0.0012) [2024-06-15 23:14:53,631][1653645] Updated weights for policy 0, policy_version 921280 (0.0010) [2024-06-15 23:14:54,448][1653645] Updated weights for policy 0, policy_version 921338 (0.0014) [2024-06-15 23:14:55,421][1653645] Updated weights for policy 0, policy_version 921398 (0.0012) [2024-06-15 23:14:55,958][1648982] Fps is (10 sec: 98305.2, 60 sec: 81919.9, 300 sec: 75310.9). Total num frames: 1887109120. Throughput: 0: 20150.0. Samples: 471803392. 
Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:14:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:14:56,197][1653645] Updated weights for policy 0, policy_version 921456 (0.0010) [2024-06-15 23:14:56,197][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000921456_1887141888.pth... [2024-06-15 23:14:56,234][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000912544_1868890112.pth [2024-06-15 23:14:59,765][1653645] Updated weights for policy 0, policy_version 921507 (0.0010) [2024-06-15 23:15:00,431][1653645] Updated weights for policy 0, policy_version 921557 (0.0013) [2024-06-15 23:15:00,958][1648982] Fps is (10 sec: 78643.0, 60 sec: 81374.2, 300 sec: 74866.6). Total num frames: 1887436800. Throughput: 0: 20013.5. Samples: 471920640. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:00,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:15:01,888][1653645] Updated weights for policy 0, policy_version 921637 (0.0011) [2024-06-15 23:15:02,054][1651596] Signal inference workers to stop experience collection... (47850 times) [2024-06-15 23:15:02,100][1653645] InferenceWorker_p0-w0: stopping experience collection (47850 times) [2024-06-15 23:15:02,195][1651596] Signal inference workers to resume experience collection... (47850 times) [2024-06-15 23:15:02,196][1653645] InferenceWorker_p0-w0: resuming experience collection (47850 times) [2024-06-15 23:15:05,958][1648982] Fps is (10 sec: 58983.6, 60 sec: 78643.2, 300 sec: 74977.7). Total num frames: 1887698944. Throughput: 0: 19797.4. Samples: 471968256. 
Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:05,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 23:15:06,520][1653645] Updated weights for policy 0, policy_version 921731 (0.0012) [2024-06-15 23:15:07,450][1653645] Updated weights for policy 0, policy_version 921796 (0.0011) [2024-06-15 23:15:08,018][1653645] Updated weights for policy 0, policy_version 921845 (0.0009) [2024-06-15 23:15:08,652][1653645] Updated weights for policy 0, policy_version 921888 (0.0012) [2024-06-15 23:15:09,491][1653645] Updated weights for policy 0, policy_version 921952 (0.0091) [2024-06-15 23:15:10,958][1648982] Fps is (10 sec: 78643.1, 60 sec: 78643.2, 300 sec: 75533.0). Total num frames: 1888223232. Throughput: 0: 19854.3. Samples: 472089088. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:15:13,256][1653645] Updated weights for policy 0, policy_version 922000 (0.0012) [2024-06-15 23:15:14,049][1653645] Updated weights for policy 0, policy_version 922051 (0.0011) [2024-06-15 23:15:14,911][1653645] Updated weights for policy 0, policy_version 922113 (0.0011) [2024-06-15 23:15:15,670][1653645] Updated weights for policy 0, policy_version 922176 (0.0010) [2024-06-15 23:15:15,958][1648982] Fps is (10 sec: 95024.4, 60 sec: 81373.5, 300 sec: 75199.7). Total num frames: 1888649216. Throughput: 0: 19774.4. Samples: 472209408. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:15:20,275][1653645] Updated weights for policy 0, policy_version 922256 (0.0012) [2024-06-15 23:15:20,963][1648982] Fps is (10 sec: 65500.8, 60 sec: 78090.2, 300 sec: 75309.5). Total num frames: 1888878592. Throughput: 0: 19487.9. Samples: 472267776. 
Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:20,967][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:15:21,173][1653645] Updated weights for policy 0, policy_version 922320 (0.0012) [2024-06-15 23:15:22,177][1653645] Updated weights for policy 0, policy_version 922387 (0.0012) [2024-06-15 23:15:23,120][1653645] Updated weights for policy 0, policy_version 922464 (0.0010) [2024-06-15 23:15:25,961][1648982] Fps is (10 sec: 62241.7, 60 sec: 78639.5, 300 sec: 75532.3). Total num frames: 1889271808. Throughput: 0: 19318.5. Samples: 472377856. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:25,961][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:15:27,119][1653645] Updated weights for policy 0, policy_version 922512 (0.0011) [2024-06-15 23:15:27,194][1651596] Signal inference workers to stop experience collection... (47900 times) [2024-06-15 23:15:27,230][1653645] InferenceWorker_p0-w0: stopping experience collection (47900 times) [2024-06-15 23:15:27,326][1651596] Signal inference workers to resume experience collection... (47900 times) [2024-06-15 23:15:27,327][1653645] InferenceWorker_p0-w0: resuming experience collection (47900 times) [2024-06-15 23:15:27,992][1653645] Updated weights for policy 0, policy_version 922576 (0.0079) [2024-06-15 23:15:28,765][1653645] Updated weights for policy 0, policy_version 922624 (0.0090) [2024-06-15 23:15:29,621][1653645] Updated weights for policy 0, policy_version 922688 (0.0011) [2024-06-15 23:15:30,419][1653645] Updated weights for policy 0, policy_version 922746 (0.0009) [2024-06-15 23:15:30,958][1648982] Fps is (10 sec: 91799.0, 60 sec: 78643.0, 300 sec: 75533.0). Total num frames: 1889796096. Throughput: 0: 19251.3. Samples: 472490496. 
Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:30,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:15:34,167][1653645] Updated weights for policy 0, policy_version 922806 (0.0011) [2024-06-15 23:15:35,208][1653645] Updated weights for policy 0, policy_version 922870 (0.0012) [2024-06-15 23:15:35,958][1648982] Fps is (10 sec: 88500.5, 60 sec: 77551.2, 300 sec: 76088.4). Total num frames: 1890156544. Throughput: 0: 19785.9. Samples: 472570368. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:35,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 23:15:36,293][1653645] Updated weights for policy 0, policy_version 922946 (0.0012) [2024-06-15 23:15:37,002][1653645] Updated weights for policy 0, policy_version 923000 (0.0011) [2024-06-15 23:15:40,387][1653645] Updated weights for policy 0, policy_version 923058 (0.0010) [2024-06-15 23:15:40,958][1648982] Fps is (10 sec: 68813.1, 60 sec: 78644.3, 300 sec: 76088.4). Total num frames: 1890484224. Throughput: 0: 19695.0. Samples: 472689664. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:40,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:15:41,405][1653645] Updated weights for policy 0, policy_version 923130 (0.0081) [2024-06-15 23:15:42,437][1653645] Updated weights for policy 0, policy_version 923200 (0.0011) [2024-06-15 23:15:43,226][1653645] Updated weights for policy 0, policy_version 923253 (0.0011) [2024-06-15 23:15:45,960][1648982] Fps is (10 sec: 68812.1, 60 sec: 78643.4, 300 sec: 75533.1). Total num frames: 1890844672. Throughput: 0: 19558.4. Samples: 472800768. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:45,961][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:15:46,775][1653645] Updated weights for policy 0, policy_version 923301 (0.0013) [2024-06-15 23:15:47,661][1651596] Signal inference workers to stop experience collection... 
(47950 times) [2024-06-15 23:15:47,695][1653645] InferenceWorker_p0-w0: stopping experience collection (47950 times) [2024-06-15 23:15:47,698][1653645] Updated weights for policy 0, policy_version 923363 (0.0013) [2024-06-15 23:15:47,802][1651596] Signal inference workers to resume experience collection... (47950 times) [2024-06-15 23:15:47,803][1653645] InferenceWorker_p0-w0: resuming experience collection (47950 times) [2024-06-15 23:15:48,577][1653645] Updated weights for policy 0, policy_version 923427 (0.0011) [2024-06-15 23:15:49,454][1653645] Updated weights for policy 0, policy_version 923488 (0.0072) [2024-06-15 23:15:50,958][1648982] Fps is (10 sec: 88473.9, 60 sec: 78643.2, 300 sec: 76421.7). Total num frames: 1891368960. Throughput: 0: 19888.3. Samples: 472863232. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:50,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 23:15:53,018][1653645] Updated weights for policy 0, policy_version 923536 (0.0013) [2024-06-15 23:15:54,284][1653645] Updated weights for policy 0, policy_version 923619 (0.0011) [2024-06-15 23:15:55,562][1653645] Updated weights for policy 0, policy_version 923680 (0.0011) [2024-06-15 23:15:55,958][1648982] Fps is (10 sec: 88471.0, 60 sec: 77004.5, 300 sec: 76754.8). Total num frames: 1891729408. Throughput: 0: 19660.6. Samples: 472973824. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:15:55,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:15:56,589][1653645] Updated weights for policy 0, policy_version 923747 (0.0011) [2024-06-15 23:15:56,855][1653645] Updated weights for policy 0, policy_version 923775 (0.0009) [2024-06-15 23:16:00,958][1648982] Fps is (10 sec: 62259.1, 60 sec: 75912.5, 300 sec: 75866.2). Total num frames: 1891991552. Throughput: 0: 19649.5. Samples: 473093632. 
Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:16:00,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:16:00,968][1653645] Updated weights for policy 0, policy_version 923829 (0.0011) [2024-06-15 23:16:01,886][1653645] Updated weights for policy 0, policy_version 923895 (0.0064) [2024-06-15 23:16:02,732][1653645] Updated weights for policy 0, policy_version 923957 (0.0088) [2024-06-15 23:16:03,425][1653645] Updated weights for policy 0, policy_version 924016 (0.0010) [2024-06-15 23:16:05,958][1648982] Fps is (10 sec: 68815.5, 60 sec: 78643.1, 300 sec: 76310.6). Total num frames: 1892417536. Throughput: 0: 19412.8. Samples: 473141248. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:16:05,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:16:07,056][1653645] Updated weights for policy 0, policy_version 924061 (0.0083) [2024-06-15 23:16:07,916][1653645] Updated weights for policy 0, policy_version 924116 (0.0012) [2024-06-15 23:16:09,163][1653645] Updated weights for policy 0, policy_version 924208 (0.0083) [2024-06-15 23:16:09,494][1651596] Signal inference workers to stop experience collection... (48000 times) [2024-06-15 23:16:09,556][1653645] InferenceWorker_p0-w0: stopping experience collection (48000 times) [2024-06-15 23:16:09,669][1651596] Signal inference workers to resume experience collection... (48000 times) [2024-06-15 23:16:09,670][1653645] InferenceWorker_p0-w0: resuming experience collection (48000 times) [2024-06-15 23:16:09,905][1653645] Updated weights for policy 0, policy_version 924256 (0.0011) [2024-06-15 23:16:10,958][1648982] Fps is (10 sec: 95026.5, 60 sec: 78643.1, 300 sec: 77310.3). Total num frames: 1892941824. Throughput: 0: 19536.9. Samples: 473256960. 
Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:16:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:16:13,849][1653645] Updated weights for policy 0, policy_version 924306 (0.0012) [2024-06-15 23:16:14,619][1653645] Updated weights for policy 0, policy_version 924368 (0.0011) [2024-06-15 23:16:15,703][1653645] Updated weights for policy 0, policy_version 924433 (0.0012) [2024-06-15 23:16:15,958][1648982] Fps is (10 sec: 85196.3, 60 sec: 77005.0, 300 sec: 76643.8). Total num frames: 1893269504. Throughput: 0: 19797.3. Samples: 473381376. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:16:15,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:16:16,846][1653645] Updated weights for policy 0, policy_version 924514 (0.0081) [2024-06-15 23:16:20,958][1648982] Fps is (10 sec: 62258.3, 60 sec: 78103.8, 300 sec: 76310.5). Total num frames: 1893564416. Throughput: 0: 19285.2. Samples: 473438208. Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:16:20,961][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:16:21,100][1653645] Updated weights for policy 0, policy_version 924600 (0.0078) [2024-06-15 23:16:22,024][1653645] Updated weights for policy 0, policy_version 924672 (0.0012) [2024-06-15 23:16:22,772][1653645] Updated weights for policy 0, policy_version 924726 (0.0010) [2024-06-15 23:16:23,611][1653645] Updated weights for policy 0, policy_version 924791 (0.0014) [2024-06-15 23:16:25,958][1648982] Fps is (10 sec: 72087.9, 60 sec: 78646.8, 300 sec: 77088.0). Total num frames: 1893990400. Throughput: 0: 19091.8. Samples: 473548800. 
Policy #0 lag: (min: 15.0, avg: 75.1, max: 271.0) [2024-06-15 23:16:25,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:16:27,474][1653645] Updated weights for policy 0, policy_version 924848 (0.0076) [2024-06-15 23:16:28,182][1653645] Updated weights for policy 0, policy_version 924897 (0.0012) [2024-06-15 23:16:29,086][1653645] Updated weights for policy 0, policy_version 924962 (0.0011) [2024-06-15 23:16:30,079][1653645] Updated weights for policy 0, policy_version 925040 (0.0079) [2024-06-15 23:16:30,958][1648982] Fps is (10 sec: 95029.0, 60 sec: 78643.3, 300 sec: 77310.3). Total num frames: 1894514688. Throughput: 0: 19296.7. Samples: 473669120. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:16:30,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:16:33,938][1651596] Signal inference workers to stop experience collection... (48050 times) [2024-06-15 23:16:33,977][1653645] InferenceWorker_p0-w0: stopping experience collection (48050 times) [2024-06-15 23:16:33,978][1653645] Updated weights for policy 0, policy_version 925095 (0.0013) [2024-06-15 23:16:34,058][1651596] Signal inference workers to resume experience collection... (48050 times) [2024-06-15 23:16:34,059][1653645] InferenceWorker_p0-w0: resuming experience collection (48050 times) [2024-06-15 23:16:35,232][1653645] Updated weights for policy 0, policy_version 925168 (0.0022) [2024-06-15 23:16:35,958][1648982] Fps is (10 sec: 85199.5, 60 sec: 78097.1, 300 sec: 76977.0). Total num frames: 1894842368. Throughput: 0: 19478.8. Samples: 473739776. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:16:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:16:36,285][1653645] Updated weights for policy 0, policy_version 925248 (0.0013) [2024-06-15 23:16:40,958][1648982] Fps is (10 sec: 52428.7, 60 sec: 75912.5, 300 sec: 76754.9). Total num frames: 1895038976. Throughput: 0: 19365.1. Samples: 473845248. 
Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:16:40,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:16:41,019][1653645] Updated weights for policy 0, policy_version 925316 (0.0010) [2024-06-15 23:16:42,148][1653645] Updated weights for policy 0, policy_version 925398 (0.0013) [2024-06-15 23:16:43,027][1653645] Updated weights for policy 0, policy_version 925461 (0.0013) [2024-06-15 23:16:43,831][1653645] Updated weights for policy 0, policy_version 925520 (0.0009) [2024-06-15 23:16:44,397][1653645] Updated weights for policy 0, policy_version 925565 (0.0013) [2024-06-15 23:16:45,958][1648982] Fps is (10 sec: 72088.7, 60 sec: 78643.2, 300 sec: 77310.3). Total num frames: 1895563264. Throughput: 0: 19103.3. Samples: 473953280. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:16:45,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:16:48,738][1653645] Updated weights for policy 0, policy_version 925617 (0.0189) [2024-06-15 23:16:49,613][1653645] Updated weights for policy 0, policy_version 925684 (0.0009) [2024-06-15 23:16:50,322][1653645] Updated weights for policy 0, policy_version 925734 (0.0011) [2024-06-15 23:16:50,958][1648982] Fps is (10 sec: 95025.5, 60 sec: 77004.5, 300 sec: 77199.1). Total num frames: 1895989248. Throughput: 0: 19569.7. Samples: 474021888. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:16:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:16:51,250][1653645] Updated weights for policy 0, policy_version 925808 (0.0012) [2024-06-15 23:16:55,217][1653645] Updated weights for policy 0, policy_version 925856 (0.0021) [2024-06-15 23:16:55,958][1648982] Fps is (10 sec: 68811.6, 60 sec: 75366.5, 300 sec: 77310.3). Total num frames: 1896251392. Throughput: 0: 19524.2. Samples: 474135552. 
Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:16:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:16:56,092][1653645] Updated weights for policy 0, policy_version 925920 (0.0010) [2024-06-15 23:16:56,093][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000925920_1896284160.pth... [2024-06-15 23:16:56,176][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000916784_1877573632.pth [2024-06-15 23:16:56,630][1651596] Signal inference workers to stop experience collection... (48100 times) [2024-06-15 23:16:56,698][1653645] InferenceWorker_p0-w0: stopping experience collection (48100 times) [2024-06-15 23:16:56,761][1651596] Signal inference workers to resume experience collection... (48100 times) [2024-06-15 23:16:56,762][1653645] InferenceWorker_p0-w0: resuming experience collection (48100 times) [2024-06-15 23:16:57,016][1653645] Updated weights for policy 0, policy_version 925984 (0.0011) [2024-06-15 23:16:57,766][1653645] Updated weights for policy 0, policy_version 926034 (0.0012) [2024-06-15 23:17:00,958][1648982] Fps is (10 sec: 62260.8, 60 sec: 77004.9, 300 sec: 77310.3). Total num frames: 1896611840. Throughput: 0: 19330.9. Samples: 474251264. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:17:01,638][1653645] Updated weights for policy 0, policy_version 926099 (0.0012) [2024-06-15 23:17:02,556][1653645] Updated weights for policy 0, policy_version 926163 (0.0011) [2024-06-15 23:17:03,560][1653645] Updated weights for policy 0, policy_version 926240 (0.0071) [2024-06-15 23:17:04,627][1653645] Updated weights for policy 0, policy_version 926320 (0.0010) [2024-06-15 23:17:05,958][1648982] Fps is (10 sec: 88475.5, 60 sec: 78643.1, 300 sec: 77754.6). Total num frames: 1897136128. Throughput: 0: 19217.1. Samples: 474302976. 
Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:05,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:17:08,479][1653645] Updated weights for policy 0, policy_version 926370 (0.0011) [2024-06-15 23:17:09,376][1653645] Updated weights for policy 0, policy_version 926433 (0.0012) [2024-06-15 23:17:10,224][1653645] Updated weights for policy 0, policy_version 926496 (0.0011) [2024-06-15 23:17:10,958][1648982] Fps is (10 sec: 95026.8, 60 sec: 77004.9, 300 sec: 77865.7). Total num frames: 1897562112. Throughput: 0: 19410.6. Samples: 474422272. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:10,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:17:11,043][1653645] Updated weights for policy 0, policy_version 926547 (0.0010) [2024-06-15 23:17:14,965][1653645] Updated weights for policy 0, policy_version 926597 (0.0010) [2024-06-15 23:17:15,784][1653645] Updated weights for policy 0, policy_version 926658 (0.0011) [2024-06-15 23:17:15,958][1648982] Fps is (10 sec: 68812.0, 60 sec: 75912.4, 300 sec: 77865.6). Total num frames: 1897824256. Throughput: 0: 19410.4. Samples: 474542592. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:17:16,494][1653645] Updated weights for policy 0, policy_version 926720 (0.0010) [2024-06-15 23:17:17,360][1653645] Updated weights for policy 0, policy_version 926776 (0.0012) [2024-06-15 23:17:17,749][1651596] Signal inference workers to stop experience collection... (48150 times) [2024-06-15 23:17:17,785][1653645] InferenceWorker_p0-w0: stopping experience collection (48150 times) [2024-06-15 23:17:17,888][1651596] Signal inference workers to resume experience collection... 
(48150 times) [2024-06-15 23:17:17,889][1653645] InferenceWorker_p0-w0: resuming experience collection (48150 times) [2024-06-15 23:17:18,132][1653645] Updated weights for policy 0, policy_version 926834 (0.0011) [2024-06-15 23:17:20,958][1648982] Fps is (10 sec: 62259.0, 60 sec: 77005.0, 300 sec: 77310.3). Total num frames: 1898184704. Throughput: 0: 18955.4. Samples: 474592768. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:17:22,248][1653645] Updated weights for policy 0, policy_version 926880 (0.0025) [2024-06-15 23:17:23,031][1653645] Updated weights for policy 0, policy_version 926933 (0.0011) [2024-06-15 23:17:23,785][1653645] Updated weights for policy 0, policy_version 926992 (0.0010) [2024-06-15 23:17:24,683][1653645] Updated weights for policy 0, policy_version 927056 (0.0012) [2024-06-15 23:17:25,265][1653645] Updated weights for policy 0, policy_version 927097 (0.0011) [2024-06-15 23:17:25,958][1648982] Fps is (10 sec: 88472.8, 60 sec: 78643.2, 300 sec: 78198.8). Total num frames: 1898708992. Throughput: 0: 19285.2. Samples: 474713088. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:25,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:17:28,972][1653645] Updated weights for policy 0, policy_version 927144 (0.0076) [2024-06-15 23:17:29,840][1653645] Updated weights for policy 0, policy_version 927207 (0.0012) [2024-06-15 23:17:30,760][1653645] Updated weights for policy 0, policy_version 927280 (0.0011) [2024-06-15 23:17:30,957][1648982] Fps is (10 sec: 88474.4, 60 sec: 75912.6, 300 sec: 78205.4). Total num frames: 1899069440. Throughput: 0: 19501.6. Samples: 474830848. 
Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:17:31,655][1653645] Updated weights for policy 0, policy_version 927346 (0.0010) [2024-06-15 23:17:35,958][1648982] Fps is (10 sec: 58983.9, 60 sec: 74274.1, 300 sec: 77532.4). Total num frames: 1899298816. Throughput: 0: 19285.4. Samples: 474889728. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:17:36,157][1653645] Updated weights for policy 0, policy_version 927413 (0.0010) [2024-06-15 23:17:37,098][1653645] Updated weights for policy 0, policy_version 927488 (0.0012) [2024-06-15 23:17:38,088][1653645] Updated weights for policy 0, policy_version 927554 (0.0010) [2024-06-15 23:17:38,725][1653645] Updated weights for policy 0, policy_version 927610 (0.0010) [2024-06-15 23:17:40,958][1648982] Fps is (10 sec: 68812.3, 60 sec: 78643.2, 300 sec: 77976.8). Total num frames: 1899757568. Throughput: 0: 19137.5. Samples: 474996736. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:40,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:17:42,842][1651596] Signal inference workers to stop experience collection... (48200 times) [2024-06-15 23:17:42,857][1653645] InferenceWorker_p0-w0: stopping experience collection (48200 times) [2024-06-15 23:17:42,903][1651596] Signal inference workers to resume experience collection... 
(48200 times) [2024-06-15 23:17:42,912][1653645] InferenceWorker_p0-w0: resuming experience collection (48200 times) [2024-06-15 23:17:43,026][1653645] Updated weights for policy 0, policy_version 927683 (0.0011) [2024-06-15 23:17:44,028][1653645] Updated weights for policy 0, policy_version 927747 (0.0013) [2024-06-15 23:17:44,884][1653645] Updated weights for policy 0, policy_version 927810 (0.0012) [2024-06-15 23:17:45,623][1653645] Updated weights for policy 0, policy_version 927866 (0.0010) [2024-06-15 23:17:45,958][1648982] Fps is (10 sec: 98304.3, 60 sec: 78643.3, 300 sec: 78532.2). Total num frames: 1900281856. Throughput: 0: 19035.0. Samples: 475107840. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:17:49,574][1653645] Updated weights for policy 0, policy_version 927925 (0.0011) [2024-06-15 23:17:50,449][1653645] Updated weights for policy 0, policy_version 927988 (0.0011) [2024-06-15 23:17:50,958][1648982] Fps is (10 sec: 81919.8, 60 sec: 76458.9, 300 sec: 78198.9). Total num frames: 1900576768. Throughput: 0: 19512.9. Samples: 475181056. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:17:51,152][1653645] Updated weights for policy 0, policy_version 928039 (0.0011) [2024-06-15 23:17:52,118][1653645] Updated weights for policy 0, policy_version 928112 (0.0011) [2024-06-15 23:17:55,452][1653645] Updated weights for policy 0, policy_version 928160 (0.0011) [2024-06-15 23:17:55,958][1648982] Fps is (10 sec: 65535.8, 60 sec: 78097.4, 300 sec: 78643.3). Total num frames: 1900937216. Throughput: 0: 19456.0. Samples: 475297792. 
Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:17:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:17:56,161][1653645] Updated weights for policy 0, policy_version 928194 (0.0010) [2024-06-15 23:17:56,944][1653645] Updated weights for policy 0, policy_version 928255 (0.0011) [2024-06-15 23:17:57,906][1653645] Updated weights for policy 0, policy_version 928316 (0.0108) [2024-06-15 23:17:58,589][1653645] Updated weights for policy 0, policy_version 928358 (0.0011) [2024-06-15 23:18:00,958][1648982] Fps is (10 sec: 75364.0, 60 sec: 78642.7, 300 sec: 78198.8). Total num frames: 1901330432. Throughput: 0: 19478.7. Samples: 475419136. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:18:00,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:18:01,430][1653645] Updated weights for policy 0, policy_version 928416 (0.0011) [2024-06-15 23:18:02,978][1653645] Updated weights for policy 0, policy_version 928481 (0.0011) [2024-06-15 23:18:03,893][1653645] Updated weights for policy 0, policy_version 928528 (0.0012) [2024-06-15 23:18:04,434][1651596] Signal inference workers to stop experience collection... (48250 times) [2024-06-15 23:18:04,486][1653645] InferenceWorker_p0-w0: stopping experience collection (48250 times) [2024-06-15 23:18:04,581][1651596] Signal inference workers to resume experience collection... (48250 times) [2024-06-15 23:18:04,582][1653645] InferenceWorker_p0-w0: resuming experience collection (48250 times) [2024-06-15 23:18:05,053][1653645] Updated weights for policy 0, policy_version 928608 (0.0021) [2024-06-15 23:18:05,958][1648982] Fps is (10 sec: 91751.0, 60 sec: 78643.3, 300 sec: 78643.2). Total num frames: 1901854720. Throughput: 0: 19649.4. Samples: 475476992. 
Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:18:05,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:18:07,779][1653645] Updated weights for policy 0, policy_version 928657 (0.0012) [2024-06-15 23:18:08,282][1653645] Updated weights for policy 0, policy_version 928701 (0.0013) [2024-06-15 23:18:09,968][1653645] Updated weights for policy 0, policy_version 928752 (0.0012) [2024-06-15 23:18:10,845][1653645] Updated weights for policy 0, policy_version 928816 (0.0021) [2024-06-15 23:18:10,958][1648982] Fps is (10 sec: 88474.5, 60 sec: 77550.6, 300 sec: 78532.1). Total num frames: 1902215168. Throughput: 0: 19729.1. Samples: 475600896. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:18:10,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:18:11,769][1653645] Updated weights for policy 0, policy_version 928880 (0.0014) [2024-06-15 23:18:14,230][1653645] Updated weights for policy 0, policy_version 928929 (0.0014) [2024-06-15 23:18:15,958][1648982] Fps is (10 sec: 65535.9, 60 sec: 78097.3, 300 sec: 78310.0). Total num frames: 1902510080. Throughput: 0: 19785.9. Samples: 475721216. Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:18:15,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:18:16,124][1653645] Updated weights for policy 0, policy_version 928976 (0.0010) [2024-06-15 23:18:16,879][1653645] Updated weights for policy 0, policy_version 929026 (0.0010) [2024-06-15 23:18:17,769][1653645] Updated weights for policy 0, policy_version 929092 (0.0084) [2024-06-15 23:18:20,517][1653645] Updated weights for policy 0, policy_version 929153 (0.0010) [2024-06-15 23:18:20,958][1648982] Fps is (10 sec: 75368.2, 60 sec: 79735.5, 300 sec: 78421.1). Total num frames: 1902968832. Throughput: 0: 19649.4. Samples: 475773952. 
Policy #0 lag: (min: 107.0, avg: 155.4, max: 363.0) [2024-06-15 23:18:20,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:18:21,173][1653645] Updated weights for policy 0, policy_version 929212 (0.0009) [2024-06-15 23:18:23,638][1653645] Updated weights for policy 0, policy_version 929265 (0.0026) [2024-06-15 23:18:24,503][1653645] Updated weights for policy 0, policy_version 929330 (0.0009) [2024-06-15 23:18:25,304][1653645] Updated weights for policy 0, policy_version 929394 (0.0011) [2024-06-15 23:18:25,958][1648982] Fps is (10 sec: 91750.3, 60 sec: 78643.6, 300 sec: 78198.9). Total num frames: 1903427584. Throughput: 0: 20013.5. Samples: 475897344. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:18:25,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:18:26,976][1653645] Updated weights for policy 0, policy_version 929427 (0.0010) [2024-06-15 23:18:29,271][1653645] Updated weights for policy 0, policy_version 929488 (0.0011) [2024-06-15 23:18:29,359][1651596] Signal inference workers to stop experience collection... (48300 times) [2024-06-15 23:18:29,406][1653645] InferenceWorker_p0-w0: stopping experience collection (48300 times) [2024-06-15 23:18:29,519][1651596] Signal inference workers to resume experience collection... (48300 times) [2024-06-15 23:18:29,520][1653645] InferenceWorker_p0-w0: resuming experience collection (48300 times) [2024-06-15 23:18:30,189][1653645] Updated weights for policy 0, policy_version 929537 (0.0014) [2024-06-15 23:18:30,958][1648982] Fps is (10 sec: 81920.4, 60 sec: 78643.2, 300 sec: 78532.2). Total num frames: 1903788032. Throughput: 0: 20263.8. Samples: 476019712. 
Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:18:30,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:18:31,068][1653645] Updated weights for policy 0, policy_version 929602 (0.0012) [2024-06-15 23:18:31,809][1653645] Updated weights for policy 0, policy_version 929662 (0.0010) [2024-06-15 23:18:33,787][1653645] Updated weights for policy 0, policy_version 929712 (0.0018) [2024-06-15 23:18:35,861][1653645] Updated weights for policy 0, policy_version 929765 (0.0011) [2024-06-15 23:18:35,958][1648982] Fps is (10 sec: 75366.6, 60 sec: 81373.9, 300 sec: 78754.3). Total num frames: 1904181248. Throughput: 0: 20093.2. Samples: 476085248. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:18:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:18:36,543][1653645] Updated weights for policy 0, policy_version 929811 (0.0020) [2024-06-15 23:18:37,696][1653645] Updated weights for policy 0, policy_version 929888 (0.0066) [2024-06-15 23:18:40,063][1653645] Updated weights for policy 0, policy_version 929936 (0.0010) [2024-06-15 23:18:40,958][1648982] Fps is (10 sec: 81919.5, 60 sec: 80827.7, 300 sec: 78643.2). Total num frames: 1904607232. Throughput: 0: 20036.3. Samples: 476199424. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:18:40,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:18:41,855][1653645] Updated weights for policy 0, policy_version 929985 (0.0011) [2024-06-15 23:18:42,599][1653645] Updated weights for policy 0, policy_version 930035 (0.0011) [2024-06-15 23:18:43,264][1653645] Updated weights for policy 0, policy_version 930083 (0.0013) [2024-06-15 23:18:44,241][1653645] Updated weights for policy 0, policy_version 930160 (0.0012) [2024-06-15 23:18:45,958][1648982] Fps is (10 sec: 81919.3, 60 sec: 78643.1, 300 sec: 78310.0). Total num frames: 1905000448. Throughput: 0: 20036.4. Samples: 476320768. 
Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:18:45,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:18:47,027][1653645] Updated weights for policy 0, policy_version 930224 (0.0068) [2024-06-15 23:18:48,912][1653645] Updated weights for policy 0, policy_version 930272 (0.0012) [2024-06-15 23:18:49,743][1653645] Updated weights for policy 0, policy_version 930322 (0.0017) [2024-06-15 23:18:50,537][1653645] Updated weights for policy 0, policy_version 930384 (0.0011) [2024-06-15 23:18:50,605][1651596] Signal inference workers to stop experience collection... (48350 times) [2024-06-15 23:18:50,679][1653645] InferenceWorker_p0-w0: stopping experience collection (48350 times) [2024-06-15 23:18:50,762][1651596] Signal inference workers to resume experience collection... (48350 times) [2024-06-15 23:18:50,763][1653645] InferenceWorker_p0-w0: resuming experience collection (48350 times) [2024-06-15 23:18:50,958][1648982] Fps is (10 sec: 85197.2, 60 sec: 81373.9, 300 sec: 78865.4). Total num frames: 1905459200. Throughput: 0: 20138.7. Samples: 476383232. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:18:50,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:18:53,571][1653645] Updated weights for policy 0, policy_version 930448 (0.0011) [2024-06-15 23:18:54,203][1653645] Updated weights for policy 0, policy_version 930490 (0.0028) [2024-06-15 23:18:55,808][1653645] Updated weights for policy 0, policy_version 930544 (0.0010) [2024-06-15 23:18:55,958][1648982] Fps is (10 sec: 75366.4, 60 sec: 80281.6, 300 sec: 78643.2). Total num frames: 1905754112. Throughput: 0: 19899.8. Samples: 476496384. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:18:55,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:18:56,342][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000930576_1905819648.pth... 
[2024-06-15 23:18:56,448][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000921456_1887141888.pth [2024-06-15 23:18:56,983][1653645] Updated weights for policy 0, policy_version 930615 (0.0011) [2024-06-15 23:18:57,864][1653645] Updated weights for policy 0, policy_version 930684 (0.0012) [2024-06-15 23:19:00,862][1653645] Updated weights for policy 0, policy_version 930743 (0.0010) [2024-06-15 23:19:00,960][1648982] Fps is (10 sec: 68811.6, 60 sec: 80281.9, 300 sec: 78532.1). Total num frames: 1906147328. Throughput: 0: 19933.8. Samples: 476618240. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:00,963][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 23:19:02,307][1653645] Updated weights for policy 0, policy_version 930785 (0.0011) [2024-06-15 23:19:03,473][1653645] Updated weights for policy 0, policy_version 930864 (0.0151) [2024-06-15 23:19:04,264][1653645] Updated weights for policy 0, policy_version 930915 (0.0010) [2024-06-15 23:19:05,958][1648982] Fps is (10 sec: 81920.4, 60 sec: 78643.2, 300 sec: 78198.9). Total num frames: 1906573312. Throughput: 0: 19945.3. Samples: 476671488. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:05,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:19:06,711][1653645] Updated weights for policy 0, policy_version 930945 (0.0051) [2024-06-15 23:19:07,377][1653645] Updated weights for policy 0, policy_version 931001 (0.0019) [2024-06-15 23:19:08,788][1653645] Updated weights for policy 0, policy_version 931041 (0.0012) [2024-06-15 23:19:09,766][1653645] Updated weights for policy 0, policy_version 931110 (0.0101) [2024-06-15 23:19:10,895][1653645] Updated weights for policy 0, policy_version 931193 (0.0018) [2024-06-15 23:19:10,958][1648982] Fps is (10 sec: 95027.1, 60 sec: 81374.0, 300 sec: 79087.5). Total num frames: 1907097600. Throughput: 0: 19899.6. Samples: 476792832. 
Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:10,960][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 23:19:13,947][1653645] Updated weights for policy 0, policy_version 931236 (0.0014) [2024-06-15 23:19:15,255][1653645] Updated weights for policy 0, policy_version 931280 (0.0011) [2024-06-15 23:19:15,816][1651596] Signal inference workers to stop experience collection... (48400 times) [2024-06-15 23:19:15,846][1653645] InferenceWorker_p0-w0: stopping experience collection (48400 times) [2024-06-15 23:19:15,947][1651596] Signal inference workers to resume experience collection... (48400 times) [2024-06-15 23:19:15,948][1653645] InferenceWorker_p0-w0: resuming experience collection (48400 times) [2024-06-15 23:19:15,958][1648982] Fps is (10 sec: 78643.2, 60 sec: 80827.7, 300 sec: 78532.1). Total num frames: 1907359744. Throughput: 0: 19990.7. Samples: 476919296. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:15,958][1648982] Avg episode reward: [(0, '37.550')] [2024-06-15 23:19:16,420][1653645] Updated weights for policy 0, policy_version 931360 (0.0090) [2024-06-15 23:19:17,172][1653645] Updated weights for policy 0, policy_version 931410 (0.0010) [2024-06-15 23:19:20,127][1653645] Updated weights for policy 0, policy_version 931474 (0.0093) [2024-06-15 23:19:20,599][1653645] Updated weights for policy 0, policy_version 931517 (0.0011) [2024-06-15 23:19:20,958][1648982] Fps is (10 sec: 65536.8, 60 sec: 79735.4, 300 sec: 78643.2). Total num frames: 1907752960. Throughput: 0: 19569.7. Samples: 476965888. 
Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:20,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:19:22,277][1653645] Updated weights for policy 0, policy_version 931569 (0.0012) [2024-06-15 23:19:23,063][1653645] Updated weights for policy 0, policy_version 931632 (0.0016) [2024-06-15 23:19:23,835][1653645] Updated weights for policy 0, policy_version 931680 (0.0009) [2024-06-15 23:19:25,958][1648982] Fps is (10 sec: 78643.4, 60 sec: 78643.2, 300 sec: 78198.9). Total num frames: 1908146176. Throughput: 0: 19786.0. Samples: 477089792. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:19:26,134][1653645] Updated weights for policy 0, policy_version 931717 (0.0012) [2024-06-15 23:19:27,759][1653645] Updated weights for policy 0, policy_version 931778 (0.0011) [2024-06-15 23:19:28,501][1653645] Updated weights for policy 0, policy_version 931835 (0.0070) [2024-06-15 23:19:29,376][1653645] Updated weights for policy 0, policy_version 931904 (0.0012) [2024-06-15 23:19:30,958][1648982] Fps is (10 sec: 85196.2, 60 sec: 80281.4, 300 sec: 78310.0). Total num frames: 1908604928. Throughput: 0: 19774.5. Samples: 477210624. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:30,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:19:31,119][1653645] Updated weights for policy 0, policy_version 931959 (0.0012) [2024-06-15 23:19:33,110][1653645] Updated weights for policy 0, policy_version 932004 (0.0075) [2024-06-15 23:19:35,083][1653645] Updated weights for policy 0, policy_version 932069 (0.0011) [2024-06-15 23:19:35,874][1653645] Updated weights for policy 0, policy_version 932129 (0.0011) [2024-06-15 23:19:35,958][1648982] Fps is (10 sec: 85197.0, 60 sec: 80281.6, 300 sec: 78754.5). Total num frames: 1908998144. Throughput: 0: 19717.7. Samples: 477270528. 
Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:35,960][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:19:37,281][1653645] Updated weights for policy 0, policy_version 932177 (0.0011) [2024-06-15 23:19:39,313][1653645] Updated weights for policy 0, policy_version 932240 (0.0012) [2024-06-15 23:19:40,958][1648982] Fps is (10 sec: 72090.3, 60 sec: 78643.2, 300 sec: 78643.3). Total num frames: 1909325824. Throughput: 0: 19751.8. Samples: 477385216. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:19:41,517][1651596] Signal inference workers to stop experience collection... (48450 times) [2024-06-15 23:19:41,541][1653645] InferenceWorker_p0-w0: stopping experience collection (48450 times) [2024-06-15 23:19:41,683][1651596] Signal inference workers to resume experience collection... (48450 times) [2024-06-15 23:19:41,685][1653645] InferenceWorker_p0-w0: resuming experience collection (48450 times) [2024-06-15 23:19:41,686][1653645] Updated weights for policy 0, policy_version 932304 (0.0013) [2024-06-15 23:19:42,576][1653645] Updated weights for policy 0, policy_version 932353 (0.0012) [2024-06-15 23:19:43,296][1653645] Updated weights for policy 0, policy_version 932410 (0.0010) [2024-06-15 23:19:44,502][1653645] Updated weights for policy 0, policy_version 932464 (0.0011) [2024-06-15 23:19:45,958][1648982] Fps is (10 sec: 75365.4, 60 sec: 79189.3, 300 sec: 78309.9). Total num frames: 1909751808. Throughput: 0: 19660.8. Samples: 477502976. 
Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:45,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:19:46,355][1653645] Updated weights for policy 0, policy_version 932515 (0.0011) [2024-06-15 23:19:48,475][1653645] Updated weights for policy 0, policy_version 932576 (0.0012) [2024-06-15 23:19:49,337][1653645] Updated weights for policy 0, policy_version 932614 (0.0011) [2024-06-15 23:19:50,000][1653645] Updated weights for policy 0, policy_version 932668 (0.0010) [2024-06-15 23:19:50,958][1648982] Fps is (10 sec: 78643.1, 60 sec: 77550.8, 300 sec: 77976.8). Total num frames: 1910112256. Throughput: 0: 19774.6. Samples: 477561344. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:50,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:19:52,159][1653645] Updated weights for policy 0, policy_version 932738 (0.0011) [2024-06-15 23:19:52,840][1653645] Updated weights for policy 0, policy_version 932793 (0.0011) [2024-06-15 23:19:54,933][1653645] Updated weights for policy 0, policy_version 932848 (0.0012) [2024-06-15 23:19:55,958][1648982] Fps is (10 sec: 78641.4, 60 sec: 79735.1, 300 sec: 78309.9). Total num frames: 1910538240. Throughput: 0: 19717.6. Samples: 477680128. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:19:55,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:19:56,065][1653645] Updated weights for policy 0, policy_version 932885 (0.0011) [2024-06-15 23:19:57,578][1653645] Updated weights for policy 0, policy_version 932934 (0.0012) [2024-06-15 23:19:58,240][1653645] Updated weights for policy 0, policy_version 932979 (0.0010) [2024-06-15 23:19:59,074][1653645] Updated weights for policy 0, policy_version 933048 (0.0085) [2024-06-15 23:20:00,958][1648982] Fps is (10 sec: 78643.8, 60 sec: 79189.6, 300 sec: 78643.2). Total num frames: 1910898688. Throughput: 0: 19672.2. Samples: 477804544. 
Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:20:00,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:20:01,252][1653645] Updated weights for policy 0, policy_version 933088 (0.0010) [2024-06-15 23:20:02,769][1653645] Updated weights for policy 0, policy_version 933152 (0.0010) [2024-06-15 23:20:03,233][1653645] Updated weights for policy 0, policy_version 933184 (0.0009) [2024-06-15 23:20:04,741][1653645] Updated weights for policy 0, policy_version 933232 (0.0012) [2024-06-15 23:20:05,644][1653645] Updated weights for policy 0, policy_version 933296 (0.0013) [2024-06-15 23:20:05,958][1648982] Fps is (10 sec: 88476.1, 60 sec: 80827.7, 300 sec: 78643.2). Total num frames: 1911422976. Throughput: 0: 19842.9. Samples: 477858816. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:20:05,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:20:07,225][1651596] Signal inference workers to stop experience collection... (48500 times) [2024-06-15 23:20:07,253][1653645] InferenceWorker_p0-w0: stopping experience collection (48500 times) [2024-06-15 23:20:07,359][1651596] Signal inference workers to resume experience collection... (48500 times) [2024-06-15 23:20:07,360][1653645] InferenceWorker_p0-w0: resuming experience collection (48500 times) [2024-06-15 23:20:07,696][1653645] Updated weights for policy 0, policy_version 933346 (0.0011) [2024-06-15 23:20:09,277][1653645] Updated weights for policy 0, policy_version 933397 (0.0012) [2024-06-15 23:20:10,642][1653645] Updated weights for policy 0, policy_version 933460 (0.0019) [2024-06-15 23:20:10,958][1648982] Fps is (10 sec: 88472.3, 60 sec: 78097.1, 300 sec: 78421.1). Total num frames: 1911783424. Throughput: 0: 19979.3. Samples: 477988864. 
Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:20:10,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:20:11,299][1653645] Updated weights for policy 0, policy_version 933505 (0.0012) [2024-06-15 23:20:11,900][1653645] Updated weights for policy 0, policy_version 933554 (0.0024) [2024-06-15 23:20:13,606][1653645] Updated weights for policy 0, policy_version 933600 (0.0010) [2024-06-15 23:20:15,958][1648982] Fps is (10 sec: 65536.4, 60 sec: 78643.2, 300 sec: 78644.6). Total num frames: 1912078336. Throughput: 0: 20025.0. Samples: 478111744. Policy #0 lag: (min: 68.0, avg: 144.0, max: 324.0) [2024-06-15 23:20:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:20:16,337][1653645] Updated weights for policy 0, policy_version 933666 (0.0011) [2024-06-15 23:20:17,068][1653645] Updated weights for policy 0, policy_version 933720 (0.0010) [2024-06-15 23:20:17,596][1653645] Updated weights for policy 0, policy_version 933760 (0.0015) [2024-06-15 23:20:18,618][1653645] Updated weights for policy 0, policy_version 933814 (0.0010) [2024-06-15 23:20:20,493][1653645] Updated weights for policy 0, policy_version 933863 (0.0010) [2024-06-15 23:20:20,958][1648982] Fps is (10 sec: 81921.3, 60 sec: 80827.8, 300 sec: 79088.3). Total num frames: 1912602624. Throughput: 0: 19877.0. Samples: 478164992. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:20:20,958][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 23:20:22,175][1653645] Updated weights for policy 0, policy_version 933904 (0.0011) [2024-06-15 23:20:23,283][1653645] Updated weights for policy 0, policy_version 933984 (0.0011) [2024-06-15 23:20:24,919][1653645] Updated weights for policy 0, policy_version 934048 (0.0011) [2024-06-15 23:20:25,958][1648982] Fps is (10 sec: 91748.6, 60 sec: 80827.5, 300 sec: 78643.2). Total num frames: 1912995840. Throughput: 0: 20047.6. Samples: 478287360. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:20:25,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:20:26,613][1653645] Updated weights for policy 0, policy_version 934100 (0.0013) [2024-06-15 23:20:28,775][1653645] Updated weights for policy 0, policy_version 934152 (0.0010) [2024-06-15 23:20:29,488][1653645] Updated weights for policy 0, policy_version 934208 (0.0011) [2024-06-15 23:20:30,280][1653645] Updated weights for policy 0, policy_version 934270 (0.0009) [2024-06-15 23:20:30,958][1648982] Fps is (10 sec: 78642.2, 60 sec: 79735.5, 300 sec: 78754.3). Total num frames: 1913389056. Throughput: 0: 20172.8. Samples: 478410752. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:20:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:20:31,325][1651596] Signal inference workers to stop experience collection... (48550 times) [2024-06-15 23:20:31,358][1653645] InferenceWorker_p0-w0: stopping experience collection (48550 times) [2024-06-15 23:20:31,480][1651596] Signal inference workers to resume experience collection... (48550 times) [2024-06-15 23:20:31,481][1653645] InferenceWorker_p0-w0: resuming experience collection (48550 times) [2024-06-15 23:20:31,591][1653645] Updated weights for policy 0, policy_version 934330 (0.0014) [2024-06-15 23:20:34,100][1653645] Updated weights for policy 0, policy_version 934393 (0.0010) [2024-06-15 23:20:35,591][1653645] Updated weights for policy 0, policy_version 934448 (0.0011) [2024-06-15 23:20:35,958][1648982] Fps is (10 sec: 78643.3, 60 sec: 79735.2, 300 sec: 78976.4). Total num frames: 1913782272. Throughput: 0: 20184.1. Samples: 478469632. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:20:35,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:20:36,180][1653645] Updated weights for policy 0, policy_version 934489 (0.0011) [2024-06-15 23:20:37,147][1653645] Updated weights for policy 0, policy_version 934531 (0.0010) [2024-06-15 23:20:37,875][1653645] Updated weights for policy 0, policy_version 934591 (0.0010) [2024-06-15 23:20:40,741][1653645] Updated weights for policy 0, policy_version 934640 (0.0011) [2024-06-15 23:20:40,958][1648982] Fps is (10 sec: 78642.1, 60 sec: 80827.5, 300 sec: 79087.5). Total num frames: 1914175488. Throughput: 0: 20252.5. Samples: 478591488. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:20:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:20:42,253][1653645] Updated weights for policy 0, policy_version 934690 (0.0013) [2024-06-15 23:20:43,071][1653645] Updated weights for policy 0, policy_version 934752 (0.0011) [2024-06-15 23:20:44,246][1653645] Updated weights for policy 0, policy_version 934800 (0.0014) [2024-06-15 23:20:44,866][1653645] Updated weights for policy 0, policy_version 934848 (0.0013) [2024-06-15 23:20:45,958][1648982] Fps is (10 sec: 78644.0, 60 sec: 80281.7, 300 sec: 78643.2). Total num frames: 1914568704. Throughput: 0: 20184.1. Samples: 478712832. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:20:45,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:20:47,197][1653645] Updated weights for policy 0, policy_version 934906 (0.0012) [2024-06-15 23:20:48,630][1653645] Updated weights for policy 0, policy_version 934944 (0.0013) [2024-06-15 23:20:49,581][1653645] Updated weights for policy 0, policy_version 935010 (0.0011) [2024-06-15 23:20:50,958][1648982] Fps is (10 sec: 85199.1, 60 sec: 81920.1, 300 sec: 78976.5). Total num frames: 1915027456. Throughput: 0: 20320.7. Samples: 478773248. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:20:50,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:20:50,959][1653645] Updated weights for policy 0, policy_version 935072 (0.0010) [2024-06-15 23:20:52,808][1653645] Updated weights for policy 0, policy_version 935111 (0.0013) [2024-06-15 23:20:54,856][1653645] Updated weights for policy 0, policy_version 935170 (0.0013) [2024-06-15 23:20:55,697][1653645] Updated weights for policy 0, policy_version 935232 (0.0010) [2024-06-15 23:20:55,958][1648982] Fps is (10 sec: 81919.8, 60 sec: 80828.1, 300 sec: 79309.7). Total num frames: 1915387904. Throughput: 0: 20150.1. Samples: 478895616. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:20:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:20:56,128][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000935264_1915420672.pth... [2024-06-15 23:20:56,221][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000925920_1896284160.pth [2024-06-15 23:20:56,546][1653645] Updated weights for policy 0, policy_version 935296 (0.0010) [2024-06-15 23:20:56,975][1651596] Signal inference workers to stop experience collection... (48600 times) [2024-06-15 23:20:57,013][1653645] InferenceWorker_p0-w0: stopping experience collection (48600 times) [2024-06-15 23:20:57,125][1651596] Signal inference workers to resume experience collection... (48600 times) [2024-06-15 23:20:57,125][1653645] InferenceWorker_p0-w0: resuming experience collection (48600 times) [2024-06-15 23:20:57,671][1653645] Updated weights for policy 0, policy_version 935352 (0.0077) [2024-06-15 23:20:59,790][1653645] Updated weights for policy 0, policy_version 935393 (0.0009) [2024-06-15 23:21:00,958][1648982] Fps is (10 sec: 72088.9, 60 sec: 80827.6, 300 sec: 79087.5). Total num frames: 1915748352. Throughput: 0: 19968.0. Samples: 479010304. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:00,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:21:01,686][1653645] Updated weights for policy 0, policy_version 935456 (0.0011) [2024-06-15 23:21:02,893][1653645] Updated weights for policy 0, policy_version 935507 (0.0011) [2024-06-15 23:21:03,923][1653645] Updated weights for policy 0, policy_version 935584 (0.0083) [2024-06-15 23:21:05,958][1648982] Fps is (10 sec: 78643.7, 60 sec: 79189.4, 300 sec: 78754.3). Total num frames: 1916174336. Throughput: 0: 20115.9. Samples: 479070208. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:05,958][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 23:21:06,329][1653645] Updated weights for policy 0, policy_version 935655 (0.0075) [2024-06-15 23:21:07,900][1653645] Updated weights for policy 0, policy_version 935696 (0.0011) [2024-06-15 23:21:09,743][1653645] Updated weights for policy 0, policy_version 935748 (0.0100) [2024-06-15 23:21:10,680][1653645] Updated weights for policy 0, policy_version 935815 (0.0091) [2024-06-15 23:21:10,958][1648982] Fps is (10 sec: 81918.6, 60 sec: 79735.3, 300 sec: 78976.4). Total num frames: 1916567552. Throughput: 0: 20093.1. Samples: 479191552. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:21:11,338][1653645] Updated weights for policy 0, policy_version 935870 (0.0011) [2024-06-15 23:21:14,165][1653645] Updated weights for policy 0, policy_version 935938 (0.0011) [2024-06-15 23:21:14,765][1653645] Updated weights for policy 0, policy_version 935991 (0.0011) [2024-06-15 23:21:15,957][1648982] Fps is (10 sec: 75367.3, 60 sec: 80827.9, 300 sec: 79198.7). Total num frames: 1916928000. Throughput: 0: 20070.5. Samples: 479313920. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:15,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:21:16,432][1653645] Updated weights for policy 0, policy_version 936032 (0.0009) [2024-06-15 23:21:17,304][1653645] Updated weights for policy 0, policy_version 936096 (0.0011) [2024-06-15 23:21:18,848][1653645] Updated weights for policy 0, policy_version 936145 (0.0010) [2024-06-15 23:21:19,372][1653645] Updated weights for policy 0, policy_version 936191 (0.0011) [2024-06-15 23:21:20,806][1653645] Updated weights for policy 0, policy_version 936248 (0.0011) [2024-06-15 23:21:20,958][1648982] Fps is (10 sec: 88475.5, 60 sec: 80827.7, 300 sec: 79531.9). Total num frames: 1917452288. Throughput: 0: 20172.9. Samples: 479377408. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:21:22,315][1651596] Signal inference workers to stop experience collection... (48650 times) [2024-06-15 23:21:22,335][1653645] Updated weights for policy 0, policy_version 936290 (0.0011) [2024-06-15 23:21:22,347][1653645] InferenceWorker_p0-w0: stopping experience collection (48650 times) [2024-06-15 23:21:22,458][1651596] Signal inference workers to resume experience collection... (48650 times) [2024-06-15 23:21:22,459][1653645] InferenceWorker_p0-w0: resuming experience collection (48650 times) [2024-06-15 23:21:22,984][1653645] Updated weights for policy 0, policy_version 936339 (0.0010) [2024-06-15 23:21:23,431][1653645] Updated weights for policy 0, policy_version 936383 (0.0010) [2024-06-15 23:21:24,849][1653645] Updated weights for policy 0, policy_version 936432 (0.0009) [2024-06-15 23:21:25,958][1648982] Fps is (10 sec: 91748.8, 60 sec: 80827.9, 300 sec: 79087.5). Total num frames: 1917845504. Throughput: 0: 20537.0. Samples: 479515648. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:25,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:21:26,102][1653645] Updated weights for policy 0, policy_version 936469 (0.0010) [2024-06-15 23:21:26,522][1653645] Updated weights for policy 0, policy_version 936506 (0.0010) [2024-06-15 23:21:27,568][1653645] Updated weights for policy 0, policy_version 936532 (0.0011) [2024-06-15 23:21:28,317][1653645] Updated weights for policy 0, policy_version 936584 (0.0011) [2024-06-15 23:21:28,898][1653645] Updated weights for policy 0, policy_version 936632 (0.0012) [2024-06-15 23:21:30,349][1653645] Updated weights for policy 0, policy_version 936672 (0.0012) [2024-06-15 23:21:30,958][1648982] Fps is (10 sec: 91749.7, 60 sec: 83012.3, 300 sec: 79754.0). Total num frames: 1918369792. Throughput: 0: 20992.0. Samples: 479657472. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:21:31,645][1653645] Updated weights for policy 0, policy_version 936720 (0.0013) [2024-06-15 23:21:32,218][1653645] Updated weights for policy 0, policy_version 936766 (0.0011) [2024-06-15 23:21:33,723][1653645] Updated weights for policy 0, policy_version 936819 (0.0014) [2024-06-15 23:21:34,377][1653645] Updated weights for policy 0, policy_version 936867 (0.0012) [2024-06-15 23:21:35,669][1653645] Updated weights for policy 0, policy_version 936919 (0.0011) [2024-06-15 23:21:35,958][1648982] Fps is (10 sec: 101580.5, 60 sec: 84650.8, 300 sec: 80753.7). Total num frames: 1918861312. Throughput: 0: 21196.7. Samples: 479727104. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:35,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:21:37,390][1653645] Updated weights for policy 0, policy_version 936976 (0.0011) [2024-06-15 23:21:38,656][1653645] Updated weights for policy 0, policy_version 937025 (0.0013) [2024-06-15 23:21:39,299][1653645] Updated weights for policy 0, policy_version 937074 (0.0010) [2024-06-15 23:21:40,032][1653645] Updated weights for policy 0, policy_version 937125 (0.0011) [2024-06-15 23:21:40,958][1648982] Fps is (10 sec: 91750.9, 60 sec: 85197.1, 300 sec: 80420.5). Total num frames: 1919287296. Throughput: 0: 21504.0. Samples: 479863296. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:40,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 23:21:41,076][1653645] Updated weights for policy 0, policy_version 937156 (0.0009) [2024-06-15 23:21:41,726][1653645] Updated weights for policy 0, policy_version 937215 (0.0011) [2024-06-15 23:21:43,428][1653645] Updated weights for policy 0, policy_version 937265 (0.0012) [2024-06-15 23:21:44,258][1651596] Signal inference workers to stop experience collection... (48700 times) [2024-06-15 23:21:44,281][1653645] InferenceWorker_p0-w0: stopping experience collection (48700 times) [2024-06-15 23:21:44,376][1651596] Signal inference workers to resume experience collection... (48700 times) [2024-06-15 23:21:44,377][1653645] InferenceWorker_p0-w0: resuming experience collection (48700 times) [2024-06-15 23:21:44,477][1653645] Updated weights for policy 0, policy_version 937300 (0.0009) [2024-06-15 23:21:45,187][1653645] Updated weights for policy 0, policy_version 937360 (0.0012) [2024-06-15 23:21:45,958][1648982] Fps is (10 sec: 95026.2, 60 sec: 87381.1, 300 sec: 80753.7). Total num frames: 1919811584. Throughput: 0: 22015.9. Samples: 480001024. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:45,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:21:47,015][1653645] Updated weights for policy 0, policy_version 937409 (0.0010) [2024-06-15 23:21:48,652][1653645] Updated weights for policy 0, policy_version 937473 (0.0012) [2024-06-15 23:21:50,062][1653645] Updated weights for policy 0, policy_version 937539 (0.0011) [2024-06-15 23:21:50,776][1653645] Updated weights for policy 0, policy_version 937600 (0.0010) [2024-06-15 23:21:50,958][1648982] Fps is (10 sec: 91750.8, 60 sec: 86289.0, 300 sec: 81198.1). Total num frames: 1920204800. Throughput: 0: 22232.2. Samples: 480070656. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:50,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:21:51,636][1653645] Updated weights for policy 0, policy_version 937661 (0.0010) [2024-06-15 23:21:53,528][1653645] Updated weights for policy 0, policy_version 937722 (0.0011) [2024-06-15 23:21:54,884][1653645] Updated weights for policy 0, policy_version 937766 (0.0073) [2024-06-15 23:21:55,958][1648982] Fps is (10 sec: 78644.4, 60 sec: 86835.2, 300 sec: 81309.1). Total num frames: 1920598016. Throughput: 0: 22585.0. Samples: 480207872. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:21:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:21:56,193][1653645] Updated weights for policy 0, policy_version 937816 (0.0011) [2024-06-15 23:21:57,153][1653645] Updated weights for policy 0, policy_version 937888 (0.0011) [2024-06-15 23:21:58,511][1653645] Updated weights for policy 0, policy_version 937926 (0.0010) [2024-06-15 23:21:59,167][1653645] Updated weights for policy 0, policy_version 937984 (0.0010) [2024-06-15 23:22:00,958][1648982] Fps is (10 sec: 85196.1, 60 sec: 88473.6, 300 sec: 81086.9). Total num frames: 1921056768. Throughput: 0: 22994.4. Samples: 480348672. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:22:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:22:01,111][1653645] Updated weights for policy 0, policy_version 938041 (0.0011) [2024-06-15 23:22:02,090][1653645] Updated weights for policy 0, policy_version 938112 (0.0011) [2024-06-15 23:22:02,794][1653645] Updated weights for policy 0, policy_version 938167 (0.0009) [2024-06-15 23:22:04,581][1653645] Updated weights for policy 0, policy_version 938210 (0.0009) [2024-06-15 23:22:05,958][1648982] Fps is (10 sec: 91750.8, 60 sec: 89019.7, 300 sec: 81198.0). Total num frames: 1921515520. Throughput: 0: 22971.7. Samples: 480411136. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:22:05,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:22:06,193][1651596] Signal inference workers to stop experience collection... (48750 times) [2024-06-15 23:22:06,229][1653645] Updated weights for policy 0, policy_version 938260 (0.0010) [2024-06-15 23:22:06,257][1653645] InferenceWorker_p0-w0: stopping experience collection (48750 times) [2024-06-15 23:22:06,324][1651596] Signal inference workers to resume experience collection... (48750 times) [2024-06-15 23:22:06,325][1653645] InferenceWorker_p0-w0: resuming experience collection (48750 times) [2024-06-15 23:22:06,708][1653645] Updated weights for policy 0, policy_version 938304 (0.0008) [2024-06-15 23:22:07,774][1653645] Updated weights for policy 0, policy_version 938372 (0.0010) [2024-06-15 23:22:08,445][1653645] Updated weights for policy 0, policy_version 938425 (0.0013) [2024-06-15 23:22:10,017][1653645] Updated weights for policy 0, policy_version 938470 (0.0009) [2024-06-15 23:22:10,958][1648982] Fps is (10 sec: 98302.9, 60 sec: 91204.3, 300 sec: 82086.6). Total num frames: 1922039808. Throughput: 0: 23131.0. Samples: 480556544. 
Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:22:10,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:22:11,921][1653645] Updated weights for policy 0, policy_version 938516 (0.0011) [2024-06-15 23:22:13,032][1653645] Updated weights for policy 0, policy_version 938595 (0.0012) [2024-06-15 23:22:13,876][1653645] Updated weights for policy 0, policy_version 938662 (0.0010) [2024-06-15 23:22:15,919][1653645] Updated weights for policy 0, policy_version 938726 (0.0011) [2024-06-15 23:22:15,958][1648982] Fps is (10 sec: 98303.5, 60 sec: 92842.4, 300 sec: 82419.8). Total num frames: 1922498560. Throughput: 0: 22937.6. Samples: 480689664. Policy #0 lag: (min: 14.0, avg: 135.1, max: 270.0) [2024-06-15 23:22:15,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:22:17,975][1653645] Updated weights for policy 0, policy_version 938770 (0.0011) [2024-06-15 23:22:18,707][1653645] Updated weights for policy 0, policy_version 938832 (0.0014) [2024-06-15 23:22:19,557][1653645] Updated weights for policy 0, policy_version 938896 (0.0009) [2024-06-15 23:22:20,096][1653645] Updated weights for policy 0, policy_version 938940 (0.0010) [2024-06-15 23:22:20,958][1648982] Fps is (10 sec: 91752.5, 60 sec: 91750.5, 300 sec: 82197.8). Total num frames: 1922957312. Throughput: 0: 22994.6. Samples: 480761856. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:22:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:22:22,053][1653645] Updated weights for policy 0, policy_version 939004 (0.0012) [2024-06-15 23:22:24,257][1653645] Updated weights for policy 0, policy_version 939060 (0.0011) [2024-06-15 23:22:24,876][1653645] Updated weights for policy 0, policy_version 939108 (0.0010) [2024-06-15 23:22:25,686][1651596] Signal inference workers to stop experience collection... 
(48800 times) [2024-06-15 23:22:25,731][1653645] InferenceWorker_p0-w0: stopping experience collection (48800 times) [2024-06-15 23:22:25,732][1653645] Updated weights for policy 0, policy_version 939173 (0.0008) [2024-06-15 23:22:25,808][1651596] Signal inference workers to resume experience collection... (48800 times) [2024-06-15 23:22:25,808][1653645] InferenceWorker_p0-w0: resuming experience collection (48800 times) [2024-06-15 23:22:25,958][1648982] Fps is (10 sec: 95026.1, 60 sec: 93388.6, 300 sec: 82641.9). Total num frames: 1923448832. Throughput: 0: 22914.8. Samples: 480894464. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:22:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:22:27,563][1653645] Updated weights for policy 0, policy_version 939220 (0.0010) [2024-06-15 23:22:28,024][1653645] Updated weights for policy 0, policy_version 939261 (0.0013) [2024-06-15 23:22:29,906][1653645] Updated weights for policy 0, policy_version 939315 (0.0011) [2024-06-15 23:22:30,678][1653645] Updated weights for policy 0, policy_version 939376 (0.0011) [2024-06-15 23:22:30,959][1648982] Fps is (10 sec: 91732.3, 60 sec: 91747.6, 300 sec: 83307.9). Total num frames: 1923874816. Throughput: 0: 22879.8. Samples: 481030656. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:22:30,960][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:22:31,451][1653645] Updated weights for policy 0, policy_version 939425 (0.0010) [2024-06-15 23:22:33,326][1653645] Updated weights for policy 0, policy_version 939488 (0.0011) [2024-06-15 23:22:35,035][1653645] Updated weights for policy 0, policy_version 939522 (0.0011) [2024-06-15 23:22:35,911][1653645] Updated weights for policy 0, policy_version 939587 (0.0011) [2024-06-15 23:22:35,958][1648982] Fps is (10 sec: 81921.0, 60 sec: 90112.1, 300 sec: 83086.3). Total num frames: 1924268032. Throughput: 0: 22812.4. Samples: 481097216. 
Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:22:35,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:22:36,526][1653645] Updated weights for policy 0, policy_version 939634 (0.0008) [2024-06-15 23:22:37,352][1653645] Updated weights for policy 0, policy_version 939708 (0.0010) [2024-06-15 23:22:39,255][1653645] Updated weights for policy 0, policy_version 939747 (0.0011) [2024-06-15 23:22:40,866][1653645] Updated weights for policy 0, policy_version 939792 (0.0014) [2024-06-15 23:22:40,960][1648982] Fps is (10 sec: 81934.9, 60 sec: 90111.9, 300 sec: 82753.1). Total num frames: 1924694016. Throughput: 0: 22789.7. Samples: 481233408. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:22:40,961][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:22:41,707][1653645] Updated weights for policy 0, policy_version 939856 (0.0011) [2024-06-15 23:22:42,445][1653645] Updated weights for policy 0, policy_version 939911 (0.0010) [2024-06-15 23:22:44,572][1653645] Updated weights for policy 0, policy_version 939970 (0.0012) [2024-06-15 23:22:45,217][1653645] Updated weights for policy 0, policy_version 940026 (0.0014) [2024-06-15 23:22:45,958][1648982] Fps is (10 sec: 91750.8, 60 sec: 89566.2, 300 sec: 83419.6). Total num frames: 1925185536. Throughput: 0: 22664.6. Samples: 481368576. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:22:45,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:22:46,853][1653645] Updated weights for policy 0, policy_version 940066 (0.0010) [2024-06-15 23:22:47,264][1651596] Signal inference workers to stop experience collection... (48850 times) [2024-06-15 23:22:47,307][1653645] InferenceWorker_p0-w0: stopping experience collection (48850 times) [2024-06-15 23:22:47,413][1651596] Signal inference workers to resume experience collection... 
(48850 times) [2024-06-15 23:22:47,414][1653645] InferenceWorker_p0-w0: resuming experience collection (48850 times) [2024-06-15 23:22:47,517][1653645] Updated weights for policy 0, policy_version 940115 (0.0009) [2024-06-15 23:22:48,285][1653645] Updated weights for policy 0, policy_version 940176 (0.0010) [2024-06-15 23:22:50,306][1653645] Updated weights for policy 0, policy_version 940226 (0.0011) [2024-06-15 23:22:50,957][1653645] Updated weights for policy 0, policy_version 940283 (0.0011) [2024-06-15 23:22:50,958][1648982] Fps is (10 sec: 98305.5, 60 sec: 91204.3, 300 sec: 83863.9). Total num frames: 1925677056. Throughput: 0: 22732.8. Samples: 481434112. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:22:50,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:22:52,697][1653645] Updated weights for policy 0, policy_version 940341 (0.0011) [2024-06-15 23:22:53,625][1653645] Updated weights for policy 0, policy_version 940386 (0.0011) [2024-06-15 23:22:54,300][1653645] Updated weights for policy 0, policy_version 940435 (0.0011) [2024-06-15 23:22:54,738][1653645] Updated weights for policy 0, policy_version 940476 (0.0011) [2024-06-15 23:22:55,958][1648982] Fps is (10 sec: 91749.6, 60 sec: 91750.3, 300 sec: 83975.0). Total num frames: 1926103040. Throughput: 0: 22573.6. Samples: 481572352. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:22:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:22:56,197][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000940512_1926168576.pth... 
[2024-06-15 23:22:56,309][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000930576_1905819648.pth [2024-06-15 23:22:56,631][1653645] Updated weights for policy 0, policy_version 940544 (0.0010) [2024-06-15 23:22:58,577][1653645] Updated weights for policy 0, policy_version 940604 (0.0011) [2024-06-15 23:22:59,370][1653645] Updated weights for policy 0, policy_version 940643 (0.0012) [2024-06-15 23:23:00,228][1653645] Updated weights for policy 0, policy_version 940705 (0.0010) [2024-06-15 23:23:00,958][1648982] Fps is (10 sec: 95026.7, 60 sec: 92842.8, 300 sec: 83974.9). Total num frames: 1926627328. Throughput: 0: 22539.4. Samples: 481703936. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:23:01,889][1653645] Updated weights for policy 0, policy_version 940756 (0.0016) [2024-06-15 23:23:03,829][1653645] Updated weights for policy 0, policy_version 940805 (0.0010) [2024-06-15 23:23:04,632][1653645] Updated weights for policy 0, policy_version 940865 (0.0011) [2024-06-15 23:23:05,280][1653645] Updated weights for policy 0, policy_version 940921 (0.0011) [2024-06-15 23:23:05,958][1648982] Fps is (10 sec: 98304.6, 60 sec: 92842.6, 300 sec: 84308.2). Total num frames: 1927086080. Throughput: 0: 22675.9. Samples: 481782272. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:23:06,216][1653645] Updated weights for policy 0, policy_version 940986 (0.0010) [2024-06-15 23:23:07,701][1653645] Updated weights for policy 0, policy_version 941043 (0.0012) [2024-06-15 23:23:09,483][1651596] Signal inference workers to stop experience collection... 
(48900 times) [2024-06-15 23:23:09,521][1653645] InferenceWorker_p0-w0: stopping experience collection (48900 times) [2024-06-15 23:23:09,539][1653645] Updated weights for policy 0, policy_version 941077 (0.0020) [2024-06-15 23:23:09,620][1651596] Signal inference workers to resume experience collection... (48900 times) [2024-06-15 23:23:09,620][1653645] InferenceWorker_p0-w0: resuming experience collection (48900 times) [2024-06-15 23:23:10,622][1653645] Updated weights for policy 0, policy_version 941152 (0.0010) [2024-06-15 23:23:10,958][1648982] Fps is (10 sec: 88473.5, 60 sec: 91204.5, 300 sec: 84752.5). Total num frames: 1927512064. Throughput: 0: 22914.9. Samples: 481925632. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:10,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:23:11,412][1653645] Updated weights for policy 0, policy_version 941200 (0.0012) [2024-06-15 23:23:12,914][1653645] Updated weights for policy 0, policy_version 941250 (0.0012) [2024-06-15 23:23:13,581][1653645] Updated weights for policy 0, policy_version 941310 (0.0009) [2024-06-15 23:23:15,499][1653645] Updated weights for policy 0, policy_version 941348 (0.0010) [2024-06-15 23:23:15,958][1648982] Fps is (10 sec: 85194.8, 60 sec: 90657.8, 300 sec: 84641.3). Total num frames: 1927938048. Throughput: 0: 22824.7. Samples: 482057728. 
Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:23:16,212][1653645] Updated weights for policy 0, policy_version 941377 (0.0010) [2024-06-15 23:23:17,073][1653645] Updated weights for policy 0, policy_version 941443 (0.0011) [2024-06-15 23:23:17,723][1653645] Updated weights for policy 0, policy_version 941499 (0.0010) [2024-06-15 23:23:18,986][1653645] Updated weights for policy 0, policy_version 941542 (0.0011) [2024-06-15 23:23:20,954][1653645] Updated weights for policy 0, policy_version 941574 (0.0010) [2024-06-15 23:23:20,958][1648982] Fps is (10 sec: 81920.3, 60 sec: 89565.8, 300 sec: 84419.3). Total num frames: 1928331264. Throughput: 0: 22664.6. Samples: 482117120. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:20,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:23:21,563][1653645] Updated weights for policy 0, policy_version 941631 (0.0010) [2024-06-15 23:23:22,997][1653645] Updated weights for policy 0, policy_version 941704 (0.0011) [2024-06-15 23:23:23,610][1653645] Updated weights for policy 0, policy_version 941756 (0.0013) [2024-06-15 23:23:25,036][1653645] Updated weights for policy 0, policy_version 941824 (0.0014) [2024-06-15 23:23:25,958][1648982] Fps is (10 sec: 91752.7, 60 sec: 90112.2, 300 sec: 84974.6). Total num frames: 1928855552. Throughput: 0: 22721.5. Samples: 482255872. 
Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:25,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:23:27,588][1653645] Updated weights for policy 0, policy_version 941888 (0.0013) [2024-06-15 23:23:28,780][1653645] Updated weights for policy 0, policy_version 941952 (0.0011) [2024-06-15 23:23:29,520][1653645] Updated weights for policy 0, policy_version 942011 (0.0011) [2024-06-15 23:23:30,812][1653645] Updated weights for policy 0, policy_version 942072 (0.0016) [2024-06-15 23:23:30,958][1648982] Fps is (10 sec: 104857.4, 60 sec: 91753.3, 300 sec: 85418.9). Total num frames: 1929379840. Throughput: 0: 22619.0. Samples: 482386432. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:30,960][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:23:32,722][1651596] Signal inference workers to stop experience collection... (48950 times) [2024-06-15 23:23:32,756][1653645] InferenceWorker_p0-w0: stopping experience collection (48950 times) [2024-06-15 23:23:32,855][1651596] Signal inference workers to resume experience collection... (48950 times) [2024-06-15 23:23:32,855][1653645] InferenceWorker_p0-w0: resuming experience collection (48950 times) [2024-06-15 23:23:33,062][1653645] Updated weights for policy 0, policy_version 942112 (0.0028) [2024-06-15 23:23:34,182][1653645] Updated weights for policy 0, policy_version 942163 (0.0031) [2024-06-15 23:23:35,099][1653645] Updated weights for policy 0, policy_version 942229 (0.0011) [2024-06-15 23:23:35,958][1648982] Fps is (10 sec: 91749.7, 60 sec: 91750.3, 300 sec: 85307.9). Total num frames: 1929773056. Throughput: 0: 22755.5. Samples: 482458112. 
Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:23:36,402][1653645] Updated weights for policy 0, policy_version 942292 (0.0010) [2024-06-15 23:23:38,818][1653645] Updated weights for policy 0, policy_version 942352 (0.0011) [2024-06-15 23:23:40,196][1653645] Updated weights for policy 0, policy_version 942418 (0.0012) [2024-06-15 23:23:40,812][1653645] Updated weights for policy 0, policy_version 942466 (0.0009) [2024-06-15 23:23:40,958][1648982] Fps is (10 sec: 81919.2, 60 sec: 91750.4, 300 sec: 85418.9). Total num frames: 1930199040. Throughput: 0: 22596.3. Samples: 482589184. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:23:41,466][1653645] Updated weights for policy 0, policy_version 942522 (0.0009) [2024-06-15 23:23:42,371][1653645] Updated weights for policy 0, policy_version 942567 (0.0011) [2024-06-15 23:23:44,896][1653645] Updated weights for policy 0, policy_version 942624 (0.0014) [2024-06-15 23:23:45,960][1648982] Fps is (10 sec: 81919.6, 60 sec: 90111.8, 300 sec: 85196.8). Total num frames: 1930592256. Throughput: 0: 22710.0. Samples: 482725888. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:45,961][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:23:46,054][1653645] Updated weights for policy 0, policy_version 942688 (0.0068) [2024-06-15 23:23:46,781][1653645] Updated weights for policy 0, policy_version 942741 (0.0012) [2024-06-15 23:23:47,826][1653645] Updated weights for policy 0, policy_version 942800 (0.0012) [2024-06-15 23:23:50,296][1653645] Updated weights for policy 0, policy_version 942849 (0.0010) [2024-06-15 23:23:50,962][1648982] Fps is (10 sec: 85158.8, 60 sec: 89559.0, 300 sec: 85750.9). Total num frames: 1931051008. Throughput: 0: 22309.6. Samples: 482786304. 
Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:50,963][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:23:50,998][1653645] Updated weights for policy 0, policy_version 942904 (0.0068) [2024-06-15 23:23:51,597][1653645] Updated weights for policy 0, policy_version 942948 (0.0012) [2024-06-15 23:23:52,237][1653645] Updated weights for policy 0, policy_version 942997 (0.0012) [2024-06-15 23:23:53,240][1651596] Signal inference workers to stop experience collection... (49000 times) [2024-06-15 23:23:53,286][1653645] InferenceWorker_p0-w0: stopping experience collection (49000 times) [2024-06-15 23:23:53,299][1653645] Updated weights for policy 0, policy_version 943043 (0.0011) [2024-06-15 23:23:53,396][1651596] Signal inference workers to resume experience collection... (49000 times) [2024-06-15 23:23:53,397][1653645] InferenceWorker_p0-w0: resuming experience collection (49000 times) [2024-06-15 23:23:53,932][1653645] Updated weights for policy 0, policy_version 943101 (0.0079) [2024-06-15 23:23:55,958][1648982] Fps is (10 sec: 88473.7, 60 sec: 89565.8, 300 sec: 85863.3). Total num frames: 1931476992. Throughput: 0: 22311.8. Samples: 482929664. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:23:55,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 23:23:56,516][1653645] Updated weights for policy 0, policy_version 943143 (0.0011) [2024-06-15 23:23:57,180][1653645] Updated weights for policy 0, policy_version 943200 (0.0011) [2024-06-15 23:23:57,993][1653645] Updated weights for policy 0, policy_version 943253 (0.0010) [2024-06-15 23:23:59,021][1653645] Updated weights for policy 0, policy_version 943298 (0.0012) [2024-06-15 23:23:59,706][1653645] Updated weights for policy 0, policy_version 943360 (0.0075) [2024-06-15 23:24:00,957][1648982] Fps is (10 sec: 95070.9, 60 sec: 89565.9, 300 sec: 86196.5). Total num frames: 1932001280. Throughput: 0: 22448.5. Samples: 483067904. 
Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:24:00,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:24:02,579][1653645] Updated weights for policy 0, policy_version 943426 (0.0012) [2024-06-15 23:24:03,335][1653645] Updated weights for policy 0, policy_version 943488 (0.0012) [2024-06-15 23:24:04,062][1653645] Updated weights for policy 0, policy_version 943544 (0.0012) [2024-06-15 23:24:05,348][1653645] Updated weights for policy 0, policy_version 943585 (0.0012) [2024-06-15 23:24:05,958][1648982] Fps is (10 sec: 104856.5, 60 sec: 90657.8, 300 sec: 86196.5). Total num frames: 1932525568. Throughput: 0: 22641.6. Samples: 483136000. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:24:05,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:24:07,795][1653645] Updated weights for policy 0, policy_version 943632 (0.0009) [2024-06-15 23:24:08,345][1653645] Updated weights for policy 0, policy_version 943679 (0.0012) [2024-06-15 23:24:08,991][1653645] Updated weights for policy 0, policy_version 943716 (0.0011) [2024-06-15 23:24:09,531][1653645] Updated weights for policy 0, policy_version 943760 (0.0011) [2024-06-15 23:24:10,958][1648982] Fps is (10 sec: 98303.5, 60 sec: 91204.3, 300 sec: 86863.0). Total num frames: 1932984320. Throughput: 0: 22584.9. Samples: 483272192. Policy #0 lag: (min: 3.0, avg: 65.0, max: 211.0) [2024-06-15 23:24:10,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:24:10,992][1653645] Updated weights for policy 0, policy_version 943842 (0.0012) [2024-06-15 23:24:13,547][1653645] Updated weights for policy 0, policy_version 943889 (0.0010) [2024-06-15 23:24:13,988][1653645] Updated weights for policy 0, policy_version 943931 (0.0010) [2024-06-15 23:24:15,014][1653645] Updated weights for policy 0, policy_version 943984 (0.0011) [2024-06-15 23:24:15,302][1651596] Signal inference workers to stop experience collection... 
(49050 times) [2024-06-15 23:24:15,348][1653645] InferenceWorker_p0-w0: stopping experience collection (49050 times) [2024-06-15 23:24:15,461][1651596] Signal inference workers to resume experience collection... (49050 times) [2024-06-15 23:24:15,462][1653645] InferenceWorker_p0-w0: resuming experience collection (49050 times) [2024-06-15 23:24:15,958][1648982] Fps is (10 sec: 88475.5, 60 sec: 91204.6, 300 sec: 86974.1). Total num frames: 1933410304. Throughput: 0: 22687.3. Samples: 483407360. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:15,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 23:24:15,999][1653645] Updated weights for policy 0, policy_version 944059 (0.0014) [2024-06-15 23:24:16,793][1653645] Updated weights for policy 0, policy_version 944112 (0.0011) [2024-06-15 23:24:19,767][1653645] Updated weights for policy 0, policy_version 944176 (0.0102) [2024-06-15 23:24:20,958][1648982] Fps is (10 sec: 78643.0, 60 sec: 90658.1, 300 sec: 86863.0). Total num frames: 1933770752. Throughput: 0: 22562.2. Samples: 483473408. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:24:20,964][1653645] Updated weights for policy 0, policy_version 944230 (0.0011) [2024-06-15 23:24:21,795][1653645] Updated weights for policy 0, policy_version 944293 (0.0066) [2024-06-15 23:24:22,511][1653645] Updated weights for policy 0, policy_version 944340 (0.0011) [2024-06-15 23:24:25,054][1653645] Updated weights for policy 0, policy_version 944385 (0.0011) [2024-06-15 23:24:25,695][1653645] Updated weights for policy 0, policy_version 944445 (0.0011) [2024-06-15 23:24:25,958][1648982] Fps is (10 sec: 81917.9, 60 sec: 89565.5, 300 sec: 86862.9). Total num frames: 1934229504. Throughput: 0: 22596.2. Samples: 483606016. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:25,959][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:24:26,885][1653645] Updated weights for policy 0, policy_version 944496 (0.0011) [2024-06-15 23:24:27,606][1653645] Updated weights for policy 0, policy_version 944547 (0.0009) [2024-06-15 23:24:28,236][1653645] Updated weights for policy 0, policy_version 944598 (0.0012) [2024-06-15 23:24:30,958][1648982] Fps is (10 sec: 85196.9, 60 sec: 87381.3, 300 sec: 86862.9). Total num frames: 1934622720. Throughput: 0: 22676.0. Samples: 483746304. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:30,961][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:24:31,153][1653645] Updated weights for policy 0, policy_version 944656 (0.0010) [2024-06-15 23:24:31,699][1653645] Updated weights for policy 0, policy_version 944704 (0.0012) [2024-06-15 23:24:32,548][1653645] Updated weights for policy 0, policy_version 944756 (0.0010) [2024-06-15 23:24:33,383][1653645] Updated weights for policy 0, policy_version 944830 (0.0078) [2024-06-15 23:24:34,365][1653645] Updated weights for policy 0, policy_version 944880 (0.0011) [2024-06-15 23:24:35,957][1648982] Fps is (10 sec: 91754.1, 60 sec: 89566.2, 300 sec: 87529.5). Total num frames: 1935147008. Throughput: 0: 22644.1. Samples: 483805184. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:35,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 23:24:37,611][1653645] Updated weights for policy 0, policy_version 944955 (0.0010) [2024-06-15 23:24:38,061][1651596] Signal inference workers to stop experience collection... (49100 times) [2024-06-15 23:24:38,102][1653645] InferenceWorker_p0-w0: stopping experience collection (49100 times) [2024-06-15 23:24:38,209][1651596] Signal inference workers to resume experience collection... 
(49100 times) [2024-06-15 23:24:38,214][1653645] InferenceWorker_p0-w0: resuming experience collection (49100 times) [2024-06-15 23:24:38,340][1653645] Updated weights for policy 0, policy_version 944997 (0.0014) [2024-06-15 23:24:39,135][1653645] Updated weights for policy 0, policy_version 945060 (0.0062) [2024-06-15 23:24:40,191][1653645] Updated weights for policy 0, policy_version 945120 (0.0011) [2024-06-15 23:24:40,958][1648982] Fps is (10 sec: 104854.9, 60 sec: 91204.0, 300 sec: 87862.6). Total num frames: 1935671296. Throughput: 0: 22459.7. Samples: 483940352. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:40,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:24:42,867][1653645] Updated weights for policy 0, policy_version 945172 (0.0010) [2024-06-15 23:24:43,938][1653645] Updated weights for policy 0, policy_version 945234 (0.0011) [2024-06-15 23:24:44,470][1653645] Updated weights for policy 0, policy_version 945280 (0.0011) [2024-06-15 23:24:45,192][1653645] Updated weights for policy 0, policy_version 945335 (0.0010) [2024-06-15 23:24:45,958][1648982] Fps is (10 sec: 98302.5, 60 sec: 92296.7, 300 sec: 88195.9). Total num frames: 1936130048. Throughput: 0: 22436.9. Samples: 484077568. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:45,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:24:45,970][1653645] Updated weights for policy 0, policy_version 945381 (0.0009) [2024-06-15 23:24:48,537][1653645] Updated weights for policy 0, policy_version 945425 (0.0011) [2024-06-15 23:24:49,504][1653645] Updated weights for policy 0, policy_version 945474 (0.0009) [2024-06-15 23:24:50,198][1653645] Updated weights for policy 0, policy_version 945525 (0.0009) [2024-06-15 23:24:50,958][1648982] Fps is (10 sec: 88475.6, 60 sec: 91757.3, 300 sec: 88196.0). Total num frames: 1936556032. Throughput: 0: 22653.2. Samples: 484155392. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:50,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:24:51,011][1653645] Updated weights for policy 0, policy_version 945593 (0.0081) [2024-06-15 23:24:52,072][1653645] Updated weights for policy 0, policy_version 945648 (0.0011) [2024-06-15 23:24:54,783][1653645] Updated weights for policy 0, policy_version 945712 (0.0011) [2024-06-15 23:24:55,470][1653645] Updated weights for policy 0, policy_version 945760 (0.0011) [2024-06-15 23:24:55,958][1648982] Fps is (10 sec: 85195.4, 60 sec: 91750.3, 300 sec: 88418.0). Total num frames: 1936982016. Throughput: 0: 22698.6. Samples: 484293632. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:24:55,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:24:56,192][1653645] Updated weights for policy 0, policy_version 945812 (0.0010) [2024-06-15 23:24:56,285][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000945824_1937047552.pth... [2024-06-15 23:24:56,373][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000935264_1915420672.pth [2024-06-15 23:24:57,309][1653645] Updated weights for policy 0, policy_version 945881 (0.0013) [2024-06-15 23:25:00,461][1651596] Signal inference workers to stop experience collection... (49150 times) [2024-06-15 23:25:00,505][1653645] InferenceWorker_p0-w0: stopping experience collection (49150 times) [2024-06-15 23:25:00,517][1653645] Updated weights for policy 0, policy_version 945959 (0.0011) [2024-06-15 23:25:00,577][1651596] Signal inference workers to resume experience collection... (49150 times) [2024-06-15 23:25:00,578][1653645] InferenceWorker_p0-w0: resuming experience collection (49150 times) [2024-06-15 23:25:00,957][1648982] Fps is (10 sec: 81921.2, 60 sec: 89565.9, 300 sec: 87973.8). Total num frames: 1937375232. Throughput: 0: 22710.1. Samples: 484429312. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:00,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:25:01,144][1653645] Updated weights for policy 0, policy_version 946005 (0.0011) [2024-06-15 23:25:01,753][1653645] Updated weights for policy 0, policy_version 946054 (0.0010) [2024-06-15 23:25:02,996][1653645] Updated weights for policy 0, policy_version 946114 (0.0012) [2024-06-15 23:25:05,958][1648982] Fps is (10 sec: 78640.7, 60 sec: 87380.9, 300 sec: 88084.7). Total num frames: 1937768448. Throughput: 0: 22493.6. Samples: 484485632. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:25:06,273][1653645] Updated weights for policy 0, policy_version 946192 (0.0011) [2024-06-15 23:25:06,963][1653645] Updated weights for policy 0, policy_version 946242 (0.0027) [2024-06-15 23:25:07,604][1653645] Updated weights for policy 0, policy_version 946291 (0.0010) [2024-06-15 23:25:08,453][1653645] Updated weights for policy 0, policy_version 946359 (0.0011) [2024-06-15 23:25:09,225][1653645] Updated weights for policy 0, policy_version 946400 (0.0009) [2024-06-15 23:25:10,958][1648982] Fps is (10 sec: 91748.9, 60 sec: 88473.5, 300 sec: 88862.3). Total num frames: 1938292736. Throughput: 0: 22550.9. Samples: 484620800. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:10,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 23:25:12,095][1653645] Updated weights for policy 0, policy_version 946451 (0.0010) [2024-06-15 23:25:12,987][1653645] Updated weights for policy 0, policy_version 946518 (0.0011) [2024-06-15 23:25:13,853][1653645] Updated weights for policy 0, policy_version 946581 (0.0012) [2024-06-15 23:25:14,916][1653645] Updated weights for policy 0, policy_version 946629 (0.0010) [2024-06-15 23:25:15,572][1653645] Updated weights for policy 0, policy_version 946686 (0.0011) [2024-06-15 23:25:15,958][1648982] Fps is (10 sec: 104863.2, 60 sec: 90112.1, 300 sec: 88862.4). Total num frames: 1938817024. Throughput: 0: 22357.4. Samples: 484752384. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 23:25:18,312][1653645] Updated weights for policy 0, policy_version 946736 (0.0010) [2024-06-15 23:25:19,094][1653645] Updated weights for policy 0, policy_version 946787 (0.0010) [2024-06-15 23:25:19,940][1651596] Signal inference workers to stop experience collection... (49200 times) [2024-06-15 23:25:19,958][1653645] Updated weights for policy 0, policy_version 946850 (0.0010) [2024-06-15 23:25:19,991][1653645] InferenceWorker_p0-w0: stopping experience collection (49200 times) [2024-06-15 23:25:20,081][1651596] Signal inference workers to resume experience collection... (49200 times) [2024-06-15 23:25:20,082][1653645] InferenceWorker_p0-w0: resuming experience collection (49200 times) [2024-06-15 23:25:20,780][1653645] Updated weights for policy 0, policy_version 946912 (0.0075) [2024-06-15 23:25:20,958][1648982] Fps is (10 sec: 98304.7, 60 sec: 91750.4, 300 sec: 89084.6). Total num frames: 1939275776. Throughput: 0: 22789.6. Samples: 484830720. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:20,958][1648982] Avg episode reward: [(0, '37.300')] [2024-06-15 23:25:21,159][1653645] Updated weights for policy 0, policy_version 946944 (0.0009) [2024-06-15 23:25:24,065][1653645] Updated weights for policy 0, policy_version 946998 (0.0010) [2024-06-15 23:25:24,974][1653645] Updated weights for policy 0, policy_version 947072 (0.0013) [2024-06-15 23:25:25,657][1653645] Updated weights for policy 0, policy_version 947122 (0.0013) [2024-06-15 23:25:25,958][1648982] Fps is (10 sec: 91748.5, 60 sec: 91750.5, 300 sec: 89306.7). Total num frames: 1939734528. Throughput: 0: 22801.1. Samples: 484966400. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:25,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:25:26,482][1653645] Updated weights for policy 0, policy_version 947154 (0.0013) [2024-06-15 23:25:29,402][1653645] Updated weights for policy 0, policy_version 947219 (0.0011) [2024-06-15 23:25:30,067][1653645] Updated weights for policy 0, policy_version 947270 (0.0011) [2024-06-15 23:25:30,909][1653645] Updated weights for policy 0, policy_version 947330 (0.0014) [2024-06-15 23:25:30,958][1648982] Fps is (10 sec: 85196.4, 60 sec: 91750.3, 300 sec: 89306.7). Total num frames: 1940127744. Throughput: 0: 22789.7. Samples: 485103104. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:25:32,137][1653645] Updated weights for policy 0, policy_version 947397 (0.0080) [2024-06-15 23:25:32,904][1653645] Updated weights for policy 0, policy_version 947452 (0.0012) [2024-06-15 23:25:35,640][1653645] Updated weights for policy 0, policy_version 947504 (0.0010) [2024-06-15 23:25:35,958][1648982] Fps is (10 sec: 78643.2, 60 sec: 89565.4, 300 sec: 89306.7). Total num frames: 1940520960. Throughput: 0: 22289.0. Samples: 485158400. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:35,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:25:36,283][1653645] Updated weights for policy 0, policy_version 947552 (0.0010) [2024-06-15 23:25:37,056][1653645] Updated weights for policy 0, policy_version 947604 (0.0009) [2024-06-15 23:25:37,880][1653645] Updated weights for policy 0, policy_version 947664 (0.0010) [2024-06-15 23:25:38,439][1653645] Updated weights for policy 0, policy_version 947707 (0.0011) [2024-06-15 23:25:40,958][1648982] Fps is (10 sec: 78643.5, 60 sec: 87381.7, 300 sec: 89306.7). Total num frames: 1940914176. Throughput: 0: 22380.2. Samples: 485300736. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:25:41,340][1653645] Updated weights for policy 0, policy_version 947748 (0.0012) [2024-06-15 23:25:41,702][1651596] Signal inference workers to stop experience collection... (49250 times) [2024-06-15 23:25:41,749][1653645] InferenceWorker_p0-w0: stopping experience collection (49250 times) [2024-06-15 23:25:41,887][1651596] Signal inference workers to resume experience collection... (49250 times) [2024-06-15 23:25:41,888][1653645] InferenceWorker_p0-w0: resuming experience collection (49250 times) [2024-06-15 23:25:42,344][1653645] Updated weights for policy 0, policy_version 947824 (0.0012) [2024-06-15 23:25:43,121][1653645] Updated weights for policy 0, policy_version 947877 (0.0010) [2024-06-15 23:25:43,912][1653645] Updated weights for policy 0, policy_version 947928 (0.0015) [2024-06-15 23:25:44,313][1653645] Updated weights for policy 0, policy_version 947967 (0.0010) [2024-06-15 23:25:45,958][1648982] Fps is (10 sec: 91749.9, 60 sec: 88473.3, 300 sec: 89528.8). Total num frames: 1941438464. Throughput: 0: 22266.2. Samples: 485431296. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:45,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:25:47,310][1653645] Updated weights for policy 0, policy_version 948025 (0.0009) [2024-06-15 23:25:48,280][1653645] Updated weights for policy 0, policy_version 948085 (0.0011) [2024-06-15 23:25:48,970][1653645] Updated weights for policy 0, policy_version 948144 (0.0010) [2024-06-15 23:25:49,576][1653645] Updated weights for policy 0, policy_version 948181 (0.0009) [2024-06-15 23:25:50,958][1648982] Fps is (10 sec: 104858.5, 60 sec: 90112.2, 300 sec: 90084.3). Total num frames: 1941962752. Throughput: 0: 22573.8. Samples: 485501440. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:25:52,457][1653645] Updated weights for policy 0, policy_version 948228 (0.0009) [2024-06-15 23:25:53,148][1653645] Updated weights for policy 0, policy_version 948288 (0.0017) [2024-06-15 23:25:54,259][1653645] Updated weights for policy 0, policy_version 948352 (0.0010) [2024-06-15 23:25:54,899][1653645] Updated weights for policy 0, policy_version 948401 (0.0015) [2024-06-15 23:25:55,630][1653645] Updated weights for policy 0, policy_version 948467 (0.0012) [2024-06-15 23:25:55,958][1648982] Fps is (10 sec: 104857.5, 60 sec: 91750.3, 300 sec: 90639.6). Total num frames: 1942487040. Throughput: 0: 22618.9. Samples: 485638656. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:25:55,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:25:57,907][1653645] Updated weights for policy 0, policy_version 948483 (0.0009) [2024-06-15 23:25:59,288][1653645] Updated weights for policy 0, policy_version 948548 (0.0010) [2024-06-15 23:25:59,923][1653645] Updated weights for policy 0, policy_version 948608 (0.0010) [2024-06-15 23:26:00,725][1653645] Updated weights for policy 0, policy_version 948672 (0.0010) [2024-06-15 23:26:00,958][1648982] Fps is (10 sec: 95024.9, 60 sec: 92296.1, 300 sec: 90639.6). Total num frames: 1942913024. Throughput: 0: 22766.8. Samples: 485776896. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:26:00,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:26:01,158][1651596] Signal inference workers to stop experience collection... (49300 times) [2024-06-15 23:26:01,204][1653645] InferenceWorker_p0-w0: stopping experience collection (49300 times) [2024-06-15 23:26:01,297][1651596] Signal inference workers to resume experience collection... (49300 times) [2024-06-15 23:26:01,297][1653645] InferenceWorker_p0-w0: resuming experience collection (49300 times) [2024-06-15 23:26:01,368][1653645] Updated weights for policy 0, policy_version 948726 (0.0010) [2024-06-15 23:26:04,176][1653645] Updated weights for policy 0, policy_version 948773 (0.0010) [2024-06-15 23:26:05,344][1653645] Updated weights for policy 0, policy_version 948818 (0.0011) [2024-06-15 23:26:05,958][1648982] Fps is (10 sec: 78644.8, 60 sec: 91751.1, 300 sec: 90528.6). Total num frames: 1943273472. Throughput: 0: 22664.5. Samples: 485850624. 
Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:26:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:26:05,960][1653645] Updated weights for policy 0, policy_version 948870 (0.0009) [2024-06-15 23:26:06,810][1653645] Updated weights for policy 0, policy_version 948944 (0.0011) [2024-06-15 23:26:07,316][1653645] Updated weights for policy 0, policy_version 948989 (0.0009) [2024-06-15 23:26:09,999][1653645] Updated weights for policy 0, policy_version 949040 (0.0012) [2024-06-15 23:26:10,958][1648982] Fps is (10 sec: 75366.7, 60 sec: 89565.8, 300 sec: 90639.5). Total num frames: 1943666688. Throughput: 0: 22698.7. Samples: 485987840. Policy #0 lag: (min: 15.0, avg: 111.5, max: 271.0) [2024-06-15 23:26:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:26:11,204][1653645] Updated weights for policy 0, policy_version 949088 (0.0009) [2024-06-15 23:26:12,014][1653645] Updated weights for policy 0, policy_version 949152 (0.0059) [2024-06-15 23:26:13,152][1653645] Updated weights for policy 0, policy_version 949243 (0.0012) [2024-06-15 23:26:15,958][1648982] Fps is (10 sec: 85197.3, 60 sec: 88473.6, 300 sec: 90417.5). Total num frames: 1944125440. Throughput: 0: 22596.3. Samples: 486119936. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:15,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:26:16,132][1653645] Updated weights for policy 0, policy_version 949307 (0.0009) [2024-06-15 23:26:17,633][1653645] Updated weights for policy 0, policy_version 949369 (0.0060) [2024-06-15 23:26:18,254][1653645] Updated weights for policy 0, policy_version 949424 (0.0010) [2024-06-15 23:26:18,974][1653645] Updated weights for policy 0, policy_version 949476 (0.0009) [2024-06-15 23:26:20,957][1648982] Fps is (10 sec: 91752.8, 60 sec: 88473.8, 300 sec: 90639.7). Total num frames: 1944584192. Throughput: 0: 22687.4. Samples: 486179328. 
Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:20,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:26:21,539][1653645] Updated weights for policy 0, policy_version 949521 (0.0010) [2024-06-15 23:26:23,123][1653645] Updated weights for policy 0, policy_version 949569 (0.0010) [2024-06-15 23:26:23,933][1651596] Signal inference workers to stop experience collection... (49350 times) [2024-06-15 23:26:23,984][1653645] InferenceWorker_p0-w0: stopping experience collection (49350 times) [2024-06-15 23:26:24,002][1653645] Updated weights for policy 0, policy_version 949638 (0.0022) [2024-06-15 23:26:24,066][1651596] Signal inference workers to resume experience collection... (49350 times) [2024-06-15 23:26:24,066][1653645] InferenceWorker_p0-w0: resuming experience collection (49350 times) [2024-06-15 23:26:25,011][1653645] Updated weights for policy 0, policy_version 949714 (0.0062) [2024-06-15 23:26:25,474][1653645] Updated weights for policy 0, policy_version 949760 (0.0009) [2024-06-15 23:26:25,958][1648982] Fps is (10 sec: 98302.5, 60 sec: 89565.9, 300 sec: 90639.6). Total num frames: 1945108480. Throughput: 0: 22368.7. Samples: 486307328. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:25,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:26:27,831][1653645] Updated weights for policy 0, policy_version 949813 (0.0009) [2024-06-15 23:26:29,291][1653645] Updated weights for policy 0, policy_version 949859 (0.0011) [2024-06-15 23:26:29,995][1653645] Updated weights for policy 0, policy_version 949920 (0.0010) [2024-06-15 23:26:30,569][1653645] Updated weights for policy 0, policy_version 949968 (0.0009) [2024-06-15 23:26:30,974][1648982] Fps is (10 sec: 101410.3, 60 sec: 91179.0, 300 sec: 90634.5). Total num frames: 1945600000. Throughput: 0: 22485.6. Samples: 486443520. 
Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:30,975][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 23:26:31,128][1653645] Updated weights for policy 0, policy_version 950013 (0.0013) [2024-06-15 23:26:33,152][1653645] Updated weights for policy 0, policy_version 950068 (0.0021) [2024-06-15 23:26:34,752][1653645] Updated weights for policy 0, policy_version 950102 (0.0010) [2024-06-15 23:26:35,601][1653645] Updated weights for policy 0, policy_version 950164 (0.0010) [2024-06-15 23:26:35,958][1648982] Fps is (10 sec: 88474.7, 60 sec: 91204.5, 300 sec: 90528.5). Total num frames: 1945993216. Throughput: 0: 22505.2. Samples: 486514176. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:35,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:26:36,190][1653645] Updated weights for policy 0, policy_version 950215 (0.0009) [2024-06-15 23:26:36,769][1653645] Updated weights for policy 0, policy_version 950267 (0.0009) [2024-06-15 23:26:38,513][1653645] Updated weights for policy 0, policy_version 950324 (0.0070) [2024-06-15 23:26:40,208][1653645] Updated weights for policy 0, policy_version 950368 (0.0009) [2024-06-15 23:26:40,752][1653645] Updated weights for policy 0, policy_version 950402 (0.0009) [2024-06-15 23:26:40,958][1648982] Fps is (10 sec: 85339.4, 60 sec: 92296.6, 300 sec: 90306.4). Total num frames: 1946451968. Throughput: 0: 22710.2. Samples: 486660608. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:40,960][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:26:41,372][1653645] Updated weights for policy 0, policy_version 950450 (0.0017) [2024-06-15 23:26:42,057][1653645] Updated weights for policy 0, policy_version 950512 (0.0009) [2024-06-15 23:26:43,801][1651596] Signal inference workers to stop experience collection... 
(49400 times) [2024-06-15 23:26:43,856][1653645] InferenceWorker_p0-w0: stopping experience collection (49400 times) [2024-06-15 23:26:43,884][1653645] Updated weights for policy 0, policy_version 950569 (0.0058) [2024-06-15 23:26:43,927][1651596] Signal inference workers to resume experience collection... (49400 times) [2024-06-15 23:26:43,927][1653645] InferenceWorker_p0-w0: resuming experience collection (49400 times) [2024-06-15 23:26:45,761][1653645] Updated weights for policy 0, policy_version 950624 (0.0008) [2024-06-15 23:26:45,958][1648982] Fps is (10 sec: 88474.1, 60 sec: 90658.6, 300 sec: 90417.5). Total num frames: 1946877952. Throughput: 0: 22778.4. Samples: 486801920. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:45,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:26:46,720][1653645] Updated weights for policy 0, policy_version 950672 (0.0061) [2024-06-15 23:26:47,500][1653645] Updated weights for policy 0, policy_version 950736 (0.0010) [2024-06-15 23:26:49,068][1653645] Updated weights for policy 0, policy_version 950787 (0.0011) [2024-06-15 23:26:49,736][1653645] Updated weights for policy 0, policy_version 950842 (0.0009) [2024-06-15 23:26:50,958][1648982] Fps is (10 sec: 88471.2, 60 sec: 89565.4, 300 sec: 90639.6). Total num frames: 1947336704. Throughput: 0: 22402.7. Samples: 486858752. 
Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:50,959][1648982] Avg episode reward: [(0, '37.210')] [2024-06-15 23:26:51,724][1653645] Updated weights for policy 0, policy_version 950887 (0.0011) [2024-06-15 23:26:52,693][1653645] Updated weights for policy 0, policy_version 950945 (0.0011) [2024-06-15 23:26:53,388][1653645] Updated weights for policy 0, policy_version 950996 (0.0009) [2024-06-15 23:26:54,693][1653645] Updated weights for policy 0, policy_version 951044 (0.0009) [2024-06-15 23:26:55,304][1653645] Updated weights for policy 0, policy_version 951100 (0.0010) [2024-06-15 23:26:55,958][1648982] Fps is (10 sec: 98300.3, 60 sec: 89565.7, 300 sec: 90861.7). Total num frames: 1947860992. Throughput: 0: 22493.8. Samples: 487000064. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:26:55,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:26:55,969][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000951104_1947860992.pth... [2024-06-15 23:26:56,004][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000940512_1926168576.pth [2024-06-15 23:26:57,343][1653645] Updated weights for policy 0, policy_version 951152 (0.0010) [2024-06-15 23:26:58,122][1653645] Updated weights for policy 0, policy_version 951204 (0.0010) [2024-06-15 23:26:58,832][1653645] Updated weights for policy 0, policy_version 951248 (0.0010) [2024-06-15 23:26:59,385][1653645] Updated weights for policy 0, policy_version 951296 (0.0013) [2024-06-15 23:27:00,713][1653645] Updated weights for policy 0, policy_version 951358 (0.0011) [2024-06-15 23:27:00,958][1648982] Fps is (10 sec: 104860.3, 60 sec: 91204.6, 300 sec: 91083.9). Total num frames: 1948385280. Throughput: 0: 22721.4. Samples: 487142400. 
Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:00,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:27:02,937][1653645] Updated weights for policy 0, policy_version 951408 (0.0009) [2024-06-15 23:27:03,618][1653645] Updated weights for policy 0, policy_version 951445 (0.0009) [2024-06-15 23:27:04,240][1653645] Updated weights for policy 0, policy_version 951489 (0.0016) [2024-06-15 23:27:04,869][1653645] Updated weights for policy 0, policy_version 951548 (0.0010) [2024-06-15 23:27:05,779][1653645] Updated weights for policy 0, policy_version 951589 (0.0010) [2024-06-15 23:27:05,958][1648982] Fps is (10 sec: 101583.7, 60 sec: 93388.8, 300 sec: 90972.9). Total num frames: 1948876800. Throughput: 0: 23119.6. Samples: 487219712. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:27:07,659][1651596] Signal inference workers to stop experience collection... (49450 times) [2024-06-15 23:27:07,704][1653645] Updated weights for policy 0, policy_version 951621 (0.0013) [2024-06-15 23:27:07,716][1653645] InferenceWorker_p0-w0: stopping experience collection (49450 times) [2024-06-15 23:27:07,787][1651596] Signal inference workers to resume experience collection... (49450 times) [2024-06-15 23:27:07,787][1653645] InferenceWorker_p0-w0: resuming experience collection (49450 times) [2024-06-15 23:27:08,307][1653645] Updated weights for policy 0, policy_version 951676 (0.0060) [2024-06-15 23:27:09,431][1653645] Updated weights for policy 0, policy_version 951728 (0.0011) [2024-06-15 23:27:10,770][1653645] Updated weights for policy 0, policy_version 951795 (0.0010) [2024-06-15 23:27:10,958][1648982] Fps is (10 sec: 91748.8, 60 sec: 93934.9, 300 sec: 90861.7). Total num frames: 1949302784. Throughput: 0: 23415.5. Samples: 487361024. 
Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:10,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:27:11,639][1653645] Updated weights for policy 0, policy_version 951864 (0.0012) [2024-06-15 23:27:13,872][1653645] Updated weights for policy 0, policy_version 951920 (0.0011) [2024-06-15 23:27:15,279][1653645] Updated weights for policy 0, policy_version 951984 (0.0015) [2024-06-15 23:27:15,958][1648982] Fps is (10 sec: 81920.6, 60 sec: 92842.7, 300 sec: 90639.6). Total num frames: 1949696000. Throughput: 0: 23344.5. Samples: 487493632. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:15,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:27:16,309][1653645] Updated weights for policy 0, policy_version 952032 (0.0010) [2024-06-15 23:27:17,235][1653645] Updated weights for policy 0, policy_version 952096 (0.0010) [2024-06-15 23:27:19,290][1653645] Updated weights for policy 0, policy_version 952148 (0.0011) [2024-06-15 23:27:19,731][1653645] Updated weights for policy 0, policy_version 952189 (0.0012) [2024-06-15 23:27:20,958][1648982] Fps is (10 sec: 81920.7, 60 sec: 92296.2, 300 sec: 90417.5). Total num frames: 1950121984. Throughput: 0: 23278.9. Samples: 487561728. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:20,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:27:21,527][1653645] Updated weights for policy 0, policy_version 952256 (0.0012) [2024-06-15 23:27:22,547][1653645] Updated weights for policy 0, policy_version 952310 (0.0034) [2024-06-15 23:27:23,412][1653645] Updated weights for policy 0, policy_version 952382 (0.0086) [2024-06-15 23:27:25,712][1653645] Updated weights for policy 0, policy_version 952442 (0.0017) [2024-06-15 23:27:25,957][1648982] Fps is (10 sec: 91750.7, 60 sec: 91750.7, 300 sec: 90640.2). Total num frames: 1950613504. Throughput: 0: 22926.2. Samples: 487692288. 
Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:25,958][1648982] Avg episode reward: [(0, '37.520')] [2024-06-15 23:27:27,443][1653645] Updated weights for policy 0, policy_version 952504 (0.0057) [2024-06-15 23:27:28,367][1653645] Updated weights for policy 0, policy_version 952566 (0.0012) [2024-06-15 23:27:28,702][1651596] Signal inference workers to stop experience collection... (49500 times) [2024-06-15 23:27:28,744][1653645] InferenceWorker_p0-w0: stopping experience collection (49500 times) [2024-06-15 23:27:28,821][1651596] Signal inference workers to resume experience collection... (49500 times) [2024-06-15 23:27:28,821][1653645] InferenceWorker_p0-w0: resuming experience collection (49500 times) [2024-06-15 23:27:29,124][1653645] Updated weights for policy 0, policy_version 952637 (0.0012) [2024-06-15 23:27:30,965][1648982] Fps is (10 sec: 88410.7, 60 sec: 90126.3, 300 sec: 90637.4). Total num frames: 1951006720. Throughput: 0: 22740.5. Samples: 487825408. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:30,966][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:27:31,573][1653645] Updated weights for policy 0, policy_version 952675 (0.0012) [2024-06-15 23:27:32,981][1653645] Updated weights for policy 0, policy_version 952726 (0.0010) [2024-06-15 23:27:33,810][1653645] Updated weights for policy 0, policy_version 952790 (0.0073) [2024-06-15 23:27:34,619][1653645] Updated weights for policy 0, policy_version 952848 (0.0015) [2024-06-15 23:27:35,282][1653645] Updated weights for policy 0, policy_version 952896 (0.0012) [2024-06-15 23:27:35,958][1648982] Fps is (10 sec: 91750.0, 60 sec: 92296.6, 300 sec: 90972.9). Total num frames: 1951531008. Throughput: 0: 23006.0. Samples: 487894016. 
Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:35,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:27:37,654][1653645] Updated weights for policy 0, policy_version 952954 (0.0049) [2024-06-15 23:27:39,259][1653645] Updated weights for policy 0, policy_version 952998 (0.0009) [2024-06-15 23:27:40,015][1653645] Updated weights for policy 0, policy_version 953056 (0.0010) [2024-06-15 23:27:40,958][1648982] Fps is (10 sec: 98374.2, 60 sec: 92296.4, 300 sec: 90861.8). Total num frames: 1951989760. Throughput: 0: 22903.6. Samples: 488030720. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:40,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:27:41,072][1653645] Updated weights for policy 0, policy_version 953122 (0.0010) [2024-06-15 23:27:43,206][1653645] Updated weights for policy 0, policy_version 953184 (0.0010) [2024-06-15 23:27:44,580][1653645] Updated weights for policy 0, policy_version 953220 (0.0014) [2024-06-15 23:27:45,464][1653645] Updated weights for policy 0, policy_version 953290 (0.0074) [2024-06-15 23:27:45,958][1648982] Fps is (10 sec: 88473.2, 60 sec: 92296.4, 300 sec: 90639.6). Total num frames: 1952415744. Throughput: 0: 22675.9. Samples: 488162816. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:27:46,090][1653645] Updated weights for policy 0, policy_version 953335 (0.0010) [2024-06-15 23:27:46,691][1653645] Updated weights for policy 0, policy_version 953377 (0.0009) [2024-06-15 23:27:48,800][1653645] Updated weights for policy 0, policy_version 953428 (0.0012) [2024-06-15 23:27:50,319][1653645] Updated weights for policy 0, policy_version 953489 (0.0012) [2024-06-15 23:27:50,940][1651596] Signal inference workers to stop experience collection... (49550 times) [2024-06-15 23:27:50,958][1648982] Fps is (10 sec: 85192.2, 60 sec: 91749.9, 300 sec: 90639.5). Total num frames: 1952841728. 
Throughput: 0: 22561.9. Samples: 488235008. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:50,959][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:27:50,989][1653645] InferenceWorker_p0-w0: stopping experience collection (49550 times) [2024-06-15 23:27:50,991][1653645] Updated weights for policy 0, policy_version 953540 (0.0009) [2024-06-15 23:27:51,083][1651596] Signal inference workers to resume experience collection... (49550 times) [2024-06-15 23:27:51,083][1653645] InferenceWorker_p0-w0: resuming experience collection (49550 times) [2024-06-15 23:27:51,603][1653645] Updated weights for policy 0, policy_version 953591 (0.0014) [2024-06-15 23:27:52,415][1653645] Updated weights for policy 0, policy_version 953648 (0.0010) [2024-06-15 23:27:54,760][1653645] Updated weights for policy 0, policy_version 953712 (0.0011) [2024-06-15 23:27:55,958][1648982] Fps is (10 sec: 85195.7, 60 sec: 90112.3, 300 sec: 90306.3). Total num frames: 1953267712. Throughput: 0: 22505.2. Samples: 488373760. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:27:55,959][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:27:56,230][1653645] Updated weights for policy 0, policy_version 953776 (0.0013) [2024-06-15 23:27:56,843][1653645] Updated weights for policy 0, policy_version 953816 (0.0013) [2024-06-15 23:27:57,281][1653645] Updated weights for policy 0, policy_version 953855 (0.0010) [2024-06-15 23:27:58,331][1653645] Updated weights for policy 0, policy_version 953912 (0.0012) [2024-06-15 23:28:00,373][1653645] Updated weights for policy 0, policy_version 953968 (0.0010) [2024-06-15 23:28:00,957][1648982] Fps is (10 sec: 91756.4, 60 sec: 89565.9, 300 sec: 90417.5). Total num frames: 1953759232. Throughput: 0: 22687.3. Samples: 488514560. 
Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:28:00,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:28:01,788][1653645] Updated weights for policy 0, policy_version 954042 (0.0010) [2024-06-15 23:28:02,559][1653645] Updated weights for policy 0, policy_version 954083 (0.0010) [2024-06-15 23:28:03,697][1653645] Updated weights for policy 0, policy_version 954147 (0.0013) [2024-06-15 23:28:05,878][1653645] Updated weights for policy 0, policy_version 954208 (0.0016) [2024-06-15 23:28:05,958][1648982] Fps is (10 sec: 95028.7, 60 sec: 89019.8, 300 sec: 90528.5). Total num frames: 1954217984. Throughput: 0: 22641.8. Samples: 488580608. Policy #0 lag: (min: 47.0, avg: 125.8, max: 285.0) [2024-06-15 23:28:05,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:28:06,959][1653645] Updated weights for policy 0, policy_version 954256 (0.0010) [2024-06-15 23:28:07,562][1653645] Updated weights for policy 0, policy_version 954304 (0.0011) [2024-06-15 23:28:08,461][1653645] Updated weights for policy 0, policy_version 954352 (0.0011) [2024-06-15 23:28:09,205][1653645] Updated weights for policy 0, policy_version 954401 (0.0010) [2024-06-15 23:28:10,958][1648982] Fps is (10 sec: 91746.9, 60 sec: 89565.6, 300 sec: 90639.6). Total num frames: 1954676736. Throughput: 0: 22732.6. Samples: 488715264. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:10,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:28:11,200][1653645] Updated weights for policy 0, policy_version 954434 (0.0011) [2024-06-15 23:28:11,864][1653645] Updated weights for policy 0, policy_version 954493 (0.0009) [2024-06-15 23:28:13,281][1653645] Updated weights for policy 0, policy_version 954555 (0.0013) [2024-06-15 23:28:13,599][1651596] Signal inference workers to stop experience collection... 
(49600 times) [2024-06-15 23:28:13,614][1653645] InferenceWorker_p0-w0: stopping experience collection (49600 times) [2024-06-15 23:28:13,731][1651596] Signal inference workers to resume experience collection... (49600 times) [2024-06-15 23:28:13,732][1653645] InferenceWorker_p0-w0: resuming experience collection (49600 times) [2024-06-15 23:28:14,269][1653645] Updated weights for policy 0, policy_version 954617 (0.0011) [2024-06-15 23:28:14,714][1653645] Updated weights for policy 0, policy_version 954644 (0.0009) [2024-06-15 23:28:15,957][1648982] Fps is (10 sec: 98304.5, 60 sec: 91750.4, 300 sec: 91083.9). Total num frames: 1955201024. Throughput: 0: 22941.3. Samples: 488857600. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:15,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:28:16,748][1653645] Updated weights for policy 0, policy_version 954689 (0.0010) [2024-06-15 23:28:17,403][1653645] Updated weights for policy 0, policy_version 954745 (0.0081) [2024-06-15 23:28:18,818][1653645] Updated weights for policy 0, policy_version 954808 (0.0015) [2024-06-15 23:28:19,724][1653645] Updated weights for policy 0, policy_version 954850 (0.0011) [2024-06-15 23:28:20,435][1653645] Updated weights for policy 0, policy_version 954910 (0.0010) [2024-06-15 23:28:20,958][1648982] Fps is (10 sec: 104859.3, 60 sec: 93388.7, 300 sec: 91083.9). Total num frames: 1955725312. Throughput: 0: 23119.5. Samples: 488934400. 
Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:20,959][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:28:22,684][1653645] Updated weights for policy 0, policy_version 954962 (0.0011) [2024-06-15 23:28:23,087][1653645] Updated weights for policy 0, policy_version 955003 (0.0014) [2024-06-15 23:28:24,527][1653645] Updated weights for policy 0, policy_version 955056 (0.0011) [2024-06-15 23:28:25,478][1653645] Updated weights for policy 0, policy_version 955104 (0.0010) [2024-06-15 23:28:25,958][1648982] Fps is (10 sec: 91749.5, 60 sec: 91750.2, 300 sec: 90639.6). Total num frames: 1956118528. Throughput: 0: 23040.0. Samples: 489067520. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:25,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:28:26,244][1653645] Updated weights for policy 0, policy_version 955168 (0.0051) [2024-06-15 23:28:28,101][1653645] Updated weights for policy 0, policy_version 955205 (0.0011) [2024-06-15 23:28:28,734][1653645] Updated weights for policy 0, policy_version 955254 (0.0012) [2024-06-15 23:28:29,988][1653645] Updated weights for policy 0, policy_version 955284 (0.0011) [2024-06-15 23:28:30,958][1648982] Fps is (10 sec: 81921.1, 60 sec: 92307.5, 300 sec: 90750.7). Total num frames: 1956544512. Throughput: 0: 23176.5. Samples: 489205760. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:30,958][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:28:31,086][1653645] Updated weights for policy 0, policy_version 955360 (0.0072) [2024-06-15 23:28:31,755][1653645] Updated weights for policy 0, policy_version 955410 (0.0010) [2024-06-15 23:28:33,973][1653645] Updated weights for policy 0, policy_version 955457 (0.0012) [2024-06-15 23:28:34,610][1653645] Updated weights for policy 0, policy_version 955515 (0.0012) [2024-06-15 23:28:35,852][1651596] Signal inference workers to stop experience collection... 
(49650 times) [2024-06-15 23:28:35,925][1653645] InferenceWorker_p0-w0: stopping experience collection (49650 times) [2024-06-15 23:28:35,958][1648982] Fps is (10 sec: 81918.0, 60 sec: 90111.5, 300 sec: 90639.6). Total num frames: 1956937728. Throughput: 0: 23074.3. Samples: 489273344. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:35,959][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:28:35,989][1651596] Signal inference workers to resume experience collection... (49650 times) [2024-06-15 23:28:35,989][1653645] InferenceWorker_p0-w0: resuming experience collection (49650 times) [2024-06-15 23:28:36,108][1653645] Updated weights for policy 0, policy_version 955558 (0.0014) [2024-06-15 23:28:36,763][1653645] Updated weights for policy 0, policy_version 955616 (0.0012) [2024-06-15 23:28:37,538][1653645] Updated weights for policy 0, policy_version 955680 (0.0012) [2024-06-15 23:28:39,765][1653645] Updated weights for policy 0, policy_version 955728 (0.0013) [2024-06-15 23:28:40,957][1648982] Fps is (10 sec: 88474.6, 60 sec: 90658.3, 300 sec: 90972.9). Total num frames: 1957429248. Throughput: 0: 23028.7. Samples: 489410048. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:40,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:28:41,621][1653645] Updated weights for policy 0, policy_version 955777 (0.0015) [2024-06-15 23:28:42,295][1653645] Updated weights for policy 0, policy_version 955830 (0.0012) [2024-06-15 23:28:43,305][1653645] Updated weights for policy 0, policy_version 955912 (0.0013) [2024-06-15 23:28:43,918][1653645] Updated weights for policy 0, policy_version 955968 (0.0010) [2024-06-15 23:28:45,958][1648982] Fps is (10 sec: 95028.3, 60 sec: 91204.1, 300 sec: 90974.2). Total num frames: 1957888000. Throughput: 0: 22914.7. Samples: 489545728. 
Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:45,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:28:46,066][1653645] Updated weights for policy 0, policy_version 956022 (0.0010) [2024-06-15 23:28:47,543][1653645] Updated weights for policy 0, policy_version 956064 (0.0016) [2024-06-15 23:28:48,426][1653645] Updated weights for policy 0, policy_version 956130 (0.0010) [2024-06-15 23:28:49,060][1653645] Updated weights for policy 0, policy_version 956177 (0.0018) [2024-06-15 23:28:50,958][1648982] Fps is (10 sec: 91749.9, 60 sec: 91751.3, 300 sec: 91084.0). Total num frames: 1958346752. Throughput: 0: 22914.9. Samples: 489611776. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:50,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:28:51,099][1653645] Updated weights for policy 0, policy_version 956226 (0.0011) [2024-06-15 23:28:51,693][1653645] Updated weights for policy 0, policy_version 956280 (0.0059) [2024-06-15 23:28:53,380][1653645] Updated weights for policy 0, policy_version 956336 (0.0010) [2024-06-15 23:28:54,129][1653645] Updated weights for policy 0, policy_version 956384 (0.0009) [2024-06-15 23:28:55,017][1653645] Updated weights for policy 0, policy_version 956448 (0.0056) [2024-06-15 23:28:55,033][1651596] Signal inference workers to stop experience collection... (49700 times) [2024-06-15 23:28:55,088][1653645] InferenceWorker_p0-w0: stopping experience collection (49700 times) [2024-06-15 23:28:55,198][1651596] Signal inference workers to resume experience collection... (49700 times) [2024-06-15 23:28:55,198][1653645] InferenceWorker_p0-w0: resuming experience collection (49700 times) [2024-06-15 23:28:55,958][1648982] Fps is (10 sec: 98301.1, 60 sec: 93388.3, 300 sec: 91083.8). Total num frames: 1958871040. Throughput: 0: 23028.6. Samples: 489751552. 
Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:28:55,959][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:28:55,965][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000956480_1958871040.pth... [2024-06-15 23:28:56,040][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000945824_1937047552.pth [2024-06-15 23:28:57,112][1653645] Updated weights for policy 0, policy_version 956512 (0.0013) [2024-06-15 23:28:58,784][1653645] Updated weights for policy 0, policy_version 956563 (0.0012) [2024-06-15 23:28:59,780][1653645] Updated weights for policy 0, policy_version 956624 (0.0015) [2024-06-15 23:29:00,635][1653645] Updated weights for policy 0, policy_version 956688 (0.0012) [2024-06-15 23:29:00,958][1648982] Fps is (10 sec: 98303.9, 60 sec: 92842.6, 300 sec: 90861.9). Total num frames: 1959329792. Throughput: 0: 22801.0. Samples: 489883648. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:00,960][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:29:02,991][1653645] Updated weights for policy 0, policy_version 956752 (0.0011) [2024-06-15 23:29:03,531][1653645] Updated weights for policy 0, policy_version 956797 (0.0011) [2024-06-15 23:29:05,183][1653645] Updated weights for policy 0, policy_version 956856 (0.0011) [2024-06-15 23:29:05,873][1653645] Updated weights for policy 0, policy_version 956898 (0.0015) [2024-06-15 23:29:05,958][1648982] Fps is (10 sec: 85201.2, 60 sec: 91750.5, 300 sec: 90639.6). Total num frames: 1959723008. Throughput: 0: 22573.6. Samples: 489950208. 
Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:05,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:29:06,660][1653645] Updated weights for policy 0, policy_version 956965 (0.0070) [2024-06-15 23:29:08,688][1653645] Updated weights for policy 0, policy_version 957008 (0.0009) [2024-06-15 23:29:09,260][1653645] Updated weights for policy 0, policy_version 957055 (0.0010) [2024-06-15 23:29:10,777][1653645] Updated weights for policy 0, policy_version 957091 (0.0010) [2024-06-15 23:29:10,958][1648982] Fps is (10 sec: 81918.5, 60 sec: 91204.5, 300 sec: 90639.6). Total num frames: 1960148992. Throughput: 0: 22641.7. Samples: 490086400. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:10,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:29:11,516][1653645] Updated weights for policy 0, policy_version 957152 (0.0013) [2024-06-15 23:29:11,980][1653645] Updated weights for policy 0, policy_version 957184 (0.0010) [2024-06-15 23:29:12,670][1653645] Updated weights for policy 0, policy_version 957242 (0.0010) [2024-06-15 23:29:14,689][1653645] Updated weights for policy 0, policy_version 957296 (0.0013) [2024-06-15 23:29:15,958][1648982] Fps is (10 sec: 88473.5, 60 sec: 90112.0, 300 sec: 90972.9). Total num frames: 1960607744. Throughput: 0: 22721.5. Samples: 490228224. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:15,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:29:16,137][1653645] Updated weights for policy 0, policy_version 957344 (0.0010) [2024-06-15 23:29:16,915][1653645] Updated weights for policy 0, policy_version 957392 (0.0010) [2024-06-15 23:29:17,416][1653645] Updated weights for policy 0, policy_version 957433 (0.0010) [2024-06-15 23:29:17,529][1651596] Signal inference workers to stop experience collection... (49750 times) [2024-06-15 23:29:17,561][1651596] Signal inference workers to resume experience collection... 
(49750 times) [2024-06-15 23:29:17,563][1653645] InferenceWorker_p0-w0: stopping experience collection (49750 times) [2024-06-15 23:29:17,576][1653645] InferenceWorker_p0-w0: resuming experience collection (49750 times) [2024-06-15 23:29:17,910][1653645] Updated weights for policy 0, policy_version 957459 (0.0010) [2024-06-15 23:29:19,727][1653645] Updated weights for policy 0, policy_version 957520 (0.0011) [2024-06-15 23:29:20,958][1648982] Fps is (10 sec: 95028.3, 60 sec: 89566.0, 300 sec: 91084.0). Total num frames: 1961099264. Throughput: 0: 22641.9. Samples: 490292224. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:20,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:29:21,522][1653645] Updated weights for policy 0, policy_version 957584 (0.0012) [2024-06-15 23:29:22,038][1653645] Updated weights for policy 0, policy_version 957627 (0.0011) [2024-06-15 23:29:22,762][1653645] Updated weights for policy 0, policy_version 957671 (0.0019) [2024-06-15 23:29:23,595][1653645] Updated weights for policy 0, policy_version 957712 (0.0011) [2024-06-15 23:29:25,330][1653645] Updated weights for policy 0, policy_version 957761 (0.0013) [2024-06-15 23:29:25,959][1648982] Fps is (10 sec: 98292.2, 60 sec: 91202.6, 300 sec: 91416.8). Total num frames: 1961590784. Throughput: 0: 22720.8. Samples: 490432512. 
Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:25,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:29:25,970][1653645] Updated weights for policy 0, policy_version 957820 (0.0019) [2024-06-15 23:29:27,377][1653645] Updated weights for policy 0, policy_version 957863 (0.0011) [2024-06-15 23:29:28,175][1653645] Updated weights for policy 0, policy_version 957906 (0.0010) [2024-06-15 23:29:28,659][1653645] Updated weights for policy 0, policy_version 957950 (0.0065) [2024-06-15 23:29:29,549][1653645] Updated weights for policy 0, policy_version 958000 (0.0010) [2024-06-15 23:29:30,958][1648982] Fps is (10 sec: 95026.9, 60 sec: 91750.3, 300 sec: 91194.9). Total num frames: 1962049536. Throughput: 0: 22914.9. Samples: 490576896. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:30,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 23:29:31,249][1653645] Updated weights for policy 0, policy_version 958049 (0.0015) [2024-06-15 23:29:32,815][1653645] Updated weights for policy 0, policy_version 958103 (0.0010) [2024-06-15 23:29:33,676][1653645] Updated weights for policy 0, policy_version 958146 (0.0013) [2024-06-15 23:29:34,320][1653645] Updated weights for policy 0, policy_version 958201 (0.0068) [2024-06-15 23:29:35,271][1653645] Updated weights for policy 0, policy_version 958245 (0.0011) [2024-06-15 23:29:35,958][1648982] Fps is (10 sec: 95038.7, 60 sec: 93389.3, 300 sec: 91084.0). Total num frames: 1962541056. Throughput: 0: 22983.1. Samples: 490646016. 
Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:29:36,921][1653645] Updated weights for policy 0, policy_version 958294 (0.0021) [2024-06-15 23:29:37,323][1653645] Updated weights for policy 0, policy_version 958333 (0.0009) [2024-06-15 23:29:38,791][1653645] Updated weights for policy 0, policy_version 958391 (0.0011) [2024-06-15 23:29:39,659][1653645] Updated weights for policy 0, policy_version 958437 (0.0010) [2024-06-15 23:29:40,593][1651596] Signal inference workers to stop experience collection... (49800 times) [2024-06-15 23:29:40,615][1653645] Updated weights for policy 0, policy_version 958482 (0.0011) [2024-06-15 23:29:40,659][1653645] InferenceWorker_p0-w0: stopping experience collection (49800 times) [2024-06-15 23:29:40,732][1651596] Signal inference workers to resume experience collection... (49800 times) [2024-06-15 23:29:40,732][1653645] InferenceWorker_p0-w0: resuming experience collection (49800 times) [2024-06-15 23:29:40,958][1648982] Fps is (10 sec: 98302.9, 60 sec: 93388.3, 300 sec: 91195.0). Total num frames: 1963032576. Throughput: 0: 22994.6. Samples: 490786304. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:40,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:29:41,114][1653645] Updated weights for policy 0, policy_version 958528 (0.0011) [2024-06-15 23:29:42,970][1653645] Updated weights for policy 0, policy_version 958585 (0.0011) [2024-06-15 23:29:44,302][1653645] Updated weights for policy 0, policy_version 958624 (0.0010) [2024-06-15 23:29:44,885][1653645] Updated weights for policy 0, policy_version 958658 (0.0010) [2024-06-15 23:29:45,511][1653645] Updated weights for policy 0, policy_version 958715 (0.0010) [2024-06-15 23:29:45,958][1648982] Fps is (10 sec: 91749.6, 60 sec: 92842.9, 300 sec: 91195.0). Total num frames: 1963458560. Throughput: 0: 23187.9. Samples: 490927104. 
Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:45,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:29:46,441][1653645] Updated weights for policy 0, policy_version 958768 (0.0011) [2024-06-15 23:29:48,155][1653645] Updated weights for policy 0, policy_version 958821 (0.0021) [2024-06-15 23:29:50,084][1653645] Updated weights for policy 0, policy_version 958896 (0.0010) [2024-06-15 23:29:50,958][1648982] Fps is (10 sec: 85198.4, 60 sec: 92296.5, 300 sec: 91195.1). Total num frames: 1963884544. Throughput: 0: 23165.1. Samples: 490992640. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:50,958][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 23:29:51,197][1653645] Updated weights for policy 0, policy_version 958949 (0.0019) [2024-06-15 23:29:51,826][1653645] Updated weights for policy 0, policy_version 958994 (0.0009) [2024-06-15 23:29:53,628][1653645] Updated weights for policy 0, policy_version 959056 (0.0011) [2024-06-15 23:29:54,172][1653645] Updated weights for policy 0, policy_version 959101 (0.0017) [2024-06-15 23:29:55,935][1653645] Updated weights for policy 0, policy_version 959153 (0.0010) [2024-06-15 23:29:55,958][1648982] Fps is (10 sec: 88472.4, 60 sec: 91204.7, 300 sec: 91417.1). Total num frames: 1964343296. Throughput: 0: 23199.3. Samples: 491130368. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:29:55,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:29:56,604][1653645] Updated weights for policy 0, policy_version 959200 (0.0010) [2024-06-15 23:29:57,635][1653645] Updated weights for policy 0, policy_version 959251 (0.0012) [2024-06-15 23:29:58,095][1653645] Updated weights for policy 0, policy_version 959294 (0.0010) [2024-06-15 23:29:59,462][1653645] Updated weights for policy 0, policy_version 959352 (0.0011) [2024-06-15 23:30:00,958][1648982] Fps is (10 sec: 88471.7, 60 sec: 90657.7, 300 sec: 91528.3). Total num frames: 1964769280. 
Throughput: 0: 23221.9. Samples: 491273216. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:30:00,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:30:01,453][1653645] Updated weights for policy 0, policy_version 959392 (0.0009) [2024-06-15 23:30:02,169][1653645] Updated weights for policy 0, policy_version 959445 (0.0011) [2024-06-15 23:30:02,814][1653645] Updated weights for policy 0, policy_version 959492 (0.0009) [2024-06-15 23:30:02,986][1651596] Signal inference workers to stop experience collection... (49850 times) [2024-06-15 23:30:03,006][1653645] InferenceWorker_p0-w0: stopping experience collection (49850 times) [2024-06-15 23:30:03,123][1651596] Signal inference workers to resume experience collection... (49850 times) [2024-06-15 23:30:03,124][1653645] InferenceWorker_p0-w0: resuming experience collection (49850 times) [2024-06-15 23:30:04,582][1653645] Updated weights for policy 0, policy_version 959556 (0.0011) [2024-06-15 23:30:05,208][1653645] Updated weights for policy 0, policy_version 959611 (0.0011) [2024-06-15 23:30:05,958][1648982] Fps is (10 sec: 95026.9, 60 sec: 92842.3, 300 sec: 91528.2). Total num frames: 1965293568. Throughput: 0: 23290.2. Samples: 491340288. Policy #0 lag: (min: 56.0, avg: 185.5, max: 312.0) [2024-06-15 23:30:05,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:30:07,642][1653645] Updated weights for policy 0, policy_version 959687 (0.0012) [2024-06-15 23:30:08,196][1653645] Updated weights for policy 0, policy_version 959734 (0.0011) [2024-06-15 23:30:08,735][1653645] Updated weights for policy 0, policy_version 959760 (0.0010) [2024-06-15 23:30:09,303][1653645] Updated weights for policy 0, policy_version 959805 (0.0011) [2024-06-15 23:30:10,907][1653645] Updated weights for policy 0, policy_version 959857 (0.0013) [2024-06-15 23:30:10,958][1648982] Fps is (10 sec: 101583.4, 60 sec: 93935.2, 300 sec: 91417.2). Total num frames: 1965785088. Throughput: 0: 23268.2. 
Samples: 491479552. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:10,958][1648982] Avg episode reward: [(0, '37.270')] [2024-06-15 23:30:12,431][1653645] Updated weights for policy 0, policy_version 959893 (0.0010) [2024-06-15 23:30:13,134][1653645] Updated weights for policy 0, policy_version 959952 (0.0011) [2024-06-15 23:30:13,819][1653645] Updated weights for policy 0, policy_version 960000 (0.0017) [2024-06-15 23:30:15,958][1648982] Fps is (10 sec: 91746.4, 60 sec: 93387.7, 300 sec: 91305.9). Total num frames: 1966211072. Throughput: 0: 23039.7. Samples: 491613696. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:15,959][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:30:16,480][1653645] Updated weights for policy 0, policy_version 960065 (0.0017) [2024-06-15 23:30:17,121][1653645] Updated weights for policy 0, policy_version 960122 (0.0012) [2024-06-15 23:30:18,644][1653645] Updated weights for policy 0, policy_version 960183 (0.0080) [2024-06-15 23:30:19,469][1653645] Updated weights for policy 0, policy_version 960240 (0.0011) [2024-06-15 23:30:20,526][1653645] Updated weights for policy 0, policy_version 960273 (0.0009) [2024-06-15 23:30:20,958][1648982] Fps is (10 sec: 91749.5, 60 sec: 93388.8, 300 sec: 91417.2). Total num frames: 1966702592. Throughput: 0: 23085.4. Samples: 491684864. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:20,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:30:22,532][1653645] Updated weights for policy 0, policy_version 960325 (0.0011) [2024-06-15 23:30:23,238][1653645] Updated weights for policy 0, policy_version 960382 (0.0009) [2024-06-15 23:30:24,209][1653645] Updated weights for policy 0, policy_version 960421 (0.0012) [2024-06-15 23:30:25,318][1653645] Updated weights for policy 0, policy_version 960503 (0.0013) [2024-06-15 23:30:25,958][1648982] Fps is (10 sec: 91755.2, 60 sec: 92298.1, 300 sec: 91528.2). 
Total num frames: 1967128576. Throughput: 0: 22937.7. Samples: 491818496. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:25,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:30:26,486][1651596] Signal inference workers to stop experience collection... (49900 times) [2024-06-15 23:30:26,526][1653645] InferenceWorker_p0-w0: stopping experience collection (49900 times) [2024-06-15 23:30:26,627][1651596] Signal inference workers to resume experience collection... (49900 times) [2024-06-15 23:30:26,628][1653645] InferenceWorker_p0-w0: resuming experience collection (49900 times) [2024-06-15 23:30:26,720][1653645] Updated weights for policy 0, policy_version 960545 (0.0012) [2024-06-15 23:30:28,588][1653645] Updated weights for policy 0, policy_version 960595 (0.0011) [2024-06-15 23:30:29,616][1653645] Updated weights for policy 0, policy_version 960643 (0.0013) [2024-06-15 23:30:30,628][1653645] Updated weights for policy 0, policy_version 960706 (0.0012) [2024-06-15 23:30:30,958][1648982] Fps is (10 sec: 88472.5, 60 sec: 92296.4, 300 sec: 91750.4). Total num frames: 1967587328. Throughput: 0: 22789.6. Samples: 491952640. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:30,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:30:31,301][1653645] Updated weights for policy 0, policy_version 960765 (0.0010) [2024-06-15 23:30:32,519][1653645] Updated weights for policy 0, policy_version 960828 (0.0019) [2024-06-15 23:30:34,720][1653645] Updated weights for policy 0, policy_version 960880 (0.0011) [2024-06-15 23:30:35,895][1653645] Updated weights for policy 0, policy_version 960928 (0.0010) [2024-06-15 23:30:35,958][1648982] Fps is (10 sec: 85197.2, 60 sec: 90658.0, 300 sec: 91750.4). Total num frames: 1967980544. Throughput: 0: 22835.2. Samples: 492020224. 
Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:35,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:30:37,024][1653645] Updated weights for policy 0, policy_version 961015 (0.0012) [2024-06-15 23:30:38,358][1653645] Updated weights for policy 0, policy_version 961057 (0.0010) [2024-06-15 23:30:40,304][1653645] Updated weights for policy 0, policy_version 961110 (0.0010) [2024-06-15 23:30:40,958][1648982] Fps is (10 sec: 85198.0, 60 sec: 90112.2, 300 sec: 91528.3). Total num frames: 1968439296. Throughput: 0: 22755.6. Samples: 492154368. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:40,960][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:30:41,610][1653645] Updated weights for policy 0, policy_version 961168 (0.0011) [2024-06-15 23:30:42,352][1653645] Updated weights for policy 0, policy_version 961221 (0.0009) [2024-06-15 23:30:43,011][1653645] Updated weights for policy 0, policy_version 961280 (0.0009) [2024-06-15 23:30:44,424][1653645] Updated weights for policy 0, policy_version 961340 (0.0010) [2024-06-15 23:30:45,958][1648982] Fps is (10 sec: 85197.5, 60 sec: 89565.9, 300 sec: 91083.9). Total num frames: 1968832512. Throughput: 0: 22687.4. Samples: 492294144. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:45,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:30:46,436][1653645] Updated weights for policy 0, policy_version 961392 (0.0013) [2024-06-15 23:30:47,712][1653645] Updated weights for policy 0, policy_version 961456 (0.0015) [2024-06-15 23:30:48,373][1653645] Updated weights for policy 0, policy_version 961504 (0.0010) [2024-06-15 23:30:49,481][1651596] Signal inference workers to stop experience collection... (49950 times) [2024-06-15 23:30:49,516][1653645] InferenceWorker_p0-w0: stopping experience collection (49950 times) [2024-06-15 23:30:49,605][1651596] Signal inference workers to resume experience collection... 
(49950 times) [2024-06-15 23:30:49,605][1653645] InferenceWorker_p0-w0: resuming experience collection (49950 times) [2024-06-15 23:30:49,717][1653645] Updated weights for policy 0, policy_version 961553 (0.0011) [2024-06-15 23:30:50,958][1648982] Fps is (10 sec: 91750.9, 60 sec: 91204.3, 300 sec: 91084.0). Total num frames: 1969356800. Throughput: 0: 22607.7. Samples: 492357632. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:30:51,808][1653645] Updated weights for policy 0, policy_version 961603 (0.0016) [2024-06-15 23:30:52,466][1653645] Updated weights for policy 0, policy_version 961663 (0.0012) [2024-06-15 23:30:53,515][1653645] Updated weights for policy 0, policy_version 961728 (0.0012) [2024-06-15 23:30:54,254][1653645] Updated weights for policy 0, policy_version 961783 (0.0011) [2024-06-15 23:30:55,708][1653645] Updated weights for policy 0, policy_version 961828 (0.0010) [2024-06-15 23:30:55,958][1648982] Fps is (10 sec: 101576.8, 60 sec: 91750.1, 300 sec: 91306.0). Total num frames: 1969848320. Throughput: 0: 22459.5. Samples: 492490240. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:30:55,960][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:30:55,975][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000961856_1969881088.pth... 
[2024-06-15 23:30:56,014][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000951104_1947860992.pth [2024-06-15 23:30:56,017][1651596] Saving a milestone train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/milestones/checkpoint_000961856_1969881088.pth [2024-06-15 23:30:57,655][1653645] Updated weights for policy 0, policy_version 961872 (0.0072) [2024-06-15 23:30:58,510][1653645] Updated weights for policy 0, policy_version 961925 (0.0012) [2024-06-15 23:30:59,374][1653645] Updated weights for policy 0, policy_version 961987 (0.0012) [2024-06-15 23:31:00,056][1653645] Updated weights for policy 0, policy_version 962047 (0.0019) [2024-06-15 23:31:00,958][1648982] Fps is (10 sec: 95026.3, 60 sec: 92296.7, 300 sec: 91639.3). Total num frames: 1970307072. Throughput: 0: 22619.3. Samples: 492631552. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:00,958][1648982] Avg episode reward: [(0, '37.480')] [2024-06-15 23:31:01,492][1653645] Updated weights for policy 0, policy_version 962105 (0.0010) [2024-06-15 23:31:03,805][1653645] Updated weights for policy 0, policy_version 962160 (0.0012) [2024-06-15 23:31:04,368][1653645] Updated weights for policy 0, policy_version 962195 (0.0013) [2024-06-15 23:31:05,232][1653645] Updated weights for policy 0, policy_version 962261 (0.0010) [2024-06-15 23:31:05,958][1648982] Fps is (10 sec: 95030.2, 60 sec: 91750.6, 300 sec: 91972.6). Total num frames: 1970798592. Throughput: 0: 22778.3. Samples: 492709888. 
Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:05,959][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:31:06,981][1653645] Updated weights for policy 0, policy_version 962324 (0.0010) [2024-06-15 23:31:09,099][1653645] Updated weights for policy 0, policy_version 962369 (0.0010) [2024-06-15 23:31:09,816][1653645] Updated weights for policy 0, policy_version 962425 (0.0012) [2024-06-15 23:31:10,450][1653645] Updated weights for policy 0, policy_version 962468 (0.0009) [2024-06-15 23:31:10,614][1651596] Signal inference workers to stop experience collection... (50000 times) [2024-06-15 23:31:10,646][1653645] InferenceWorker_p0-w0: stopping experience collection (50000 times) [2024-06-15 23:31:10,744][1651596] Signal inference workers to resume experience collection... (50000 times) [2024-06-15 23:31:10,745][1653645] InferenceWorker_p0-w0: resuming experience collection (50000 times) [2024-06-15 23:31:10,958][1648982] Fps is (10 sec: 88473.0, 60 sec: 90111.7, 300 sec: 91750.3). Total num frames: 1971191808. Throughput: 0: 22823.8. Samples: 492845568. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:10,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:31:11,272][1653645] Updated weights for policy 0, policy_version 962529 (0.0008) [2024-06-15 23:31:12,779][1653645] Updated weights for policy 0, policy_version 962592 (0.0076) [2024-06-15 23:31:15,104][1653645] Updated weights for policy 0, policy_version 962656 (0.0012) [2024-06-15 23:31:15,757][1653645] Updated weights for policy 0, policy_version 962704 (0.0014) [2024-06-15 23:31:15,958][1648982] Fps is (10 sec: 81919.6, 60 sec: 90112.8, 300 sec: 91639.3). Total num frames: 1971617792. Throughput: 0: 22858.0. Samples: 492981248. 
Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:15,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 23:31:16,620][1653645] Updated weights for policy 0, policy_version 962768 (0.0009) [2024-06-15 23:31:17,193][1653645] Updated weights for policy 0, policy_version 962811 (0.0009) [2024-06-15 23:31:18,966][1653645] Updated weights for policy 0, policy_version 962875 (0.0011) [2024-06-15 23:31:20,958][1648982] Fps is (10 sec: 85197.2, 60 sec: 89019.7, 300 sec: 91306.1). Total num frames: 1972043776. Throughput: 0: 22823.8. Samples: 493047296. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:20,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:31:21,129][1653645] Updated weights for policy 0, policy_version 962928 (0.0011) [2024-06-15 23:31:21,815][1653645] Updated weights for policy 0, policy_version 962976 (0.0013) [2024-06-15 23:31:22,580][1653645] Updated weights for policy 0, policy_version 963029 (0.0020) [2024-06-15 23:31:24,275][1653645] Updated weights for policy 0, policy_version 963076 (0.0011) [2024-06-15 23:31:24,912][1653645] Updated weights for policy 0, policy_version 963130 (0.0010) [2024-06-15 23:31:25,958][1648982] Fps is (10 sec: 88474.4, 60 sec: 89566.0, 300 sec: 91200.2). Total num frames: 1972502528. Throughput: 0: 22789.7. Samples: 493179904. 
Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:25,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:31:27,021][1653645] Updated weights for policy 0, policy_version 963169 (0.0017) [2024-06-15 23:31:27,877][1653645] Updated weights for policy 0, policy_version 963234 (0.0011) [2024-06-15 23:31:28,514][1653645] Updated weights for policy 0, policy_version 963285 (0.0010) [2024-06-15 23:31:30,055][1653645] Updated weights for policy 0, policy_version 963329 (0.0012) [2024-06-15 23:31:30,690][1653645] Updated weights for policy 0, policy_version 963385 (0.0011) [2024-06-15 23:31:30,958][1648982] Fps is (10 sec: 98305.9, 60 sec: 90658.6, 300 sec: 91639.3). Total num frames: 1973026816. Throughput: 0: 22710.1. Samples: 493316096. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:30,958][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:31:32,481][1651596] Signal inference workers to stop experience collection... (50050 times) [2024-06-15 23:31:32,516][1653645] InferenceWorker_p0-w0: stopping experience collection (50050 times) [2024-06-15 23:31:32,616][1651596] Signal inference workers to resume experience collection... (50050 times) [2024-06-15 23:31:32,617][1653645] InferenceWorker_p0-w0: resuming experience collection (50050 times) [2024-06-15 23:31:32,823][1653645] Updated weights for policy 0, policy_version 963440 (0.0011) [2024-06-15 23:31:33,456][1653645] Updated weights for policy 0, policy_version 963488 (0.0010) [2024-06-15 23:31:34,313][1653645] Updated weights for policy 0, policy_version 963552 (0.0081) [2024-06-15 23:31:35,783][1653645] Updated weights for policy 0, policy_version 963587 (0.0011) [2024-06-15 23:31:35,958][1648982] Fps is (10 sec: 95027.3, 60 sec: 91204.4, 300 sec: 91528.2). Total num frames: 1973452800. Throughput: 0: 22823.8. Samples: 493384704. 
Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:35,961][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:31:38,106][1653645] Updated weights for policy 0, policy_version 963650 (0.0011) [2024-06-15 23:31:38,836][1653645] Updated weights for policy 0, policy_version 963712 (0.0010) [2024-06-15 23:31:39,536][1653645] Updated weights for policy 0, policy_version 963768 (0.0010) [2024-06-15 23:31:40,446][1653645] Updated weights for policy 0, policy_version 963833 (0.0011) [2024-06-15 23:31:40,958][1648982] Fps is (10 sec: 91749.7, 60 sec: 91750.5, 300 sec: 91750.4). Total num frames: 1973944320. Throughput: 0: 22789.9. Samples: 493515776. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:40,958][1648982] Avg episode reward: [(0, '37.340')] [2024-06-15 23:31:42,118][1653645] Updated weights for policy 0, policy_version 963888 (0.0009) [2024-06-15 23:31:44,057][1653645] Updated weights for policy 0, policy_version 963936 (0.0012) [2024-06-15 23:31:44,613][1653645] Updated weights for policy 0, policy_version 963970 (0.0029) [2024-06-15 23:31:45,289][1653645] Updated weights for policy 0, policy_version 964020 (0.0014) [2024-06-15 23:31:45,962][1648982] Fps is (10 sec: 94984.8, 60 sec: 92835.7, 300 sec: 91749.1). Total num frames: 1974403072. Throughput: 0: 22764.7. Samples: 493656064. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:45,963][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:31:45,995][1653645] Updated weights for policy 0, policy_version 964070 (0.0010) [2024-06-15 23:31:47,562][1653645] Updated weights for policy 0, policy_version 964131 (0.0010) [2024-06-15 23:31:49,200][1653645] Updated weights for policy 0, policy_version 964165 (0.0014) [2024-06-15 23:31:49,831][1653645] Updated weights for policy 0, policy_version 964222 (0.0011) [2024-06-15 23:31:50,958][1648982] Fps is (10 sec: 91750.5, 60 sec: 91750.5, 300 sec: 91528.4). Total num frames: 1974861824. 
Throughput: 0: 22596.3. Samples: 493726720. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:31:51,007][1653645] Updated weights for policy 0, policy_version 964289 (0.0057) [2024-06-15 23:31:51,672][1653645] Updated weights for policy 0, policy_version 964352 (0.0011) [2024-06-15 23:31:53,016][1651596] Signal inference workers to stop experience collection... (50100 times) [2024-06-15 23:31:53,060][1653645] InferenceWorker_p0-w0: stopping experience collection (50100 times) [2024-06-15 23:31:53,159][1651596] Signal inference workers to resume experience collection... (50100 times) [2024-06-15 23:31:53,160][1653645] InferenceWorker_p0-w0: resuming experience collection (50100 times) [2024-06-15 23:31:53,450][1653645] Updated weights for policy 0, policy_version 964409 (0.0012) [2024-06-15 23:31:55,428][1653645] Updated weights for policy 0, policy_version 964453 (0.0010) [2024-06-15 23:31:55,958][1648982] Fps is (10 sec: 85234.6, 60 sec: 90112.5, 300 sec: 91083.9). Total num frames: 1975255040. Throughput: 0: 22653.2. Samples: 493864960. Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:31:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:31:56,253][1653645] Updated weights for policy 0, policy_version 964512 (0.0013) [2024-06-15 23:31:57,077][1653645] Updated weights for policy 0, policy_version 964576 (0.0009) [2024-06-15 23:31:58,504][1653645] Updated weights for policy 0, policy_version 964614 (0.0011) [2024-06-15 23:31:59,148][1653645] Updated weights for policy 0, policy_version 964671 (0.0011) [2024-06-15 23:32:00,958][1648982] Fps is (10 sec: 81917.7, 60 sec: 89565.6, 300 sec: 90861.7). Total num frames: 1975681024. Throughput: 0: 22812.4. Samples: 494007808. 
Policy #0 lag: (min: 11.0, avg: 93.1, max: 267.0) [2024-06-15 23:32:00,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:32:01,316][1653645] Updated weights for policy 0, policy_version 964732 (0.0012) [2024-06-15 23:32:02,106][1653645] Updated weights for policy 0, policy_version 964773 (0.0011) [2024-06-15 23:32:02,986][1653645] Updated weights for policy 0, policy_version 964834 (0.0010) [2024-06-15 23:32:04,522][1653645] Updated weights for policy 0, policy_version 964885 (0.0010) [2024-06-15 23:32:05,958][1648982] Fps is (10 sec: 91750.8, 60 sec: 89565.9, 300 sec: 91084.0). Total num frames: 1976172544. Throughput: 0: 22585.0. Samples: 494063616. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:05,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:32:06,788][1653645] Updated weights for policy 0, policy_version 964948 (0.0013) [2024-06-15 23:32:07,593][1653645] Updated weights for policy 0, policy_version 964999 (0.0011) [2024-06-15 23:32:08,407][1653645] Updated weights for policy 0, policy_version 965058 (0.0011) [2024-06-15 23:32:09,015][1653645] Updated weights for policy 0, policy_version 965113 (0.0011) [2024-06-15 23:32:10,335][1653645] Updated weights for policy 0, policy_version 965155 (0.0011) [2024-06-15 23:32:10,958][1648982] Fps is (10 sec: 101582.9, 60 sec: 91750.6, 300 sec: 91528.2). Total num frames: 1976696832. Throughput: 0: 22710.0. Samples: 494201856. 
Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:32:12,079][1653645] Updated weights for policy 0, policy_version 965186 (0.0011) [2024-06-15 23:32:12,740][1653645] Updated weights for policy 0, policy_version 965242 (0.0013) [2024-06-15 23:32:13,779][1653645] Updated weights for policy 0, policy_version 965284 (0.0011) [2024-06-15 23:32:14,398][1653645] Updated weights for policy 0, policy_version 965333 (0.0011) [2024-06-15 23:32:14,539][1651596] Signal inference workers to stop experience collection... (50150 times) [2024-06-15 23:32:14,608][1653645] InferenceWorker_p0-w0: stopping experience collection (50150 times) [2024-06-15 23:32:14,666][1651596] Signal inference workers to resume experience collection... (50150 times) [2024-06-15 23:32:14,667][1653645] InferenceWorker_p0-w0: resuming experience collection (50150 times) [2024-06-15 23:32:14,806][1653645] Updated weights for policy 0, policy_version 965370 (0.0011) [2024-06-15 23:32:15,814][1653645] Updated weights for policy 0, policy_version 965408 (0.0012) [2024-06-15 23:32:15,958][1648982] Fps is (10 sec: 98304.0, 60 sec: 92296.7, 300 sec: 91639.3). Total num frames: 1977155584. Throughput: 0: 22892.1. Samples: 494346240. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:15,960][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:32:16,183][1653645] Updated weights for policy 0, policy_version 965440 (0.0010) [2024-06-15 23:32:18,276][1653645] Updated weights for policy 0, policy_version 965493 (0.0078) [2024-06-15 23:32:19,209][1653645] Updated weights for policy 0, policy_version 965540 (0.0012) [2024-06-15 23:32:20,099][1653645] Updated weights for policy 0, policy_version 965607 (0.0011) [2024-06-15 23:32:20,958][1648982] Fps is (10 sec: 91750.8, 60 sec: 92842.8, 300 sec: 91528.2). Total num frames: 1977614336. Throughput: 0: 23062.8. Samples: 494422528. 
Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:20,958][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:32:21,227][1653645] Updated weights for policy 0, policy_version 965649 (0.0010) [2024-06-15 23:32:24,007][1653645] Updated weights for policy 0, policy_version 965715 (0.0011) [2024-06-15 23:32:24,620][1653645] Updated weights for policy 0, policy_version 965762 (0.0010) [2024-06-15 23:32:25,501][1653645] Updated weights for policy 0, policy_version 965829 (0.0011) [2024-06-15 23:32:25,958][1648982] Fps is (10 sec: 91749.5, 60 sec: 92842.5, 300 sec: 91752.6). Total num frames: 1978073088. Throughput: 0: 23131.0. Samples: 494556672. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:25,959][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 23:32:27,167][1653645] Updated weights for policy 0, policy_version 965904 (0.0013) [2024-06-15 23:32:27,671][1653645] Updated weights for policy 0, policy_version 965946 (0.0012) [2024-06-15 23:32:30,113][1653645] Updated weights for policy 0, policy_version 966000 (0.0076) [2024-06-15 23:32:30,845][1653645] Updated weights for policy 0, policy_version 966049 (0.0014) [2024-06-15 23:32:30,958][1648982] Fps is (10 sec: 88474.0, 60 sec: 91204.2, 300 sec: 91417.2). Total num frames: 1978499072. Throughput: 0: 22985.4. Samples: 494690304. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:30,958][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:32:31,487][1653645] Updated weights for policy 0, policy_version 966100 (0.0009) [2024-06-15 23:32:32,973][1653645] Updated weights for policy 0, policy_version 966166 (0.0010) [2024-06-15 23:32:33,371][1653645] Updated weights for policy 0, policy_version 966208 (0.0015) [2024-06-15 23:32:35,850][1653645] Updated weights for policy 0, policy_version 966258 (0.0012) [2024-06-15 23:32:35,958][1648982] Fps is (10 sec: 85195.7, 60 sec: 91203.9, 300 sec: 91306.0). Total num frames: 1978925056. 
Throughput: 0: 22789.6. Samples: 494752256. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:35,958][1648982] Avg episode reward: [(0, '37.320')] [2024-06-15 23:32:36,244][1651596] Signal inference workers to stop experience collection... (50200 times) [2024-06-15 23:32:36,275][1653645] InferenceWorker_p0-w0: stopping experience collection (50200 times) [2024-06-15 23:32:36,368][1651596] Signal inference workers to resume experience collection... (50200 times) [2024-06-15 23:32:36,370][1653645] InferenceWorker_p0-w0: resuming experience collection (50200 times) [2024-06-15 23:32:36,565][1653645] Updated weights for policy 0, policy_version 966320 (0.0010) [2024-06-15 23:32:37,427][1653645] Updated weights for policy 0, policy_version 966368 (0.0011) [2024-06-15 23:32:38,573][1653645] Updated weights for policy 0, policy_version 966432 (0.0014) [2024-06-15 23:32:40,962][1648982] Fps is (10 sec: 81882.3, 60 sec: 89559.1, 300 sec: 91193.6). Total num frames: 1979318272. Throughput: 0: 22867.0. Samples: 494894080. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:40,963][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:32:41,150][1653645] Updated weights for policy 0, policy_version 966482 (0.0010) [2024-06-15 23:32:41,876][1653645] Updated weights for policy 0, policy_version 966544 (0.0010) [2024-06-15 23:32:42,420][1653645] Updated weights for policy 0, policy_version 966585 (0.0010) [2024-06-15 23:32:43,279][1653645] Updated weights for policy 0, policy_version 966640 (0.0010) [2024-06-15 23:32:44,364][1653645] Updated weights for policy 0, policy_version 966690 (0.0009) [2024-06-15 23:32:45,958][1648982] Fps is (10 sec: 91752.2, 60 sec: 90664.8, 300 sec: 91528.4). Total num frames: 1979842560. Throughput: 0: 22789.8. Samples: 495033344. 
Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:45,958][1648982] Avg episode reward: [(0, '37.230')] [2024-06-15 23:32:46,959][1653645] Updated weights for policy 0, policy_version 966757 (0.0023) [2024-06-15 23:32:47,805][1653645] Updated weights for policy 0, policy_version 966820 (0.0009) [2024-06-15 23:32:48,831][1653645] Updated weights for policy 0, policy_version 966867 (0.0011) [2024-06-15 23:32:49,302][1653645] Updated weights for policy 0, policy_version 966910 (0.0012) [2024-06-15 23:32:50,364][1653645] Updated weights for policy 0, policy_version 966971 (0.0010) [2024-06-15 23:32:50,957][1648982] Fps is (10 sec: 104906.2, 60 sec: 91750.5, 300 sec: 91861.5). Total num frames: 1980366848. Throughput: 0: 23062.8. Samples: 495101440. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:50,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:32:52,682][1653645] Updated weights for policy 0, policy_version 967028 (0.0010) [2024-06-15 23:32:53,454][1653645] Updated weights for policy 0, policy_version 967088 (0.0009) [2024-06-15 23:32:54,565][1653645] Updated weights for policy 0, policy_version 967136 (0.0010) [2024-06-15 23:32:55,699][1653645] Updated weights for policy 0, policy_version 967185 (0.0012) [2024-06-15 23:32:55,958][1648982] Fps is (10 sec: 98297.9, 60 sec: 92841.7, 300 sec: 91750.2). Total num frames: 1980825600. Throughput: 0: 23130.7. Samples: 495242752. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:32:55,959][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:32:56,224][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000967232_1980891136.pth... 
[2024-06-15 23:32:56,233][1653645] Updated weights for policy 0, policy_version 967232 (0.0064) [2024-06-15 23:32:56,258][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000956480_1958871040.pth [2024-06-15 23:32:58,458][1651596] Signal inference workers to stop experience collection... (50250 times) [2024-06-15 23:32:58,472][1653645] Updated weights for policy 0, policy_version 967281 (0.0010) [2024-06-15 23:32:58,484][1653645] InferenceWorker_p0-w0: stopping experience collection (50250 times) [2024-06-15 23:32:58,574][1651596] Signal inference workers to resume experience collection... (50250 times) [2024-06-15 23:32:58,575][1653645] InferenceWorker_p0-w0: resuming experience collection (50250 times) [2024-06-15 23:32:59,320][1653645] Updated weights for policy 0, policy_version 967351 (0.0011) [2024-06-15 23:33:00,494][1653645] Updated weights for policy 0, policy_version 967393 (0.0009) [2024-06-15 23:33:00,958][1648982] Fps is (10 sec: 91750.0, 60 sec: 93389.2, 300 sec: 91750.4). Total num frames: 1981284352. Throughput: 0: 22903.5. Samples: 495376896. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:00,960][1648982] Avg episode reward: [(0, '37.360')] [2024-06-15 23:33:01,493][1653645] Updated weights for policy 0, policy_version 967445 (0.0011) [2024-06-15 23:33:03,448][1653645] Updated weights for policy 0, policy_version 967491 (0.0010) [2024-06-15 23:33:04,169][1653645] Updated weights for policy 0, policy_version 967552 (0.0010) [2024-06-15 23:33:05,034][1653645] Updated weights for policy 0, policy_version 967613 (0.0011) [2024-06-15 23:33:05,958][1648982] Fps is (10 sec: 88479.2, 60 sec: 92296.5, 300 sec: 91639.4). Total num frames: 1981710336. Throughput: 0: 22812.4. Samples: 495449088. 
Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:05,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:33:06,464][1653645] Updated weights for policy 0, policy_version 967671 (0.0012) [2024-06-15 23:33:07,582][1653645] Updated weights for policy 0, policy_version 967714 (0.0010) [2024-06-15 23:33:09,448][1653645] Updated weights for policy 0, policy_version 967764 (0.0010) [2024-06-15 23:33:10,396][1653645] Updated weights for policy 0, policy_version 967827 (0.0011) [2024-06-15 23:33:10,958][1648982] Fps is (10 sec: 91750.2, 60 sec: 91750.5, 300 sec: 91528.2). Total num frames: 1982201856. Throughput: 0: 22858.0. Samples: 495585280. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:10,958][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:33:11,821][1653645] Updated weights for policy 0, policy_version 967875 (0.0011) [2024-06-15 23:33:12,488][1653645] Updated weights for policy 0, policy_version 967936 (0.0011) [2024-06-15 23:33:13,537][1653645] Updated weights for policy 0, policy_version 967985 (0.0017) [2024-06-15 23:33:15,483][1653645] Updated weights for policy 0, policy_version 968035 (0.0010) [2024-06-15 23:33:15,962][1648982] Fps is (10 sec: 88433.9, 60 sec: 90651.3, 300 sec: 91082.6). Total num frames: 1982595072. Throughput: 0: 22901.2. Samples: 495720960. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:15,963][1648982] Avg episode reward: [(0, '37.410')] [2024-06-15 23:33:16,443][1653645] Updated weights for policy 0, policy_version 968112 (0.0010) [2024-06-15 23:33:18,021][1653645] Updated weights for policy 0, policy_version 968163 (0.0013) [2024-06-15 23:33:19,064][1653645] Updated weights for policy 0, policy_version 968208 (0.0012) [2024-06-15 23:33:20,833][1653645] Updated weights for policy 0, policy_version 968257 (0.0011) [2024-06-15 23:33:20,958][1648982] Fps is (10 sec: 81920.2, 60 sec: 90112.0, 300 sec: 91195.0). Total num frames: 1983021056. 
Throughput: 0: 22892.2. Samples: 495782400. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:20,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:33:21,265][1651596] Signal inference workers to stop experience collection... (50300 times) [2024-06-15 23:33:21,298][1653645] InferenceWorker_p0-w0: stopping experience collection (50300 times) [2024-06-15 23:33:21,374][1651596] Signal inference workers to resume experience collection... (50300 times) [2024-06-15 23:33:21,375][1653645] InferenceWorker_p0-w0: resuming experience collection (50300 times) [2024-06-15 23:33:21,484][1653645] Updated weights for policy 0, policy_version 968306 (0.0010) [2024-06-15 23:33:22,355][1653645] Updated weights for policy 0, policy_version 968384 (0.0014) [2024-06-15 23:33:23,968][1653645] Updated weights for policy 0, policy_version 968447 (0.0010) [2024-06-15 23:33:25,241][1653645] Updated weights for policy 0, policy_version 968512 (0.0012) [2024-06-15 23:33:25,958][1648982] Fps is (10 sec: 91790.5, 60 sec: 90658.1, 300 sec: 91417.1). Total num frames: 1983512576. Throughput: 0: 22837.4. Samples: 495921664. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:25,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:33:27,537][1653645] Updated weights for policy 0, policy_version 968585 (0.0012) [2024-06-15 23:33:29,134][1653645] Updated weights for policy 0, policy_version 968643 (0.0013) [2024-06-15 23:33:29,735][1653645] Updated weights for policy 0, policy_version 968702 (0.0012) [2024-06-15 23:33:30,905][1653645] Updated weights for policy 0, policy_version 968752 (0.0011) [2024-06-15 23:33:30,958][1648982] Fps is (10 sec: 98304.0, 60 sec: 91750.4, 300 sec: 91750.5). Total num frames: 1984004096. Throughput: 0: 22732.8. Samples: 496056320. 
Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:30,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:33:32,796][1653645] Updated weights for policy 0, policy_version 968788 (0.0011) [2024-06-15 23:33:33,397][1653645] Updated weights for policy 0, policy_version 968836 (0.0010) [2024-06-15 23:33:34,063][1653645] Updated weights for policy 0, policy_version 968896 (0.0013) [2024-06-15 23:33:35,313][1653645] Updated weights for policy 0, policy_version 968957 (0.0011) [2024-06-15 23:33:35,958][1648982] Fps is (10 sec: 91749.5, 60 sec: 91750.4, 300 sec: 91528.1). Total num frames: 1984430080. Throughput: 0: 22835.0. Samples: 496129024. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:35,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:33:36,476][1653645] Updated weights for policy 0, policy_version 968996 (0.0011) [2024-06-15 23:33:38,170][1653645] Updated weights for policy 0, policy_version 969045 (0.0011) [2024-06-15 23:33:38,937][1653645] Updated weights for policy 0, policy_version 969107 (0.0009) [2024-06-15 23:33:39,412][1653645] Updated weights for policy 0, policy_version 969152 (0.0011) [2024-06-15 23:33:40,958][1648982] Fps is (10 sec: 88470.6, 60 sec: 92849.2, 300 sec: 91528.2). Total num frames: 1984888832. Throughput: 0: 22789.9. Samples: 496268288. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:40,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:33:41,128][1653645] Updated weights for policy 0, policy_version 969212 (0.0014) [2024-06-15 23:33:42,236][1653645] Updated weights for policy 0, policy_version 969274 (0.0012) [2024-06-15 23:33:43,584][1651596] Signal inference workers to stop experience collection... (50350 times) [2024-06-15 23:33:43,611][1653645] InferenceWorker_p0-w0: stopping experience collection (50350 times) [2024-06-15 23:33:43,714][1651596] Signal inference workers to resume experience collection... 
(50350 times) [2024-06-15 23:33:43,715][1653645] InferenceWorker_p0-w0: resuming experience collection (50350 times) [2024-06-15 23:33:44,112][1653645] Updated weights for policy 0, policy_version 969328 (0.0010) [2024-06-15 23:33:44,982][1653645] Updated weights for policy 0, policy_version 969399 (0.0013) [2024-06-15 23:33:45,958][1648982] Fps is (10 sec: 91752.8, 60 sec: 91750.5, 300 sec: 91528.2). Total num frames: 1985347584. Throughput: 0: 22732.8. Samples: 496399872. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:45,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:33:46,595][1653645] Updated weights for policy 0, policy_version 969447 (0.0022) [2024-06-15 23:33:47,617][1653645] Updated weights for policy 0, policy_version 969488 (0.0010) [2024-06-15 23:33:49,329][1653645] Updated weights for policy 0, policy_version 969537 (0.0013) [2024-06-15 23:33:49,970][1653645] Updated weights for policy 0, policy_version 969590 (0.0011) [2024-06-15 23:33:50,791][1653645] Updated weights for policy 0, policy_version 969660 (0.0012) [2024-06-15 23:33:50,958][1648982] Fps is (10 sec: 98307.0, 60 sec: 91750.3, 300 sec: 91528.4). Total num frames: 1985871872. Throughput: 0: 22721.4. Samples: 496471552. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:50,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:33:52,819][1653645] Updated weights for policy 0, policy_version 969725 (0.0013) [2024-06-15 23:33:53,745][1653645] Updated weights for policy 0, policy_version 969776 (0.0027) [2024-06-15 23:33:55,509][1653645] Updated weights for policy 0, policy_version 969827 (0.0009) [2024-06-15 23:33:55,959][1648982] Fps is (10 sec: 91741.5, 60 sec: 90657.7, 300 sec: 91305.8). Total num frames: 1986265088. Throughput: 0: 22743.7. Samples: 496608768. 
Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:33:55,959][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:33:56,189][1653645] Updated weights for policy 0, policy_version 969888 (0.0011) [2024-06-15 23:33:58,062][1653645] Updated weights for policy 0, policy_version 969952 (0.0009) [2024-06-15 23:33:59,058][1653645] Updated weights for policy 0, policy_version 970005 (0.0011) [2024-06-15 23:33:59,518][1653645] Updated weights for policy 0, policy_version 970048 (0.0009) [2024-06-15 23:34:00,958][1648982] Fps is (10 sec: 78643.5, 60 sec: 89565.9, 300 sec: 91306.1). Total num frames: 1986658304. Throughput: 0: 22871.6. Samples: 496750080. Policy #0 lag: (min: 47.0, avg: 178.0, max: 303.0) [2024-06-15 23:34:00,958][1648982] Avg episode reward: [(0, '37.330')] [2024-06-15 23:34:01,573][1653645] Updated weights for policy 0, policy_version 970100 (0.0011) [2024-06-15 23:34:02,374][1653645] Updated weights for policy 0, policy_version 970170 (0.0012) [2024-06-15 23:34:03,823][1653645] Updated weights for policy 0, policy_version 970213 (0.0010) [2024-06-15 23:34:04,419][1651596] Signal inference workers to stop experience collection... (50400 times) [2024-06-15 23:34:04,456][1653645] InferenceWorker_p0-w0: stopping experience collection (50400 times) [2024-06-15 23:34:04,534][1651596] Signal inference workers to resume experience collection... (50400 times) [2024-06-15 23:34:04,535][1653645] InferenceWorker_p0-w0: resuming experience collection (50400 times) [2024-06-15 23:34:04,537][1653645] Updated weights for policy 0, policy_version 970272 (0.0010) [2024-06-15 23:34:05,958][1648982] Fps is (10 sec: 91759.6, 60 sec: 91204.4, 300 sec: 91639.4). Total num frames: 1987182592. Throughput: 0: 23074.1. Samples: 496820736. 
Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:05,958][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:34:06,643][1653645] Updated weights for policy 0, policy_version 970323 (0.0011) [2024-06-15 23:34:07,285][1653645] Updated weights for policy 0, policy_version 970373 (0.0011) [2024-06-15 23:34:07,882][1653645] Updated weights for policy 0, policy_version 970427 (0.0010) [2024-06-15 23:34:09,491][1653645] Updated weights for policy 0, policy_version 970468 (0.0012) [2024-06-15 23:34:10,227][1653645] Updated weights for policy 0, policy_version 970528 (0.0078) [2024-06-15 23:34:10,958][1648982] Fps is (10 sec: 104857.4, 60 sec: 91750.4, 300 sec: 91861.5). Total num frames: 1987706880. Throughput: 0: 23074.2. Samples: 496960000. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:10,979][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:34:12,125][1653645] Updated weights for policy 0, policy_version 970576 (0.0010) [2024-06-15 23:34:12,989][1653645] Updated weights for policy 0, policy_version 970640 (0.0072) [2024-06-15 23:34:13,593][1653645] Updated weights for policy 0, policy_version 970688 (0.0012) [2024-06-15 23:34:15,271][1653645] Updated weights for policy 0, policy_version 970743 (0.0010) [2024-06-15 23:34:15,944][1653645] Updated weights for policy 0, policy_version 970786 (0.0011) [2024-06-15 23:34:15,958][1648982] Fps is (10 sec: 98303.3, 60 sec: 92849.6, 300 sec: 91750.4). Total num frames: 1988165632. Throughput: 0: 23108.2. Samples: 497096192. 
Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:15,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:34:18,097][1653645] Updated weights for policy 0, policy_version 970848 (0.0011) [2024-06-15 23:34:18,905][1653645] Updated weights for policy 0, policy_version 970912 (0.0012) [2024-06-15 23:34:20,663][1653645] Updated weights for policy 0, policy_version 970960 (0.0012) [2024-06-15 23:34:20,958][1648982] Fps is (10 sec: 85194.1, 60 sec: 92296.0, 300 sec: 91417.4). Total num frames: 1988558848. Throughput: 0: 23062.7. Samples: 497166848. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:20,959][1648982] Avg episode reward: [(0, '37.290')] [2024-06-15 23:34:21,407][1653645] Updated weights for policy 0, policy_version 971012 (0.0013) [2024-06-15 23:34:22,065][1653645] Updated weights for policy 0, policy_version 971072 (0.0020) [2024-06-15 23:34:24,403][1653645] Updated weights for policy 0, policy_version 971137 (0.0012) [2024-06-15 23:34:25,039][1653645] Updated weights for policy 0, policy_version 971196 (0.0013) [2024-06-15 23:34:25,958][1648982] Fps is (10 sec: 85194.0, 60 sec: 91750.1, 300 sec: 91417.1). Total num frames: 1989017600. Throughput: 0: 22892.1. Samples: 497298432. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:25,959][1648982] Avg episode reward: [(0, '37.380')] [2024-06-15 23:34:26,595][1651596] Signal inference workers to stop experience collection... (50450 times) [2024-06-15 23:34:26,631][1653645] InferenceWorker_p0-w0: stopping experience collection (50450 times) [2024-06-15 23:34:26,738][1651596] Signal inference workers to resume experience collection... 
(50450 times) [2024-06-15 23:34:26,738][1653645] InferenceWorker_p0-w0: resuming experience collection (50450 times) [2024-06-15 23:34:27,240][1653645] Updated weights for policy 0, policy_version 971253 (0.0013) [2024-06-15 23:34:27,948][1653645] Updated weights for policy 0, policy_version 971312 (0.0010) [2024-06-15 23:34:29,613][1653645] Updated weights for policy 0, policy_version 971347 (0.0009) [2024-06-15 23:34:30,463][1653645] Updated weights for policy 0, policy_version 971412 (0.0010) [2024-06-15 23:34:30,955][1653645] Updated weights for policy 0, policy_version 971454 (0.0011) [2024-06-15 23:34:30,958][1648982] Fps is (10 sec: 95030.6, 60 sec: 91750.4, 300 sec: 91417.2). Total num frames: 1989509120. Throughput: 0: 23051.4. Samples: 497437184. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:30,958][1648982] Avg episode reward: [(0, '37.280')] [2024-06-15 23:34:32,558][1653645] Updated weights for policy 0, policy_version 971491 (0.0010) [2024-06-15 23:34:33,221][1653645] Updated weights for policy 0, policy_version 971539 (0.0010) [2024-06-15 23:34:33,707][1653645] Updated weights for policy 0, policy_version 971582 (0.0009) [2024-06-15 23:34:35,537][1653645] Updated weights for policy 0, policy_version 971635 (0.0011) [2024-06-15 23:34:35,973][1648982] Fps is (10 sec: 94888.1, 60 sec: 92273.8, 300 sec: 91301.5). Total num frames: 1989967872. Throughput: 0: 22964.1. Samples: 497505280. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:35,975][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:34:36,411][1653645] Updated weights for policy 0, policy_version 971708 (0.0010) [2024-06-15 23:34:38,548][1653645] Updated weights for policy 0, policy_version 971760 (0.0010) [2024-06-15 23:34:39,246][1653645] Updated weights for policy 0, policy_version 971810 (0.0010) [2024-06-15 23:34:40,958][1648982] Fps is (10 sec: 81919.6, 60 sec: 90658.6, 300 sec: 91083.9). Total num frames: 1990328320. 
Throughput: 0: 22949.5. Samples: 497641472. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:40,960][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:34:41,002][1653645] Updated weights for policy 0, policy_version 971856 (0.0010) [2024-06-15 23:34:41,813][1653645] Updated weights for policy 0, policy_version 971920 (0.0010) [2024-06-15 23:34:42,344][1653645] Updated weights for policy 0, policy_version 971965 (0.0013) [2024-06-15 23:34:44,290][1653645] Updated weights for policy 0, policy_version 972025 (0.0011) [2024-06-15 23:34:44,989][1653645] Updated weights for policy 0, policy_version 972068 (0.0010) [2024-06-15 23:34:45,958][1648982] Fps is (10 sec: 88606.4, 60 sec: 91750.3, 300 sec: 91417.2). Total num frames: 1990852608. Throughput: 0: 22778.3. Samples: 497775104. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:45,958][1648982] Avg episode reward: [(0, '37.490')] [2024-06-15 23:34:46,691][1653645] Updated weights for policy 0, policy_version 972097 (0.0012) [2024-06-15 23:34:47,117][1651596] Signal inference workers to stop experience collection... (50500 times) [2024-06-15 23:34:47,171][1653645] InferenceWorker_p0-w0: stopping experience collection (50500 times) [2024-06-15 23:34:47,251][1651596] Signal inference workers to resume experience collection... (50500 times) [2024-06-15 23:34:47,252][1653645] InferenceWorker_p0-w0: resuming experience collection (50500 times) [2024-06-15 23:34:47,362][1653645] Updated weights for policy 0, policy_version 972146 (0.0009) [2024-06-15 23:34:48,312][1653645] Updated weights for policy 0, policy_version 972224 (0.0010) [2024-06-15 23:34:49,885][1653645] Updated weights for policy 0, policy_version 972288 (0.0010) [2024-06-15 23:34:50,923][1653645] Updated weights for policy 0, policy_version 972342 (0.0012) [2024-06-15 23:34:50,958][1648982] Fps is (10 sec: 101579.1, 60 sec: 91204.0, 300 sec: 91528.2). Total num frames: 1991344128. Throughput: 0: 22755.4. 
Samples: 497844736. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:50,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:34:52,878][1653645] Updated weights for policy 0, policy_version 972386 (0.0010) [2024-06-15 23:34:53,505][1653645] Updated weights for policy 0, policy_version 972437 (0.0011) [2024-06-15 23:34:55,000][1653645] Updated weights for policy 0, policy_version 972496 (0.0010) [2024-06-15 23:34:55,958][1648982] Fps is (10 sec: 91747.7, 60 sec: 91751.4, 300 sec: 91528.2). Total num frames: 1991770112. Throughput: 0: 22687.1. Samples: 497980928. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:34:55,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:34:56,192][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000972560_1991802880.pth... [2024-06-15 23:34:56,298][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000961856_1969881088.pth [2024-06-15 23:34:56,441][1653645] Updated weights for policy 0, policy_version 972576 (0.0012) [2024-06-15 23:34:56,874][1653645] Updated weights for policy 0, policy_version 972608 (0.0011) [2024-06-15 23:34:59,065][1653645] Updated weights for policy 0, policy_version 972672 (0.0011) [2024-06-15 23:34:59,787][1653645] Updated weights for policy 0, policy_version 972731 (0.0009) [2024-06-15 23:35:00,958][1648982] Fps is (10 sec: 88473.7, 60 sec: 92842.4, 300 sec: 91306.1). Total num frames: 1992228864. Throughput: 0: 22732.7. Samples: 498119168. 
Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:00,958][1648982] Avg episode reward: [(0, '37.420')] [2024-06-15 23:35:01,057][1653645] Updated weights for policy 0, policy_version 972784 (0.0009) [2024-06-15 23:35:02,091][1653645] Updated weights for policy 0, policy_version 972817 (0.0010) [2024-06-15 23:35:02,547][1653645] Updated weights for policy 0, policy_version 972863 (0.0009) [2024-06-15 23:35:04,422][1653645] Updated weights for policy 0, policy_version 972913 (0.0009) [2024-06-15 23:35:05,166][1653645] Updated weights for policy 0, policy_version 972976 (0.0010) [2024-06-15 23:35:05,958][1648982] Fps is (10 sec: 91750.6, 60 sec: 91749.9, 300 sec: 91194.9). Total num frames: 1992687616. Throughput: 0: 22858.0. Samples: 498195456. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:05,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:35:06,521][1653645] Updated weights for policy 0, policy_version 973040 (0.0011) [2024-06-15 23:35:07,883][1653645] Updated weights for policy 0, policy_version 973088 (0.0009) [2024-06-15 23:35:09,371][1651596] Signal inference workers to stop experience collection... (50550 times) [2024-06-15 23:35:09,408][1653645] InferenceWorker_p0-w0: stopping experience collection (50550 times) [2024-06-15 23:35:09,492][1651596] Signal inference workers to resume experience collection... (50550 times) [2024-06-15 23:35:09,493][1653645] InferenceWorker_p0-w0: resuming experience collection (50550 times) [2024-06-15 23:35:09,494][1653645] Updated weights for policy 0, policy_version 973136 (0.0010) [2024-06-15 23:35:10,307][1653645] Updated weights for policy 0, policy_version 973190 (0.0012) [2024-06-15 23:35:10,958][1648982] Fps is (10 sec: 95028.8, 60 sec: 91204.3, 300 sec: 91417.4). Total num frames: 1993179136. Throughput: 0: 23085.7. Samples: 498337280. 
Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:10,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:35:10,969][1653645] Updated weights for policy 0, policy_version 973248 (0.0012) [2024-06-15 23:35:12,432][1653645] Updated weights for policy 0, policy_version 973312 (0.0012) [2024-06-15 23:35:13,786][1653645] Updated weights for policy 0, policy_version 973373 (0.0062) [2024-06-15 23:35:15,803][1653645] Updated weights for policy 0, policy_version 973440 (0.0010) [2024-06-15 23:35:15,958][1648982] Fps is (10 sec: 91751.0, 60 sec: 90657.8, 300 sec: 91195.0). Total num frames: 1993605120. Throughput: 0: 22983.0. Samples: 498471424. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:15,959][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:35:16,448][1653645] Updated weights for policy 0, policy_version 973489 (0.0011) [2024-06-15 23:35:18,085][1653645] Updated weights for policy 0, policy_version 973537 (0.0010) [2024-06-15 23:35:19,028][1653645] Updated weights for policy 0, policy_version 973584 (0.0011) [2024-06-15 23:35:19,634][1653645] Updated weights for policy 0, policy_version 973632 (0.0017) [2024-06-15 23:35:20,958][1648982] Fps is (10 sec: 85195.4, 60 sec: 91204.5, 300 sec: 91195.0). Total num frames: 1994031104. Throughput: 0: 22899.6. Samples: 498535424. 
Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:20,959][1648982] Avg episode reward: [(0, '37.310')] [2024-06-15 23:35:21,391][1653645] Updated weights for policy 0, policy_version 973680 (0.0010) [2024-06-15 23:35:22,208][1653645] Updated weights for policy 0, policy_version 973744 (0.0010) [2024-06-15 23:35:23,815][1653645] Updated weights for policy 0, policy_version 973792 (0.0010) [2024-06-15 23:35:25,153][1653645] Updated weights for policy 0, policy_version 973856 (0.0011) [2024-06-15 23:35:25,556][1653645] Updated weights for policy 0, policy_version 973888 (0.0012) [2024-06-15 23:35:25,958][1648982] Fps is (10 sec: 91752.7, 60 sec: 91751.0, 300 sec: 91306.2). Total num frames: 1994522624. Throughput: 0: 22869.4. Samples: 498670592. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:25,958][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:35:27,156][1653645] Updated weights for policy 0, policy_version 973952 (0.0011) [2024-06-15 23:35:28,074][1653645] Updated weights for policy 0, policy_version 974016 (0.0011) [2024-06-15 23:35:29,969][1653645] Updated weights for policy 0, policy_version 974077 (0.0010) [2024-06-15 23:35:30,963][1648982] Fps is (10 sec: 91700.8, 60 sec: 90649.7, 300 sec: 91415.5). Total num frames: 1994948608. Throughput: 0: 22968.9. Samples: 498808832. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:30,965][1648982] Avg episode reward: [(0, '37.460')] [2024-06-15 23:35:31,082][1651596] Signal inference workers to stop experience collection... (50600 times) [2024-06-15 23:35:31,129][1653645] InferenceWorker_p0-w0: stopping experience collection (50600 times) [2024-06-15 23:35:31,242][1651596] Signal inference workers to resume experience collection... 
(50600 times) [2024-06-15 23:35:31,243][1653645] InferenceWorker_p0-w0: resuming experience collection (50600 times) [2024-06-15 23:35:31,418][1653645] Updated weights for policy 0, policy_version 974144 (0.0013) [2024-06-15 23:35:33,034][1653645] Updated weights for policy 0, policy_version 974205 (0.0018) [2024-06-15 23:35:33,911][1653645] Updated weights for policy 0, policy_version 974264 (0.0068) [2024-06-15 23:35:35,694][1653645] Updated weights for policy 0, policy_version 974308 (0.0011) [2024-06-15 23:35:35,957][1648982] Fps is (10 sec: 91750.8, 60 sec: 91227.2, 300 sec: 91528.3). Total num frames: 1995440128. Throughput: 0: 22789.8. Samples: 498870272. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:35,958][1648982] Avg episode reward: [(0, '37.430')] [2024-06-15 23:35:36,673][1653645] Updated weights for policy 0, policy_version 974355 (0.0013) [2024-06-15 23:35:38,189][1653645] Updated weights for policy 0, policy_version 974416 (0.0011) [2024-06-15 23:35:38,988][1653645] Updated weights for policy 0, policy_version 974480 (0.0010) [2024-06-15 23:35:40,870][1653645] Updated weights for policy 0, policy_version 974534 (0.0011) [2024-06-15 23:35:40,966][1648982] Fps is (10 sec: 91722.9, 60 sec: 92283.4, 300 sec: 91636.7). Total num frames: 1995866112. Throughput: 0: 22762.8. Samples: 499005440. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:40,967][1648982] Avg episode reward: [(0, '37.500')] [2024-06-15 23:35:41,512][1653645] Updated weights for policy 0, policy_version 974592 (0.0010) [2024-06-15 23:35:42,729][1653645] Updated weights for policy 0, policy_version 974651 (0.0011) [2024-06-15 23:35:44,575][1653645] Updated weights for policy 0, policy_version 974708 (0.0011) [2024-06-15 23:35:45,410][1653645] Updated weights for policy 0, policy_version 974779 (0.0013) [2024-06-15 23:35:45,958][1648982] Fps is (10 sec: 91749.7, 60 sec: 91750.4, 300 sec: 91528.2). Total num frames: 1996357632. 
Throughput: 0: 22744.3. Samples: 499142656. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:45,958][1648982] Avg episode reward: [(0, '37.450')] [2024-06-15 23:35:47,211][1653645] Updated weights for policy 0, policy_version 974832 (0.0012) [2024-06-15 23:35:48,498][1653645] Updated weights for policy 0, policy_version 974885 (0.0018) [2024-06-15 23:35:50,089][1653645] Updated weights for policy 0, policy_version 974944 (0.0011) [2024-06-15 23:35:50,729][1653645] Updated weights for policy 0, policy_version 974992 (0.0011) [2024-06-15 23:35:50,958][1648982] Fps is (10 sec: 95105.8, 60 sec: 91204.1, 300 sec: 91417.2). Total num frames: 1996816384. Throughput: 0: 22710.0. Samples: 499217408. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:50,959][1648982] Avg episode reward: [(0, '37.400')] [2024-06-15 23:35:51,290][1653645] Updated weights for policy 0, policy_version 975036 (0.0015) [2024-06-15 23:35:52,871][1653645] Updated weights for policy 0, policy_version 975077 (0.0009) [2024-06-15 23:35:53,897][1651596] Signal inference workers to stop experience collection... (50650 times) [2024-06-15 23:35:53,948][1653645] InferenceWorker_p0-w0: stopping experience collection (50650 times) [2024-06-15 23:35:54,035][1651596] Signal inference workers to resume experience collection... (50650 times) [2024-06-15 23:35:54,036][1653645] InferenceWorker_p0-w0: resuming experience collection (50650 times) [2024-06-15 23:35:54,353][1653645] Updated weights for policy 0, policy_version 975138 (0.0012) [2024-06-15 23:35:55,637][1653645] Updated weights for policy 0, policy_version 975190 (0.0014) [2024-06-15 23:35:55,958][1648982] Fps is (10 sec: 88473.6, 60 sec: 91204.7, 300 sec: 91306.1). Total num frames: 1997242368. Throughput: 0: 22505.2. Samples: 499350016. 
Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:35:55,958][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:35:56,248][1653645] Updated weights for policy 0, policy_version 975235 (0.0012) [2024-06-15 23:35:58,204][1653645] Updated weights for policy 0, policy_version 975298 (0.0010) [2024-06-15 23:35:58,825][1653645] Updated weights for policy 0, policy_version 975349 (0.0011) [2024-06-15 23:35:59,891][1653645] Updated weights for policy 0, policy_version 975381 (0.0010) [2024-06-15 23:36:00,958][1648982] Fps is (10 sec: 85196.9, 60 sec: 90658.0, 300 sec: 91083.9). Total num frames: 1997668352. Throughput: 0: 22562.1. Samples: 499486720. Policy #0 lag: (min: 15.0, avg: 126.4, max: 271.0) [2024-06-15 23:36:00,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:36:01,049][1653645] Updated weights for policy 0, policy_version 975427 (0.0012) [2024-06-15 23:36:01,736][1653645] Updated weights for policy 0, policy_version 975488 (0.0017) [2024-06-15 23:36:02,782][1653645] Updated weights for policy 0, policy_version 975551 (0.0012) [2024-06-15 23:36:04,507][1653645] Updated weights for policy 0, policy_version 975606 (0.0058) [2024-06-15 23:36:05,959][1648982] Fps is (10 sec: 85183.7, 60 sec: 90110.1, 300 sec: 91194.6). Total num frames: 1998094336. Throughput: 0: 22629.7. Samples: 499553792. 
Policy #0 lag: (min: 7.0, avg: 141.0, max: 268.0) [2024-06-15 23:36:05,960][1648982] Avg episode reward: [(0, '37.370')] [2024-06-15 23:36:06,086][1653645] Updated weights for policy 0, policy_version 975652 (0.0010) [2024-06-15 23:36:07,083][1653645] Updated weights for policy 0, policy_version 975716 (0.0012) [2024-06-15 23:36:08,407][1653645] Updated weights for policy 0, policy_version 975766 (0.0013) [2024-06-15 23:36:08,862][1653645] Updated weights for policy 0, policy_version 975808 (0.0010) [2024-06-15 23:36:10,197][1653645] Updated weights for policy 0, policy_version 975870 (0.0011) [2024-06-15 23:36:10,958][1648982] Fps is (10 sec: 91752.8, 60 sec: 90112.0, 300 sec: 91417.2). Total num frames: 1998585856. Throughput: 0: 22653.1. Samples: 499689984. Policy #0 lag: (min: 7.0, avg: 141.0, max: 268.0) [2024-06-15 23:36:10,958][1648982] Avg episode reward: [(0, '37.440')] [2024-06-15 23:36:11,874][1653645] Updated weights for policy 0, policy_version 975921 (0.0015) [2024-06-15 23:36:12,528][1653645] Updated weights for policy 0, policy_version 975955 (0.0011) [2024-06-15 23:36:14,080][1653645] Updated weights for policy 0, policy_version 976017 (0.0012) [2024-06-15 23:36:14,549][1653645] Updated weights for policy 0, policy_version 976064 (0.0010) [2024-06-15 23:36:15,958][1648982] Fps is (10 sec: 101595.0, 60 sec: 91750.5, 300 sec: 91750.4). Total num frames: 1999110144. Throughput: 0: 22769.7. Samples: 499833344. Policy #0 lag: (min: 7.0, avg: 141.0, max: 268.0) [2024-06-15 23:36:15,959][1648982] Avg episode reward: [(0, '37.390')] [2024-06-15 23:36:15,969][1653645] Updated weights for policy 0, policy_version 976128 (0.0016) [2024-06-15 23:36:16,889][1651596] Signal inference workers to stop experience collection... (50700 times) [2024-06-15 23:36:16,927][1653645] InferenceWorker_p0-w0: stopping experience collection (50700 times) [2024-06-15 23:36:17,034][1651596] Signal inference workers to resume experience collection... 
(50700 times) [2024-06-15 23:36:17,034][1653645] InferenceWorker_p0-w0: resuming experience collection (50700 times) [2024-06-15 23:36:17,357][1653645] Updated weights for policy 0, policy_version 976184 (0.0010) [2024-06-15 23:36:18,509][1653645] Updated weights for policy 0, policy_version 976240 (0.0016) [2024-06-15 23:36:20,031][1653645] Updated weights for policy 0, policy_version 976294 (0.0013) [2024-06-15 23:36:20,957][1648982] Fps is (10 sec: 91751.0, 60 sec: 91204.6, 300 sec: 91528.3). Total num frames: 1999503360. Throughput: 0: 22846.6. Samples: 499898368. Policy #0 lag: (min: 7.0, avg: 141.0, max: 268.0) [2024-06-15 23:36:20,958][1648982] Avg episode reward: [(0, '37.350')] [2024-06-15 23:36:21,561][1653645] Updated weights for policy 0, policy_version 976352 (0.0011) [2024-06-15 23:36:22,232][1653645] Updated weights for policy 0, policy_version 976389 (0.0011) [2024-06-15 23:36:22,891][1653645] Updated weights for policy 0, policy_version 976448 (0.0057) [2024-06-15 23:36:24,541][1653645] Updated weights for policy 0, policy_version 976507 (0.0023) [2024-06-15 23:36:25,598][1653645] Updated weights for policy 0, policy_version 976550 (0.0011) [2024-06-15 23:36:25,958][1648982] Fps is (10 sec: 91750.3, 60 sec: 91750.1, 300 sec: 91528.2). Total num frames: 2000027648. Throughput: 0: 22907.7. Samples: 500036096. Policy #0 lag: (min: 7.0, avg: 141.0, max: 268.0) [2024-06-15 23:36:25,958][1648982] Avg episode reward: [(0, '37.470')] [2024-06-15 23:36:26,978][1653650] Stopping RolloutWorker_w3... [2024-06-15 23:36:26,978][1653645] Updated weights for policy 0, policy_version 976592 (0.0010) [2024-06-15 23:36:26,978][1653647] Stopping RolloutWorker_w2... [2024-06-15 23:36:26,978][1653650] Loop rollout_proc3_evt_loop terminating... [2024-06-15 23:36:26,979][1653647] Loop rollout_proc2_evt_loop terminating... [2024-06-15 23:36:26,979][1648982] Component RolloutWorker_w2 stopped! [2024-06-15 23:36:26,978][1653646] Stopping RolloutWorker_w0... 
[2024-06-15 23:36:26,979][1653646] Loop rollout_proc0_evt_loop terminating... [2024-06-15 23:36:26,979][1648982] Component RolloutWorker_w3 stopped! [2024-06-15 23:36:26,979][1648982] Component RolloutWorker_w0 stopped! [2024-06-15 23:36:26,990][1648982] Component RolloutWorker_w1 stopped! [2024-06-15 23:36:26,990][1653648] Stopping RolloutWorker_w1... [2024-06-15 23:36:26,991][1653648] Loop rollout_proc1_evt_loop terminating... [2024-06-15 23:36:27,031][1648982] Component Batcher_0 stopped! [2024-06-15 23:36:27,031][1651596] Stopping Batcher_0... [2024-06-15 23:36:27,032][1651596] Loop batcher_evt_loop terminating... [2024-06-15 23:36:27,108][1653645] Weights refcount: 2 0 [2024-06-15 23:36:27,110][1653645] Stopping InferenceWorker_p0-w0... [2024-06-15 23:36:27,110][1653645] Loop inference_proc0-0_evt_loop terminating... [2024-06-15 23:36:27,110][1648982] Component InferenceWorker_p0-w0 stopped! [2024-06-15 23:36:27,157][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000976608_2000093184.pth... [2024-06-15 23:36:27,195][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000967232_1980891136.pth [2024-06-15 23:36:27,327][1651596] Saving train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000976624_2000125952.pth... [2024-06-15 23:36:27,356][1651596] Removing train_dir/atari_2B_atari_carnival_1111/checkpoint_p0/checkpoint_000972560_1991802880.pth [2024-06-15 23:36:27,358][1651596] Stopping LearnerWorker_p0... [2024-06-15 23:36:27,359][1651596] Loop learner_proc0_evt_loop terminating... [2024-06-15 23:36:27,359][1648982] Component LearnerWorker_p0 stopped! [2024-06-15 23:36:27,359][1648982] Waiting for process learner_proc0 to stop... [2024-06-15 23:36:28,654][1648982] Waiting for process inference_proc0-0 to join... [2024-06-15 23:36:28,655][1648982] Waiting for process rollout_proc0 to join... [2024-06-15 23:36:28,655][1648982] Waiting for process rollout_proc1 to join... 
[2024-06-15 23:36:28,655][1648982] Waiting for process rollout_proc2 to join...
[2024-06-15 23:36:28,655][1648982] Waiting for process rollout_proc3 to join...
[2024-06-15 23:36:28,655][1648982] Batcher 0 profile tree view:
batching: 2423.9575, releasing_batches: 5004.3028
[2024-06-15 23:36:28,655][1648982] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0051
wait_policy_total: 13383.5697
update_model: 545.4866
weight_update: 0.0010
one_step: 0.0670
handle_policy_step: 21174.2447
deserialize: 19.7563, stack: 3514.4141, obs_to_device_normalize: 12201.3587, forward: 4109.2313, prepare_outputs: 877.7856, send_messages: 169.4965
[2024-06-15 23:36:28,656][1648982] Learner 0 profile tree view:
misc: 0.5547, prepare_batch: 6056.7978
train: 15632.9793
epoch_init: 3.2998, minibatch_init: 195.5176, losses_postprocess: 2258.2877, kl_divergence: 1197.0855, update: 5938.6551, after_optimizer: 2903.4472
calculate_losses: 2918.9729
losses_init: 6.0309, forward_head: 1180.1483, bptt_initial: 19.6793, bptt: 29.1445, tail: 599.5032, advantages_returns: 178.8295, losses: 718.5282
[2024-06-15 23:36:28,656][1648982] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.7047, enqueue_policy_requests: 2039.1399, process_policy_outputs: 72.4326, env_step: 21965.6413, finalize_trajectories: 28.9815, complete_rollouts: 8.4277
post_env_step: 139.5441
process_env_step: 59.2937
[2024-06-15 23:36:28,656][1648982] RolloutWorker_w3 profile tree view:
wait_for_trajectories: 0.6437, enqueue_policy_requests: 1988.8568, process_policy_outputs: 72.1163, env_step: 22209.1675, finalize_trajectories: 28.1415, complete_rollouts: 9.1357
post_env_step: 137.0658
process_env_step: 59.1435
[2024-06-15 23:36:28,656][1648982] Loop Runner_EvtLoop terminating...
[2024-06-15 23:36:28,656][1648982] Runner profile tree view:
main_loop: 43519.1608
[2024-06-15 23:36:28,656][1648982] Collected {0: 2000125952}, FPS: 45959.7